TraceLink

Sr. Director, Cloud Engineering

In-Office
Wilmington, MA
266K-326K Annually
Senior level
Lead Cloud Engineering at TraceLink, overseeing SRE, Performance & Tools, and Release Engineering. Drive AI initiatives and operational excellence.
The summary above was generated by AI

Company overview:

TraceLink’s software solutions and Opus Platform help the pharmaceutical industry digitize its supply chain and enable greater compliance, visibility, and decision making. This reduces disruption to the supply of medicines to patients who need them, anywhere in the world.

Founded in 2009 with the simple mission of protecting patients, today TraceLink has 8 offices, over 800 employees, and more than 1,300 customers in over 60 countries around the world. Our expanding product suite continues to protect patients and now also enhances multi-enterprise collaboration through innovative new applications such as MINT.

TraceLink is recognized as an industry leader by Gartner and IDC, and for having a great company culture by Comparably.

TraceLink is seeking a strategic and hands-on Senior Director of Cloud Engineering to lead a multi-disciplinary organization spanning Site Reliability Engineering (SRE), Performance & Tools Engineering, and Release Engineering. This role is critical to ensuring the scalability, reliability, and operational excellence of TraceLink’s cloud-native SaaS platform, while also owning the infrastructure behind both internal and customer-facing AI capabilities.

The Director will be the single-threaded owner of our internal suite of AI-enabled tools for engineering productivity and will be responsible for the DevOps and infrastructure support for external AI features integrated into the Opus platform, such as LLM-powered agentic functionality.

They will drive initiatives that enable AI-powered operational intelligence, cost-optimized infrastructure, and high-velocity product delivery across a globally distributed engineering team.

Responsibilities:

  • Act as a Single-Threaded Owner (STO) for infrastructure and operational excellence, and lead a global organization across three primary areas:

    • SRE, with an SRE Manager and team focused on reliability, observability, incident response, and cloud operations

    • Performance & Tools, building tooling for automated testing, test orchestration, system health monitoring, and integration testing

    • Release Engineering, responsible for CI/CD tooling, release orchestration, and deployment automation

  • Own and evolve TraceLink’s internal suite of AI-enabled tools designed to enhance developer productivity and platform insight

  • Play a leadership role in DevOps and infrastructure operations for AI capabilities integrated into TraceLink’s Opus platform, including support for LLM-based workflows, inference pipelines, and secure model interactions

  • Evaluate and adopt emerging technologies aligned with the company’s product vision and technical architecture

  • Partner with the CISO, architecture, and product teams to align cloud practices with security, compliance, and business goals

  • Drive maturity in infrastructure as code, observability (OpenTelemetry, Prometheus, Grafana, Jaeger), and release automation (Jenkins, Flux-CD, Env0, CodeBuild)

  • Lead the design and rollout of AI-driven anomaly detection, telemetry pipelines, and proactive system health monitoring

  • Extend CI/CD and integration testing systems to support performance testing, distributed tracing, and alerting workflows

  • Be a major contributor to raising product quality through better automated testing

  • Champion cost optimization initiatives, including efficient AWS resource usage (Karpenter, Spot Instances, serverless), and align to target COGS metrics

  • Set high standards for reliability, latency, availability, and scalability of core systems

  • Oversee deployment health, platform smoke tests, and post-deployment validation strategies

  • Monitor and report on platform KPIs, system uptime, alerting noise ratios, and MTTR

  • Lead incident response strategies and reduction of manual toil through automation and self-service tools

  • Hire, mentor, and grow high-performing engineering managers and technical leaders

  • Align team OKRs with broader engineering and company goals

  • Foster a culture of engineering rigor, continuous improvement, and cross-functional collaboration

Qualifications:

Required:

  • Bachelor’s degree in Computer Science, Engineering, or equivalent experience

  • 5+ years in engineering leadership roles managing multiple cross-functional DevOps/SRE/tooling teams

  • Deep experience with cloud-native architecture, especially AWS services, infrastructure-as-code, CI/CD systems, and observability platforms

  • Proven success running SaaS at scale, including performance, reliability, and cost optimization

  • Hands-on experience with tools such as Terraform, Helm, Docker, Kubernetes, Prometheus, ELK, Redis, Kafka, Karpenter, Jenkins, OpenTelemetry, Grafana, Env0, CodeBuild

  • Experience with AWS Bedrock or equivalent managed foundation model platforms

  • Experience supporting AI/ML-enabled applications, including inference pipelines and secure LLM integration

  • Experience with high-performance inference runtimes such as KServe, vLLM, TensorRT-LLM, TGI, or Envoy AI Gateway

  • Familiarity with techniques for optimizing inference performance and cost, including KV cache management, prompt caching, model quantization, and batching strategies

  • Clear understanding of security practices, DevSecOps, and compliance (e.g., SOC 2, ISO 27001)

  • Excellent communication and stakeholder management skills

Preferred:

  • Advanced degree in Engineering or related field

  • Experience with regulated industries (e.g., healthcare, pharma, or life sciences)

  • Familiarity with reactive frameworks and modern Java/JavaScript application stacks

TraceLink is committed to providing competitive compensation and benefits to all employees. This is the estimated base salary range for this role and should serve only as a guide. Final compensation offered may vary based on a variety of factors including but not limited to experience level, fit for the role, skills, domain knowledge, internal equity, budget, and location.

US Pay Range
$266,121.64 - $326,375.89 USD

Please see the TraceLink Privacy Policy for more information on how TraceLink processes your personal information during the recruitment process and, if applicable based on your location, how you can exercise your privacy rights. If you have questions about this privacy notice or need to contact us in connection with your personal data, including any requests to exercise your legal rights referred to at the end of this notice, please contact [email protected].


Top Skills

AWS
AWS Bedrock
CodeBuild
Docker
ELK
Env0
Envoy AI Gateway
Grafana
Helm
Jenkins
Kafka
Karpenter
KServe
Kubernetes
OpenTelemetry
Prometheus
Redis
TensorRT-LLM
Terraform
TGI
vLLM
