The Senior Site Reliability Engineer is responsible for ensuring production system reliability, scalability, and performance through automation, monitoring, and infrastructure engineering. The role includes mentoring junior engineers and managing production environments, while collaborating with engineering teams to improve system resilience.
Category/Area of Expertise: IT & Technology
Job Requisition: 487072
Address: USA-MA-Quincy-1385 Hancock Street
Store Code: Development (5157947)
A great career opportunity
Ahold Delhaize USA, a division of Netherlands-based Ahold Delhaize, is the parent company for Ahold Delhaize's U.S. companies, including its local brands, Food Lion, Giant Food, The GIANT Company, Hannaford and Stop & Shop, and the U.S. services companies, Retail Business Services, Peapod Digital Labs and ADUSA Supply Chain. When considered together, the local brands of Ahold Delhaize USA comprise the largest grocery retail group on the East Coast and the fourth largest grocery retail group in the nation, operating more than 2,000 stores and distribution centers across more than 20 states and serving millions of customers each week through a uniquely local omnichannel experience. The Ahold Delhaize USA company team includes just over 100 associates across all East Coast office locations.
Primary Purpose
The Site Reliability Engineer (SRE) III is responsible for ensuring the scalability, reliability, and performance of production systems through automation, observability, incident response, and infrastructure engineering. This role involves designing and implementing robust operational processes and tooling to support highly available, fault-tolerant systems in a cloud-native environment. The SRE III collaborates closely with engineering squads, product teams, and stakeholders to embed reliability best practices across the software delivery lifecycle. The role includes ownership of system uptime, service level objectives (SLOs), and operational excellence, along with mentoring junior engineers and leading cross-functional initiatives that improve system resilience.
Our flexible/hybrid work schedule includes 3 in-person days at our Chicago office and 2 remote days.
Applicants must be currently authorized to work in the United States on a full-time basis.
Duties & Responsibilities
- Design and implement infrastructure solutions that ensure system availability, scalability, and reliability across cloud-native environments such as AKS and Kubernetes.
- Develop automation for provisioning, deployment, configuration, monitoring, and incident remediation using tools such as Terraform, ArgoCD, and GitHub Actions.
- Collaborate with engineering teams to define and track service level objectives (SLOs) and service level indicators (SLIs).
- Build and manage microservices-based platforms leveraging Spring Boot, Java, Tomcat, and Redis.
- Monitor production environments using Datadog and proactively address performance and reliability issues.
- Perform root cause analysis and lead post-incident reviews to drive continual improvement.
- Manage CI/CD pipelines and deployment automation using GitHub, Docker, and container orchestration technologies.
- Create and maintain infrastructure as code (IaC) using Terraform, with deployment pipelines integrated into GitOps workflows.
- Lead and support operational readiness reviews, game days, chaos engineering practices, and failure mode analysis.
- Build scalable observability and alerting frameworks with Datadog.
- Implement resilient, asynchronous architectures using Kafka for event-driven services.
- Reduce operational toil through self-healing automation and proactive system tuning.
- Troubleshoot Linux-based environments such as Ubuntu and optimize them for reliability.
- Provide on-call support and ensure 24/7/365 system reliability for mission-critical applications.
- Collaborate with the security team to enforce secure operational practices and cloud compliance.
- Mentor junior engineers and contribute to documentation, technical design, and knowledge-sharing across the organization.
Qualifications
- Bachelor's Degree in Computer Science, Information Systems, or a related technical field; equivalent training, certifications, or experience will be considered.
- 5+ years of experience in a Site Reliability Engineering, DevOps, or Java programming role.
- Experience managing production-grade systems and services on AKS/Kubernetes in distributed environments.
- Proficiency in programming and scripting languages including Python, Java, Bash, or Go.
- Proven experience with Spring Boot, Tomcat, Redis, and microservices architecture.
- Hands-on experience in managing Linux environments, particularly Ubuntu.
- Proficiency with observability stacks and performance monitoring using Datadog, Prometheus, and ELK.
- Deep understanding of containerization and orchestration using Docker, Kubernetes, and ArgoCD.
- Experience managing event-driven systems using Kafka.
- Expertise in IaC and automation using Terraform and GitHub Actions.
- Familiarity with networking concepts, DNS, load balancing, and cloud infrastructure (AWS, Azure, or GCP).
- Strong analytical, debugging, and problem-solving skills.
- Excellent verbal and written communication skills and the ability to collaborate effectively across teams.
Salary Range: $125,040 - $187,560
Actual compensation offered to a candidate may vary based on their unique qualifications and experience, internal equity, and market conditions. Final compensation decisions will be made in accordance with company policies and applicable laws.
At Ahold Delhaize USA, we embrace and celebrate diversity. Our employees and prospective employees are treated with fairness, respect and dignity. We provide an equal opportunity workplace committed to hiring, training, compensating, and promoting persons based on their talents and abilities and without regard to race, religion, color, national origin, gender, sexual orientation, age, family status, veteran status, disability status, or any other applicable characteristics protected by law.
Top Skills
AKS
ArgoCD
AWS
Azure
Bash
Datadog
Docker
ELK
GCP
GitHub Actions
Go
Java
Kafka
Kubernetes
Prometheus
Python
Redis
Spring Boot
Terraform
Tomcat
Ahold Delhaize USA Quincy, Massachusetts, USA Office
1385 Hancock St, Quincy, MA, United States, 02169

