Top Senior Site Reliability Engineer Jobs in Boston, MA

Reposted 2 Days AgoSaved
Easy Apply
Remote or Hybrid
Boston, MA
Easy Apply
127K-249K Annually
Senior level
127K-249K Annually
Senior level
Big Data • Cloud • Software • Database
As a Senior Site Reliability Engineer, you'll design and build complex systems, support Atlas platform operations, automate processes, and ensure high availability of services.
Top Skills: AWSAzureDnsGCPGoHTTPLinuxPythonRubyTls
Reposted 3 Days AgoSaved
Easy Apply
Hybrid
Boston, MA
Easy Apply
150K-185K Annually
Senior level
150K-185K Annually
Senior level
Enterprise Web • Hardware • Internet of Things • Software
The Senior Site Reliability Engineer will mentor teams on observability practices, architect systems for growth, automate developer tasks, and debug production issues.
Top Skills: GoKubernetesLgtm StackOpentelemetryPrometheusTypescript
Reposted 4 Hours AgoSaved
Easy Apply
Remote
Boston, MA
Easy Apply
218K-257K Annually
Senior level
218K-257K Annually
Senior level
Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3
Own reliability, monitoring, and incident response for AI infrastructure; build automation and CI/CD tooling; manage Kubernetes/Docker production workloads; partner with infrastructure, security, and compliance; improve observability and documentation; develop internal full‑stack tooling in Go or Python.
Top Skills: AnsibleAWSBashChefCi/CdDockerEc2GitGoKubernetesLinuxLog AggregationNetwork SecurityPuppetPythonRubySaltTerraform
Reposted 11 Days AgoSaved
Easy Apply
Remote or Hybrid
Boston, MA
Easy Apply
119K-170K Annually
Senior level
119K-170K Annually
Senior level
Cloud • Information Technology • Security • Software • Cybersecurity
As a Staff Site Reliability Engineer, you'll oversee Zscaler production data center services, optimize code, and ensure cloud service availability and performance. Collaborate with cross-functional teams to improve processes and resolve escalated issues.
Top Skills: BashDnsFirewallsGrafanaHTTPIcmpLoad BalancingNagiosOsi ModelPrometheusPythonTcp/Ip
Reposted 11 Days AgoSaved
Easy Apply
Remote or Hybrid
Boston, MA
Easy Apply
126K-248K Annually
Senior level
126K-248K Annually
Senior level
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will develop and support distributed storage services, ensuring reliability and operational safety, with a focus on automation and efficiency.
Top Skills: AWSAzureDnsGoGoogle Cloud PlatformKubernetesLinuxPythonTcp/IpTls
Reposted 2 Days AgoSaved
Remote
Boston, MA
150K-220K Annually
Senior level
150K-220K Annually
Senior level
Artificial Intelligence • Machine Learning • Natural Language Processing • Software • Conversational AI
The engineer will build and operate AI/ML infrastructure, managing services on AWS and bare metal, using tools like Kubernetes and Terraform.
Top Skills: AWSBashGoKubernetesPythonSlurmTerraform
Reposted 4 Days AgoSaved
Easy Apply
Remote or Hybrid
Boston, MA
Easy Apply
200K-230K Annually
Senior level
200K-230K Annually
Senior level
Artificial Intelligence • Machine Learning
Lead development of AI-assisted reliability tooling, own incident response end-to-end, improve observability and SLO/SLI frameworks, scale single-tenant SaaS operations, mentor engineers, and reduce recurring operational toil through engineering and automation.
Top Skills: Cloud PlatformsGoKubernetesLinuxLlm/Ai ToolingLogs And TracingObservability ToolingPythonSlo/Sli Frameworks
Reposted 4 Days AgoSaved
Remote
Boston, MA
223K-302K Annually
Expert/Leader
223K-302K Annually
Expert/Leader
Artificial Intelligence • Cloud • Consumer Web • Productivity • Software • App development • Data Privacy
The role involves defining reliability strategies, leading initiatives across teams, enhancing monitoring and incident response, and mentoring engineers at Dropbox.
Top Skills: Ai TechnologiesDebuggingDistributed SystemsIncident ResponseObservabilityReliability Risk ManagementSlasSlos
5 Days AgoSaved
Remote or Hybrid
Boston, MA
200K-250K Annually
Senior level
200K-250K Annually
Senior level
Digital Media • Gaming • Information Technology • Software • Sports • Esports • Big Data Analytics
Lead long-term strategy and architecture for cloud and on‑prem platform infrastructure, driving Kubernetes and multi‑cloud reliability, IaC/GitOps automation, observability, SLO/SLI/error‑budget practices, incident leadership, AI‑augmented tooling adoption, and mentorship of senior engineers to improve platform resilience and developer experience.
Top Skills: Amazon Elastic Kubernetes Service (Eks)AutoscalingAWSCapacity PlanningCi/CdGitopsGoGoogle Cloud PlatformGoogle Kubernetes Engine (Gke)Identity And Access ManagementInfrastructure As CodeKubernetesLinuxNetworkingObservabilityOperatorsPulumiPythonRke2StorageTerraform
18 Days AgoSaved
Easy Apply
Remote or Hybrid
Boston, MA
Easy Apply
127K-249K Annually
Senior level
127K-249K Annually
Senior level
Big Data • Cloud • Software • Database
Maintain and improve multi-cloud Kubernetes infrastructure, CI/CD (Argo Workflows/ArgoCD), observability, and networking. Build reliable continuous deployment tooling and onboarding flows, provide internal support, collaborate across Platform Engineering, contribute upstream (open-source/operators), and participate in a 24/7 on-call rotation to resolve deployment infrastructure issues.
Top Skills: AlertingArgo WorkflowsArgocdAWSAzureCi/CdContainersDnsGCPGoKubernetesLinuxLoad BalancerObservabilityPythonService MeshTcp/IpTls
Reposted 11 Days AgoSaved
Easy Apply
Remote or Hybrid
Boston, MA
Easy Apply
127K-249K Annually
Expert/Leader
127K-249K Annually
Expert/Leader
Big Data • Cloud • Software • Database
Seeking a Site Reliability Engineer with expertise in networking and distributed systems for building secure multi-cloud infrastructure. Responsibilities include maintaining network architecture and ensuring reliable service-to-service communication, involving a 24/7 on-call rotation.
Top Skills: AWSAzureBgpDnsGCPIpv6KubernetesLoad BalancingMtlsService MeshTcp/IpTlsVpcsVpns
Reposted 6 Days AgoSaved
Easy Apply
Remote or Hybrid
Boston, MA
Easy Apply
127K-249K Annually
Senior level
127K-249K Annually
Senior level
Big Data • Cloud • Software • Database
Develop and maintain Kubernetes runtime environments, support developers, resolve critical issues, and participate in on-call rotations for production systems.
Top Skills: AWSAzureCert-ManagerCorednsCrdsCriCsiGatekeeperGCPGoHelmKubernetesKustomizeOperatorsPythonTerraform
New

Track Smarter, Apply Better.

Ditch the spreadsheets. Organize your job search with our freeApplication Tracker.

Use For Free
Application Tracker Preview
Reposted 2 Days AgoSaved
In-Office
Boston, MA
Senior level
Senior level
Hardware • Quantum Computing
Maintain and integrate hardware and software systems for quantum controls, manage lab and test infrastructure (HIL, K8s, networking, rack servers), automate provisioning and CI/CD, implement monitoring/alerting and observability, support incident response and root-cause analysis, and define operational procedures to ensure reliability across development and production environments.
Top Skills: AnsibleBashDebianDhcpDnsDockerElk StackGitGitlab CiGoGrafanaHardware-In-The-Loop (Hil)JenkinsKubernetesLanPrometheusPythonRack Mount ServersRed HatRoutersSwitchesTcp/IpTerraformUbuntuVlanWanWindows
Reposted 21 Days AgoSaved
Easy Apply
Remote or Hybrid
Boston, MA
Easy Apply
Internship
Internship
Cloud • Information Technology • Security • Software • Cybersecurity
This internship role focuses on SRE skills, requiring collaboration and problem-solving in dynamic environments for Zscaler's Zero Trust Exchange team.
Top Skills: AnsibleAws EcsKubernetesLinuxPythonTerraform
Reposted 21 Days AgoSaved
Easy Apply
Remote
Boston, MA
Easy Apply
Mid level
Mid level
Cloud • Security • Software • Cybersecurity • Automation
As a Cloud Cost Utilization SRE at GitLab, you'll manage cloud spending, improve tracking and optimization of cloud usage, and collaborate with finance and engineering teams to enhance cost efficiency across AWS and GCP.
Top Skills: AnsibleAWSElkGCPGrafanaLokiMimirPrometheusTempoTerraform
Reposted 4 Hours AgoSaved
Easy Apply
Remote
Boston, MA
Easy Apply
186K-219K Annually
Senior level
186K-219K Annually
Senior level
Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3
Own reliability, automation, and DevOps for Coinbase's corporate IAM platform: on-call/incident response, CI/CD and IaC pipelines, identity lifecycle tooling, observability and disaster recovery, documentation, and cross-team IAM advisement to ensure secure, scalable access for a global workforce.
Top Skills: AbacAuth0AWSAzureC#Ci/CdContainer OrchestrationDuoEntraidGCPGenerative AiGitGoIacJavaMfaOktaPingPythonRbacRubySsoTerraform
Reposted 4 Hours AgoSaved
Easy Apply
Remote
Boston, MA
Easy Apply
186K-219K Annually
Senior level
186K-219K Annually
Senior level
Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3
Senior SRE on the IT Operations team owning reliability, monitoring, and incident response for AI infrastructure. Build automation, CI/CD and Kubernetes tooling, improve observability and documentation, and develop internal full-stack tools using Go or Python. Partner with Infrastructure, Security, and Compliance to scale secure, resilient AI deployment pipelines.
Top Skills: AnsibleAWSBashChefCi/CdDockerEc2GitGoKubernetesLinuxPuppetPythonRubySaltTerraform
4 Days AgoSaved
In-Office
Boston, MA
75K-95K Annually
Entry level
75K-95K Annually
Entry level
Fintech • Payments
Entry-level Site Reliability Engineer supporting system reliability, monitoring, incident triage, and root-cause analysis. Develop basic automation and scripts, follow deployment/change processes, collaborate with senior engineers, and contribute to observability and incident/problem management to improve system resilience and scalability.
Top Skills: BashDockerKubernetesLinuxPowershellPythonUnix
Reposted 4 Days AgoSaved
In-Office
Boston, MA
135K-165K Annually
Mid level
135K-165K Annually
Mid level
Artificial Intelligence
The Site Reliability Engineer II will enhance infrastructure and software reliability, write efficient code, collaborate across teams, and maintain platforms and monitoring tools.
Top Skills: AWSCi/CdCoralogixDockerJavaScriptKubernetesPythonSentryTerraformUnix Shell
Reposted 4 Days AgoSaved
In-Office or Remote
Boston, MA
95K-171K Annually
Junior
95K-171K Annually
Junior
Cloud • Security • Software • Cybersecurity
As a Site Reliability Engineer II, you'll automate tasks, monitor AI workloads, enhance dashboards, support CI/CD processes, and collaborate with engineering teams on complex issues while participating in on-call rotations.
Top Skills: GoGrafanaKubernetesLinuxPrometheusPythonSaltstackTerraform
Reposted 4 Days AgoSaved
In-Office or Remote
Boston, MA
140K-205K Annually
Senior level
140K-205K Annually
Senior level
Information Technology • Legal Tech
The Senior Technology Site Reliability Engineer is responsible for maintaining and optimizing infrastructure and applications, ensuring reliability and performance while automating processes and collaborating with teams.
Top Skills: AWSChefDatadogGoGrafanaJavaPrometheusPuppetPythonSaltTerraform
Reposted 4 Days AgoSaved
In-Office
Boston, MA
Mid level
Mid level
Cloud • Information Technology • Biotech
The Site Reliability Engineer will build and deploy Linux servers, research technologies, monitor system performance, and resolve technical incidents.
Top Skills: Infrastructure-As-CodeLinuxNetworkingVirtualization
Reposted 24 Days AgoSaved
Easy Apply
Remote or Hybrid
Boston, MA
Easy Apply
127K-249K Annually
Senior level
127K-249K Annually
Senior level
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will lead security design and implementation for cloud infrastructures, mentor teams, and automate security solutions.
Top Skills: AnsibleAWSAzureCloud Security ToolsCloudFormationGCPGoTerraform
6 Days AgoSaved
In-Office or Remote
Boston, MA
76K-136K Annually
Mid level
76K-136K Annually
Mid level
Cloud • Security • Software • Cybersecurity
Design, develop, test, and operate scalable infrastructure and services for Akamai Cloud. Implement and manage Infrastructure-as-Code (Terraform and similar tools), CI/CD, and observability. Automate reliability improvements, mentor engineers, collaborate on incident response and root-cause remediation, and participate in on-call rotations.
Top Skills: Alerting)AnsibleChefCi/CdInfrastructure As CodeLinuxLoggingObservability (MonitoringPuppetSaltstackTerraform
6 Days AgoSaved
In-Office
Boston, MA
160K-225K Annually
Senior level
160K-225K Annually
Senior level
Hardware • Quantum Computing
Lead integration, maintenance, and automation of heterogeneous hardware and software control systems for quantum computers. Manage networked lab infrastructure, CI/CD pipelines, observability, and provisioning. Support incident response, testing, and orchestration, collaborating with software, hardware, and test teams to ensure reliability and operational readiness of development and production environments.
Top Skills: AnsibleBashCi/CdDebianDhcpDnsDockerElkGitGitlab CiGoGrafanaHardware-In-The-Loop (Hil)JenkinsKubernetesLanLogging SystemsPrometheusPythonRack-Mount ServersRed HatRoutersSwitchesTcp/IpTerraformUbuntuVlanWanWindows
All Filters
JobType
New Jobs
Job Category
Experience
Industry
Company Name
Company Size

Sign up now Access later

Create Free Account