Top Senior Site Reliability Engineer Jobs in Boston, MA

6 Hours Ago
Easy Apply
Remote or Hybrid
Boston, MA
200K-230K Annually
Senior level
Artificial Intelligence • Machine Learning
Lead development of AI-assisted reliability tooling, own incident response end-to-end, improve observability and SLO/SLI frameworks, scale single-tenant SaaS operations, mentor engineers, and reduce recurring operational toil through engineering and automation.
Top Skills: Cloud Platforms, Go, Kubernetes, Linux, LLM/AI Tooling, Logs and Tracing, Observability Tooling, Python, SLO/SLI Frameworks
Reposted 9 Days Ago
Easy Apply
Remote or Hybrid
Boston, MA
127K-249K Annually
Senior level
Big Data • Cloud • Software • Database
As a Senior Site Reliability Engineer, you'll design and build complex systems, support Atlas platform operations, automate processes, and ensure high availability of services.
Top Skills: AWS, Azure, DNS, GCP, Go, HTTP, Linux, Python, Ruby, TLS
Reposted 9 Days Ago
Easy Apply
Hybrid
Boston, MA
150K-185K Annually
Senior level
Enterprise Web • Hardware • Internet of Things • Software
The Senior Site Reliability Engineer will mentor teams on observability practices, architect systems for growth, automate developer tasks, and debug production issues.
Top Skills: Go, Kubernetes, LGTM Stack, OpenTelemetry, Prometheus, TypeScript
Reposted 11 Days Ago
Hybrid
Boston, MA
Mid level
Information Technology • Web3
The Site Reliability Engineer manages AWS Kubernetes infrastructure, ensuring operational excellence, security, and scalability, while implementing reliability improvements and best practices.
Top Skills: Argo CD, AWS, Bash, Datadog, EKS, Go, Kafka, Kubernetes, Postgres, Python, Sysdig, Terraform
Reposted 2 Days Ago
Easy Apply
Remote or Hybrid
Boston, MA
Internship
Cloud • Information Technology • Security • Software • Cybersecurity
This internship role focuses on SRE skills, requiring collaboration and problem-solving in dynamic environments for Zscaler's Zero Trust Exchange team.
Top Skills: Ansible, AWS ECS, Kubernetes, Linux, Python, Terraform
Reposted 5 Days Ago
Easy Apply
Remote or Hybrid
Boston, MA
127K-249K Annually
Senior level
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will lead security design and implementation for cloud infrastructures, mentor teams, and automate security solutions.
Top Skills: Ansible, AWS, Azure, Cloud Security Tools, CloudFormation, GCP, Go, Terraform
7 Days Ago
Easy Apply
Remote
Boston, MA
218K-257K Annually
Senior level
Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3
Own reliability, monitoring, and incident response for AI infrastructure; build automation and CI/CD tooling; manage Kubernetes/Docker production workloads; partner with infrastructure, security, and compliance; improve observability and documentation; develop internal full‑stack tooling in Go or Python.
Top Skills: Ansible, AWS, Bash, Chef, CI/CD, Docker, EC2, Git, Go, Kubernetes, Linux, Log Aggregation, Network Security, Puppet, Python, Ruby, Salt, Terraform
Reposted 18 Days Ago
Easy Apply
Remote or Hybrid
Boston, MA
126K-248K Annually
Senior level
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will develop and support distributed storage services, ensuring reliability and operational safety, with a focus on automation and efficiency.
Top Skills: AWS, Azure, DNS, Go, Google Cloud Platform, Kubernetes, Linux, Python, TCP/IP, TLS
Reposted 12 Days Ago
Remote
Boston, MA
150K-220K Annually
Senior level
Artificial Intelligence • Machine Learning • Natural Language Processing • Software • Conversational AI
The engineer will build and operate AI/ML infrastructure, managing services on AWS and bare metal, using tools like Kubernetes and Terraform.
Top Skills: AWS, Bash, Go, Kubernetes, Python, Slurm, Terraform
14 Days Ago
Remote
Boston, MA
223K-302K Annually
Expert/Leader
Artificial Intelligence • Cloud • Consumer Web • Productivity • Software • App development • Data Privacy
The role involves defining reliability strategies, leading initiatives across teams, enhancing monitoring and incident response, and mentoring engineers at Dropbox.
Top Skills: AI Technologies, Debugging, Distributed Systems, Incident Response, Observability, Reliability Risk Management, SLAs, SLOs
Reposted 18 Days Ago
Easy Apply
Remote or Hybrid
Boston, MA
127K-249K Annually
Expert/Leader
Big Data • Cloud • Software • Database
Seeking a Site Reliability Engineer with expertise in networking and distributed systems for building secure multi-cloud infrastructure. Responsibilities include maintaining network architecture and ensuring reliable service-to-service communication, involving a 24/7 on-call rotation.
Top Skills: AWS, Azure, BGP, DNS, GCP, IPv6, Kubernetes, Load Balancing, mTLS, Service Mesh, TCP/IP, TLS, VPCs, VPNs
7 Days Ago
Hybrid
Boston, MA
125K-188K Annually
Senior level
AdTech • eCommerce • Food • Marketing Tech • Retail
Lead design and implementation of cloud-native, highly available infrastructure and automation. Improve reliability via IaC, observability, incident response, SLOs, CI/CD, Kafka-based architectures, on-call support, mentoring, and cross-team reliability initiatives.
Top Skills: AKS, Argo CD, AWS, Azure, Bash, Datadog, Docker, ELK, GCP, Git, GitHub Actions, GitOps, Go, Java, Kafka, Kubernetes, Prometheus, Python, Redis, Spring Boot, Terraform, Tomcat, Ubuntu
Reposted 2 Days Ago
Hybrid
Boston, MA
165K-190K Annually
Mid level
Artificial Intelligence • Healthtech • Information Technology • Software
As the first Site Reliability Engineer in the US, you'll ensure platform stability and oversee incident responses during PST hours, bridging infrastructure and code, while improving operability and compliance in a medical-device environment.
Top Skills: AWS, Elixir, Kubernetes, Terraform
3 Days Ago
In-Office
Boston, MA
53K-90K Annually
Junior
Healthtech • Financial Services
Support and maintain production, beta, and development web applications with rotating on-call duties. Troubleshoot complex incidents, perform root cause analysis, collaborate across teams, support deployments in on-prem and cloud (AWS/Azure), and ensure SLA compliance while participating in Agile/SAFe processes.
Top Skills: AWS, Azure, C#, Git, Java, Postgres, Python, SQL
Reposted 23 Days Ago
Easy Apply
Remote
Boston, MA
195K-270K Annually
Expert/Leader
Artificial Intelligence • Fintech • Machine Learning • Social Impact • Software
As a Principal Software Engineer on the SRE team, lead best practices adoption, mentor engineers, and improve system reliability and user experience through automation and collaboration.
Top Skills: CDK, CloudFormation, Datadog, Go, JavaScript, Prometheus, Python, Terraform, TypeScript
25 Days Ago
Easy Apply
Remote
Boston, MA
150K-200K Annually
Senior level
Artificial Intelligence • Cloud • Software • Infrastructure as a Service (IaaS)
As a Site Reliability Engineer, you will ensure system stability and resilience, define reliability standards, and automate operational processes while collaborating cross-functionally to improve performance and reduce incidents.
Top Skills: Bash, CI/CD, Docker, Go, Grafana, Kubernetes, Linux, Prometheus, Python
6 Days Ago
In-Office
Boston, MA
160K-205K Annually
Senior level
Software
Design, build, and operate multi-account cloud infrastructure using IaC. Automate customer deployments, manage CI/CD, troubleshoot production across infra/data/app layers, and handle networking, security, and compliance for regulated environments while collaborating with platform and professional services teams.
Top Skills: Airflow, Auth0, AWS, Azure, dbt, Docker, ECS, GCP, GitHub Actions, LLMs, Okta, Packer, Postgres, Snowflake, Tailscale, Terraform, WireGuard
Reposted 13 Days Ago
Easy Apply
Remote or Hybrid
Boston, MA
127K-249K Annually
Senior level
Big Data • Cloud • Software • Database
Develop and maintain Kubernetes runtime environments, support developers, resolve critical issues, and participate in on-call rotations for production systems.
Top Skills: AWS, Azure, cert-manager, CoreDNS, CRDs, CRI, CSI, Gatekeeper, GCP, Go, Helm, Kubernetes, Kustomize, Operators, Python, Terraform
Reposted 9 Days Ago
In-Office
Boston, MA
Senior level
Hardware • Quantum Computing
Maintain and integrate hardware and software systems for quantum controls, manage lab and test infrastructure (HIL, K8s, networking, rack servers), automate provisioning and CI/CD, implement monitoring/alerting and observability, support incident response and root-cause analysis, and define operational procedures to ensure reliability across development and production environments.
Top Skills: Ansible, Bash, Debian, DHCP, DNS, Docker, ELK Stack, Git, GitLab CI, Go, Grafana, Hardware-in-the-Loop (HIL), Jenkins, Kubernetes, LAN, Prometheus, Python, Rack Mount Servers, Red Hat, Routers, Switches, TCP/IP, Terraform, Ubuntu, VLAN, WAN, Windows
6 Days Ago
Easy Apply
Remote
Boston, MA
130K-140K Annually
Senior level
Artificial Intelligence • Consumer Web • Digital Media • Information Technology • Social Impact • Software
Lead SRE work to keep Circle highly available and performant: respond to incidents, own monitoring/alerting/log management, manage and optimize MySQL/Postgres/ClickHouse/Redis databases, maintain server infrastructure and deployment pipelines, collaborate with engineering teams, and build internal SRE tooling and automation.
Top Skills: AWS, ClickHouse, Kubernetes, LLM-Based Tools (Copilots), MySQL, Postgres, Redis
Reposted 20 Hours Ago
Remote
Boston, MA
165K-230K Annually
Senior level
Information Technology • Security
The Staff Site Reliability Engineer will lead the architecture and security of the SimSpace cyber range platform, focusing on reliability, automation, and observability across diverse deployment environments while mentoring engineers and driving infrastructure initiatives.
Top Skills: Argo CD, GitHub Actions, Go, Grafana Tanka, Jsonnet, Kubernetes, Python
Reposted 20 Hours Ago
In-Office or Remote
Boston, MA
Senior level
Artificial Intelligence • Cloud • Information Technology • Software
As a Staff SRE, you will ensure the reliability and performance of Andromeda's GPU infrastructure, lead incident responses, build observability systems, and mentor engineers, while collaborating closely with engineering and customers.
Top Skills: Ansible, CUDA, Go, Helm, Kubernetes, Linux, NCCL, NVIDIA, Python, Rust, Slurm, Terraform
Reposted 20 Hours Ago
Remote
Boston, MA
101K-161K Annually
Senior level
Cloud • Software • Analytics
Join Arista Networks as a Site Reliability Engineer to manage CloudVision service reliability, scalability, and stability in a FedRAMP environment, focusing on areas like architecture, security, and performance optimization.
Top Skills: Ansible, Bash, GCP, GKE, Go, Kubernetes, Pulumi, Python
Yesterday
Remote
Boston, MA
140K-150K Annually
Mid level
Healthtech
Design, provision, and operate AWS infrastructure using Terraform; run and scale Kubernetes workloads with Helm; build observability, monitoring, and CI/CD automation; define SLIs/SLOs and lead incident response and postmortems; implement security and compliance (HIPAA/SOC2); participate in on-call rotation and partner with product and engineering on capacity, performance, and resilient system design.
Top Skills: Argo CD, AWS, AWS Secrets Manager, CI/CD, ClickHouse, CloudWatch, Datadog, Event Sourcing, Flux, Go, Grafana, HashiCorp Vault, Helm, Kubernetes, Linux, MySQL, OpenTelemetry, Postgres, Prometheus, Python, Redshift, SigNoz, Snowflake, Terraform
7 Days Ago
Easy Apply
Remote
Boston, MA
186K-219K Annually
Senior level
Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3
Own reliability, automation, and DevOps for Coinbase's corporate IAM platform: on-call/incident response, CI/CD and IaC pipelines, identity lifecycle tooling, observability and disaster recovery, documentation, and cross-team IAM advisement to ensure secure, scalable access for a global workforce.
Top Skills: ABAC, Auth0, AWS, Azure, C#, CI/CD, Container Orchestration, Duo, Entra ID, GCP, Generative AI, Git, Go, IaC, Java, MFA, Okta, Ping, Python, RBAC, Ruby, SSO, Terraform