Top Senior Site Reliability Engineer Jobs in Boston, MA

AlphaSense

Staff Site Reliability Engineer

6 Days Ago

Remote or Hybrid

Boston, MA

150K-225K Annually

Senior level

Artificial Intelligence • Fintech • Machine Learning • Natural Language Processing • Business Intelligence

Lead architecture and implementation of reliability platforms and SRE practices for a production SaaS. Build self-service reliability tooling, drive AIOps automation, advance observability (monitoring, tracing, profiling), lead incident response and postmortems, mentor engineers, and embed production readiness across teams to achieve 99.99% uptime.

Top Skills: AWS, Azure, Continuous Profiling, Datadog, DNS, ELK, GCP, Go, Grafana, HTTP/S, Kubernetes, Load Balancing, OpenTelemetry, Prometheus, Python, TCP/IP

Sonio

Site Reliability Engineer (SRE) - Boston

Reposted 15 Days Ago

Hybrid

Boston, MA

165K-190K Annually

Mid level

Artificial Intelligence • Healthtech • Information Technology • Software

As the first Site Reliability Engineer in the US, you'll ensure platform stability and oversee incident responses during PST hours, bridging infrastructure and code, while improving operability and compliance in a medical-device environment.

Top Skills: AWS, Elixir, Kubernetes, Terraform

OXIO

Site Reliability Engineer

Reposted 7 Days Ago

Remote

Boston, MA

Mid level

Other

As a Site Reliability Engineer, you will design cloud platforms, automate operations, maintain infrastructure, and support engineering teams in delivering reliable services.

Top Skills: Ansible, AWS, Azure, Bash, CircleCI, CloudFormation, Datadog, DNS, Docker, GitLab CI, Go, GCP, Grafana, HTTP, HTTPS, Jenkins, Kubernetes, KVM, Linux, Perl, Prometheus, Python, Ruby, TCP/IP, Terraform, Unix, VMware

TherapyNotes, LLC

Senior Database Site Reliability Engineer

Reposted 7 Days Ago

Remote

Boston, MA

120K-160K Annually

Senior level

Healthtech • Other • Software

As a Senior Database Site Reliability Engineer, you'll design, implement, and maintain PostgreSQL systems, ensure reliability, automate maintenance tasks, and participate in incident response.

Top Skills: Ansible, Bash, Datadog, Grafana, New Relic, Postgres, PowerShell, Prometheus, Python, Terraform

OneStream Software

Site Reliability Engineer

Reposted 7 Days Ago

Remote

Boston, MA

114K-148K Annually

Senior level

Software • Financial Services

Ensure platform reliability, performance, and availability by implementing observability, automating infrastructure, participating in on-call rotations and post-mortems, partnering with Product and Engineering, designing scalable architectures, mentoring teammates, and integrating Dynatrace with Azure DevOps and Jira while supporting compliance (SOC/FedRAMP).

Top Skills: .NET, AKS, Alpine, Ansible, AppInsights, ARM Templates, AWS, Azure DevOps, Bash, Bicep, C#, Chef, CloudFormation, Datadog, Debian, Dynatrace, EKS, GCP, Git, Gks, Grafana, Helm, Jira, Kubernetes, Log Analytics, Azure, New Relic, OneStream Software, OpenShift, PowerShell, PowerShell DSC, Prometheus, Puppet, Python, REST APIs, SQL, Terraform, Ubuntu

Alpaca

Staff Site Reliability Engineer, Database

Reposted 7 Days Ago

Remote

Boston, MA

Senior level

Fintech • Information Technology

As a Site Reliability Engineer at Alpaca, you will ensure system reliability and performance, troubleshoot issues, and collaborate with teams to design scalable features.

Top Skills: Go, GORM, Linux, pgx, Postgres, Prometheus, sqlc

Chess.com

Site Reliability Engineer

Reposted 7 Days Ago

Remote

Boston, MA

Senior level

Gaming • Software

The Site Reliability Engineer will manage infrastructure stability and scalability, lead cloud migrations, and optimize performance across systems while mentoring team members.

Top Skills: Ansible, AWS, Azure, Bash, Chef, CloudFormation, Datadog, Docker, ELK Stack, GCP, Go, Grafana, Kubernetes, Prometheus, Puppet, Python, Terraform, Unix/Linux

Kong

Staff Site Reliability Engineer - Volcano

7 Days Ago

Remote

Boston, MA

150K-210K Annually

Senior level

Artificial Intelligence • Cloud • Information Technology • Software • Big Data Analytics

Founding Staff SRE for Volcano: define SLOs/error budgets, architect multi-region Kubernetes infrastructure, build GitOps/CI-CD with ArgoCD/Helm/Terraform, scale managed Postgres/Redis/object storage, implement observability with Datadog/Prometheus/Grafana, lead incident response and SRE culture, and mentor cross-functional teams.

Top Skills: ArgoCD, Canary Deployments, CI/CD, CNI, Datadog, GitOps, Grafana, Helm, Ingress, Kubernetes, Object Storage, Postgres, Prometheus, Redis, Service Mesh, Terraform, Terragrunt

Alegeus

Site Reliability Engineer I

16 Days Ago

In-Office

Boston, MA

53K-90K Annually

Junior

Healthtech • Financial Services

Support and maintain production, beta, and development web applications with rotating on-call duties. Troubleshoot complex incidents, perform root cause analysis, collaborate across teams, support deployments in on-prem and cloud (AWS/Azure), and ensure SLA compliance while participating in Agile/SAFe processes.

Top Skills: AWS, Azure, C#, Git, Java, Postgres, Python, SQL

WorkOS

Site Reliability Engineer

Reposted 7 Days Ago

Remote

Boston, MA

175K-275K Annually

Mid level

Software

As a Site Reliability Engineer, you'll enhance system reliability, collaborate on production readiness, define SLIs/SLOs, and improve incident response.

Top Skills: AWS, Datadog, Grafana, Kubernetes, OpenTelemetry, Prometheus, TypeScript

Phantom (phantom.com)

Staff Software Engineer (SRE)

Reposted 9 Days Ago

Remote

Boston, MA

200K-250K Annually

Senior level

Software • Cryptocurrency

Manage and scale Kubernetes clusters, automate infrastructure, optimize performance, maintain blockchain nodes, and improve system reliability while collaborating with product teams.

Top Skills: AWS (EC2, IAM, RDS, S3), AWS EKS, Datadog, Docker, Kubernetes, OpenTelemetry, Pulumi, Terraform

Manifold

Staff Site Reliability Engineer

19 Days Ago

In-Office

Boston, MA

160K-205K Annually

Senior level

Software

Design, build, and operate multi-account cloud infrastructure using IaC. Automate customer deployments, manage CI/CD, troubleshoot production across infra/data/app layers, and handle networking, security, and compliance for regulated environments while collaborating with platform and professional services teams.

Top Skills: Airflow, Auth0, AWS, Azure, dbt, Docker, ECS, GCP, GitHub Actions, LLMs, Okta, Packer, Postgres, Snowflake, Tailscale, Terraform, WireGuard

Supabase

Site Reliability Engineer

10 Days Ago

Remote

Boston, MA

Senior level

Database

Embed with service teams to define SLIs/SLOs and error budgets, run Operational Readiness Reviews, improve incident-to-improvement pipelines, advise on resilience and architecture, reduce operational toil through automation, and shape org-wide on-call practices and operational maturity.

Top Skills: AWS, CDK, Grafana, Kubernetes, OpenTelemetry, Postgres, Pulumi, Terraform, VictoriaMetrics

GE Vernova

SRE Platform Engineer

10 Days Ago

Remote

Boston, MA

Senior level

Energy • Manufacturing • Solar • Renewable Energy

Operate and harden production EKS Kubernetes clusters across multiple AWS regions. Build IaC (Terraform, Ansible), implement policy-as-code, ensure security and compliance, manage observability (Prometheus/Grafana), perform L3 support and incident RCA, run platform-level testing and DR, automate toil, and partner with application teams for sizing and cost optimization to achieve high availability for critical cloud infrastructure.

Top Skills: ALB, Ansible, ArgoCD, AWS EC2, Certificate Management, Datadog, Dynatrace, EKS, Flux, Go, Grafana, Kubernetes, MSK, Pod Priority, Prometheus, Python, RDS, S3, Service Mesh, Splunk, Terraform, VPC

HHAeXchange

SRE Technical Project Manager

Reposted 10 Days Ago

Remote

Boston, MA

100K-110K Annually

Mid level

Healthtech • Software

The SRE Technical Project Manager will lead project delivery, incident management, automation processes, and uptime communication, partnering with SRE and development teams to ensure system stability and scalability.

Top Skills: AI Bots, Datadog, Jira, Jira Service Management, MS Teams, Opsgenie, PagerDuty

SitusAMC

Site Reliability Engineer - AWS - Remote

Reposted 11 Days Ago

Remote

Boston, MA

110K-140K Annually

Senior level

Real Estate • Financial Services • PropTech

Support and optimize products migrated to AWS, implement cloud best practices, maintain operational coverage, enhance automation, observability, CI/CD/GitOps, and security. Collaborate with development and platform teams to scale, troubleshoot, and ensure reliable SaaS operations.

Top Skills: AMIs, ArgoCD, AWS, AWS Elastic Beanstalk, AWS Transfer Family, Azure DevOps, Bash, CloudWatch, curl, Docker, EC2, EKS, FluxCD, Git, GitOps, HTTP, Istio, Kubernetes, Linkerd, Load Balancer, PowerShell, Python, RDS, SQL, Terraform, wget

Tradeweb

Senior Site Reliability Engineer (SRE)

12 Days Ago

Remote

Boston, MA

170K-210K Annually

Senior level

eCommerce

Ensure reliability and availability of Tradeweb's global AWS platform through IaC automation, observability and SLO definition, incident triage and resolution, on-call duties, collaboration with development teams, and security-focused platform improvements.

Top Skills: ArgoCD, AWS, AWS Lambda, EKS, GitSecOps, Infrastructure as Code (IaC), Kubernetes (K8s), Kustomize, LGTM, Linux/Unix, Pulumi, Python, SMS, SNS

QuEra Computing

Sr. Control System Engineer/Site Reliability Engineer (SRE)

Reposted 22 Days Ago

In-Office

Boston, MA

Senior level

Hardware • Quantum Computing

Maintain and integrate hardware and software systems for quantum controls, manage lab and test infrastructure (HIL, K8s, networking, rack servers), automate provisioning and CI/CD, implement monitoring/alerting and observability, support incident response and root-cause analysis, and define operational procedures to ensure reliability across development and production environments.

Top Skills: Ansible, Bash, Debian, DHCP, DNS, Docker, ELK Stack, Git, GitLab CI, Go, Grafana, Hardware-in-the-Loop (HIL), Jenkins, Kubernetes, LAN, Prometheus, Python, Rack Mount Servers, Red Hat, Routers, Switches, TCP/IP, Terraform, Ubuntu, VLAN, WAN, Windows

Circle (circle.so)

Senior Site Reliability Engineer

19 Days Ago

Easy Apply

Remote

Boston, MA

130K-140K Annually

Senior level

Artificial Intelligence • Consumer Web • Digital Media • Information Technology • Social Impact • Software

Lead SRE work to keep Circle highly available and performant: respond to incidents, own monitoring/alerting/log management, manage and optimize MySQL/Postgres/ClickHouse/Redis databases, maintain server infrastructure and deployment pipelines, collaborate with engineering teams, and build internal SRE tooling and automation.

Top Skills: AWS, ClickHouse, Kubernetes, LLM-Based Tools (Copilots), MySQL, Postgres, Redis

SimSpace

Staff Site Reliability Engineer

Reposted 13 Days Ago

Remote

Boston, MA

165K-230K Annually

Senior level

Information Technology • Security

The Staff Site Reliability Engineer will lead the architecture and security of the SimSpace cyber range platform, focusing on reliability, automation, and observability across diverse deployment environments while mentoring engineers and driving infrastructure initiatives.

Top Skills: ArgoCD, GitHub Actions, Go, Grafana Tanka, Jsonnet, Kubernetes, Python

Andromeda (andromeda.ai)

Staff SRE, AI Infrastructure

Reposted 13 Days Ago

In-Office or Remote

Boston, MA

Senior level

Artificial Intelligence • Cloud • Information Technology • Software

As a Staff SRE, you will ensure the reliability and performance of Andromeda's GPU infrastructure, lead incident responses, build observability systems, and mentor engineers, while collaborating closely with engineering and customers.

Top Skills: Ansible, CUDA, Go, Helm, Kubernetes, Linux, NCCL, NVIDIA, Python, Rust, Slurm, Terraform

Arista Networks

FedRAMP Site Reliability Engineer (FedSRE) - CloudVision

Reposted 13 Days Ago

Remote

Boston, MA

101K-161K Annually

Senior level

Cloud • Software • Analytics

Join Arista Networks as a Site Reliability Engineer to manage CloudVision service reliability, scalability, and stability in a FedRAMP environment, focusing on areas like architecture, security, and performance optimization.

Top Skills: Ansible, Bash, GCP, GKE, Go, Kubernetes, Pulumi, Python

Xometry

Staff Site Reliability Engineer (SRE)

Reposted 24 Days Ago

In-Office

Boston, MA

135K-165K Annually

Mid level

Artificial Intelligence

The Site Reliability Engineer II will enhance infrastructure and software reliability, write efficient code, collaborate across teams, and maintain platforms and monitoring tools.

Top Skills: AWS, CI/CD, Coralogix, Docker, JavaScript, Kubernetes, Python, Sentry, Terraform, Unix Shell

Coinbase

Senior Site Reliability Engineer, Workforce Identity

20 Days Ago

Easy Apply

Remote

Boston, MA

186K-219K Annually

Senior level

Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3

Own reliability, automation, and DevOps for Coinbase's corporate IAM platform: on-call/incident response, CI/CD and IaC pipelines, identity lifecycle tooling, observability and disaster recovery, documentation, and cross-team IAM advisement to ensure secure, scalable access for a global workforce.

Top Skills: ABAC, Auth0, AWS, Azure, C#, CI/CD, Container Orchestration, Duo, Entra ID, GCP, Generative AI, Git, Go, IaC, Java, MFA, Okta, Ping, Python, RBAC, Ruby, SSO, Terraform

Coinbase

Senior Site Reliability Engineer, Core AI Infrastructure

20 Days Ago

Easy Apply

Remote

Boston, MA

186K-219K Annually

Senior level

Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3

Senior SRE on the IT Operations team owning reliability, monitoring, and incident response for AI infrastructure. Build automation, CI/CD and Kubernetes tooling, improve observability and documentation, and develop internal full-stack tools using Go or Python. Partner with Infrastructure, Security, and Compliance to scale secure, resilient AI deployment pipelines.

Top Skills: Ansible, AWS, Bash, Chef, CI/CD, Docker, EC2, Git, Go, Kubernetes, Linux, Puppet, Python, Ruby, Salt, Terraform
