Top Senior Site Reliability Engineer Jobs in Boston, MA

Reposted 6 Days Ago
Remote
Boston, MA
Mid level
Security • Software • Analytics
Design, operate, and automate scalable, secure infrastructure for Axiom Cloud. Define SLOs, plan disaster recovery and capacity, tune performance, improve deployment practices, build reliability tooling, respond to incidents, and promote monitoring and observability across teams.
Top Skills: Amazon EKS, AWS, CircleCI, Docker, GitHub Actions, GitLab, Go, Kubernetes, Linux, LLMs, Monitoring and Observability Tools, Pulumi, Terraform
Reposted 6 Days Ago
Easy Apply
Remote
Boston, MA
170K-215K Annually
Senior level
Blockchain • Fintech • Social Media • Cryptocurrency • NFT • Web3
Design, build, and operate scalable, highly available infrastructure and platform software for Zora's blockchain services (indexer, APIs, data pipelines). Automate workflows, maintain core systems, improve developer experience, participate in on-call rotation, and contribute strategic technical direction.
Top Skills: Asyncio, Base, Bridges, Ceph, Cloudflare Pages Functions, Datadog, Docker, Ethereum, Go, IPFS, Kubernetes, MongoDB, OpenTelemetry, Optimism, Optimistic Rollups, Plasma, Polygon, Postgres, Python, RPC Nodes, Sidechains, Vercel, ZK-Rollups
7 Days Ago
Remote
Boston, MA
96K-192K Annually
Senior level
Blockchain • Financial Services • Cryptocurrency • Web3
As a Senior Site Reliability Engineer, you will manage the reliability and efficiency of Kraken's Data platform, working with multiple teams to ensure high performance and scalability. Responsibilities include designing data governance mechanisms, managing CI/CD pipelines, implementing monitoring solutions, and collaborating on various data projects.
Top Skills: Apache Airflow, Spark, AWS, Debezium, Docker, Kafka, Kubernetes, Python, Terraform
7 Days Ago
Remote
Boston, MA
96K-192K Annually
Mid level
Blockchain • Financial Services • Cryptocurrency • Web3
As an SRE/DevOps Engineer at Kraken, you will build infrastructure, support tools, drive standardization, and guide engineers in an efficient remote environment.
Top Skills: Bash, Continuous Integration, Docker, Git, Grafana, Linux, Prometheus, Python, Rust, Terraform
Reposted 7 Days Ago
Remote or Hybrid
Boston, MA
132K-195K Annually
Senior level
Artificial Intelligence • Big Data • Computer Vision • Machine Learning • Natural Language Processing • Software • Cybersecurity
Maintain and improve the internal developer platform, observability stack, and AWS infrastructure (Terraform); manage Kubernetes at scale; troubleshoot distributed systems; drive security, reliability, cost and performance improvements; partner with product teams and participate in on-call support.
Top Skills: AWS, CKA, Containers, Go, Kubernetes, LGTM Stack, Linux, OpenSearch, Python, Serverless, TCP/IP, Terraform
Reposted 8 Days Ago
In-Office or Remote
Boston, MA
92K-167K Annually
Senior level
Information Technology • Software
The Site Reliability Engineer manages system reliability, performance, and scalability for end-user services, leading software deployments, incident management, and service quality improvements. Responsibilities include collaboration with teams, maintaining a product roadmap, and automation of processes.
Top Skills: Agile, Aternity, DevSecOps, ITIL, PowerShell, Python
Reposted 8 Days Ago
Remote
Boston, MA
115K-135K Annually
Mid level
Aerospace • Manufacturing
As a Site Reliability Engineer, you'll build and manage observability platforms for satellite communications, define SLOs/SLIs, and collaborate on incident response and deployment automation.
Top Skills: ArgoCD, AWS, ELK, GCP, Go, Grafana, Istio, Jaeger, Kubernetes, Linkerd, Loki, OpenTelemetry, Prometheus, Python, Tempo, Terraform
Reposted 8 Days Ago
Easy Apply
Remote
Boston, MA
Senior level
Security • Software
Maintain, automate, and improve operational tools and customer deployment processes; monitor and ensure service SLOs, backup/restore, alerting, and incident response; drive GitOps/IaC practices, cost tracking, and automation of repetitive tasks while supporting outages and upgrades.
Top Skills: Ansible, AWS, Azure, Bash, GCP, GitOps, Grafana, Helm, Kubernetes, Prometheus, Python, Terraform
9 Days Ago
Remote
Boston, MA
208K-330K Annually
Senior level
Fintech
The Staff Site Reliability Engineer role involves leading architecture, automating the GCP environment, defining SLIs and SLOs, mentoring teammates, and enhancing system reliability and performance.
Top Skills: ArgoCD, Datadog, GCP, Go, Helm, JavaScript, Kubernetes, Python, Terraform, TypeScript
Reposted 9 Days Ago
Easy Apply
Remote
Boston, MA
220K-250K Annually
Expert/Leader
Cloud • Software • Database
Lead the design, build, and operation of the YugabyteDB DBaaS infrastructure. Drive architecture, automate lifecycle and maintenance, manage incidents and on-call rotations, implement security/encryption processes, and optimize reliability using SRE principles and observability.
Top Skills: AKS, Ansible, AWS, Azure, Bash, Docker, EKS, GCP, Git, GitHub Actions, GKE, Java, Kubernetes, Linux, Postgres, Prometheus, Python, Shell, Terraform
10 Days Ago
Remote or Hybrid
Boston, MA
5-5 Annually
Senior level
Database
The Site Reliability Engineer will oversee the Digital Realty interconnection fabric network infrastructure, focusing on network operations, automation, and development. Responsibilities include maintaining global network infrastructure, responding to alerts, and working with various cloud platforms and automation tools.
Top Skills: Ansible, AWS, Azure, Git, GCP, IBM Cloud, Jenkins, Linux, Oracle Cloud, Python, Terraform
11 Days Ago
Easy Apply
Remote
Boston, MA
172K-215K Annually
Expert/Leader
Aerospace • Big Data • Greentech • Hardware • Social Impact
The Site Reliability Engineer will build, deploy, and operate computing services for satellite imaging, ensuring reliable and scalable infrastructure while collaborating with cross-functional teams.
Top Skills: Alloy, Ansible, Bash, Cloud-Native Infrastructure, Grafana, Helm, K3s, Kubernetes, Kustomize, OpenTelemetry, Prometheus, Proxmox, Python, RKE2, Talos, Terraform
11 Days Ago
Remote
Boston, MA
160K-250K Annually
Senior level
Artificial Intelligence • Cloud • Machine Learning • Software • Database • App development • Generative AI
As a Site Reliability Engineer at Replit, you'll enhance system reliability through observability, automation, incident management, and performance optimization, serving millions globally.
Top Skills: Ansible, Datadog, Go, Google Cloud Platform, Grafana, Kubernetes, Prometheus, Pulumi, Python, Terraform
11 Days Ago
Remote
Boston, MA
220K-325K Annually
Senior level
Artificial Intelligence • Cloud • Machine Learning • Software • Database • App development • Generative AI
As a Staff Site Reliability Engineer at Replit, you will ensure infrastructure reliability, drive automation, lead incident management, and mentor the engineering team while enhancing system performance and observability.
Top Skills: Datadog, Go, Google Cloud Platform, Grafana, Kubernetes, OpenTelemetry, Prometheus, Python, Terraform
Reposted 20 Days Ago
In-Office
Boston, MA
81K-97K Annually
Mid level
Fintech • Payments
Join WEX as a Site Reliability Engineer to enhance system reliability in Azure, automate tasks, and collaborate with teams to optimize performance and incident response.
Top Skills: Ansible, Azure, Bash, CloudFormation, Docker, ELK Stack, Go, Grafana, Kubernetes, Prometheus, Python, Splunk, Terraform
12 Days Ago
Remote
Boston, MA
Senior level
Logistics • Software • Transportation
Lead and mentor teams in DevOps and SRE, architect scalable Azure Cloud infrastructure, implement CI/CD and IaC, ensure database reliability, and drive cross-functional collaboration.
Top Skills: Azure Cloud, Azure DevOps, CI/CD, Cosmos DB, Docker, ELK, Grafana, Kubernetes, MySQL, Postgres, Prometheus, Redis, SQL Server, Terraform
Reposted 12 Days Ago
Remote
Boston, MA
Mid level
Healthtech • Software
Maintain reliability, performance, and scalability of cloud-hosted services and databases. Implement SRE best practices, define SLIs/SLOs, respond to incidents, build monitoring and automation, perform DBA tasks (backups, restores, tuning), support CI/CD and DB migrations, and document runbooks and procedures.
Top Skills: Amazon RDS, Azure SQL Database, Bash, ECS Fargate, Flyway, GitLab, Jenkins, Kubernetes, Liquibase, Octopus Deploy, Oracle, Postgres, PowerShell, Python, Redis, SolarWinds DPA, SQL Server
Reposted 12 Days Ago
In-Office or Remote
Boston, MA
Senior level
Software
The role involves managing compute infrastructure for decentralized applications, requiring critical thinking, documentation skills, and experience in Kubernetes and blockchain management.
Top Skills: Blockchain, GitOps, Infrastructure-as-Code, Kubernetes, Programming Languages
Reposted 13 Days Ago
Easy Apply
Remote
Boston, MA
Senior level
Artificial Intelligence • eCommerce • Retail
Lead the SRE and DevOps team, ensure infrastructure reliability, oversee cloud operations, drive automation, and collaborate cross-functionally.
Top Skills: Azure, Bash, CI/CD, Datadog, Docker, ELK Stack, Go, Grafana, Kubernetes, PowerShell, Prometheus, Python, Terraform
Reposted 13 Days Ago
Easy Apply
Remote
Boston, MA
172K-215K Annually
Senior level
Aerospace • Big Data • Greentech • Hardware • Social Impact
Design, deploy, and operate compute services for on-premises and cloud satellite imaging platforms. Build reproducible, scalable, highly available deployments, troubleshoot distributed systems, optimize constrained environments, document and automate operations, and participate in on-call rotations to ensure reliability for customer-facing and air-gapped deployments.
Top Skills: Alloy, Ansible, Bash, CUDA, GitOps, Grafana, Helm, Jira, K3s, Kubernetes, Kustomize, OpenTelemetry, Prometheus, Proxmox, Python, RKE2, Talos, Terraform
Reposted 13 Days Ago
Easy Apply
Remote
Boston, MA
150K-185K Annually
Mid level
Software
Join the SRE team to improve monitoring, alerting, observability, and reliability of Fireblocks' production systems. Triage incidents, run RCA, create runbooks and automation (Python, Lambda, shell, Ansible, ArgoCD), collaborate with R&D/support, and participate in on-call rotation.
Top Skills: Ansible, ArgoCD, AWS, AWS Lambda, Azure, Bash, Bitbucket, C++, Chef, Coralogix, Datadog, Docker, Gerrit, Git, GitLab, GCP, Helm, JavaScript, Kubernetes, Linux, MySQL, New Relic, Nginx, Node.js, Phabricator, Prometheus, Puppet, Python, Shell, Splunk
Reposted 13 Days Ago
Remote
Boston, MA
110K-130K Annually
Senior level
Real Estate • Financial Services • PropTech
As a Site Reliability Engineer, you will support AWS Cloud products, optimize processes, enhance automation, and ensure system reliability and performance.
Top Skills: ArgoCD, AWS, Azure DevOps, Bash, CI/CD, CloudWatch, Docker, EKS, FluxCD, Git, Kubernetes, PowerShell, Python, SQL, Terraform
14 Days Ago
Easy Apply
Remote
Boston, MA
110K-175K Annually
Senior level
Cloud • Software
In this role, you'll support large-scale applications, improve observability, mentor team members, and ensure reliability by collaborating on deployments and writing automation scripts while providing 24/7 support.
Top Skills: Ansible, AWS, Bash, Confluence, Docker, ELK Stack, GCP, GitLab CI/CD, Grafana, Jenkins, Jira, Kubernetes, Linux, MongoDB, MySQL, Nagios, OCI, Perl, Postgres, Prometheus, Puppet, Python, Terraform
Reposted 14 Days Ago
Easy Apply
Remote
Boston, MA
170K-200K Annually
Senior level
Software
As Lead SRE, define SRE strategy, architecture, and roadmap; design and operate containerized, compliant cloud environments; build observability, incident management, automation, and developer platform capabilities; mentor the SRE team and collaborate with security, compliance, and product teams to ensure reliability at scale.
Top Skills: AWS, AWS Marketplace, Azure, Azure Marketplace, GCP, Google Cloud Marketplace, Grafana, Kubernetes, Prometheus, Terraform
24 Days Ago
In-Office
Boston, MA
81K-97K Annually
Mid level
Fintech • Payments
As a Site Reliability Engineer, you will monitor Azure Cloud systems, automate processes, respond to incidents, and collaborate with development teams to enhance reliability and performance.
Top Skills: Azure, Bash, Docker, ELK Stack, Go, Grafana, Kubernetes, Prometheus, Python, Splunk, Terraform