Get the job you really want.
Maximum of 25 job preferences reached.
Top Senior Site Reliability Engineer Jobs in Boston, MA
Reposted 6 Days AgoSaved
Easy Apply
Easy Apply
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will support, maintain and grow the Atlas platform, focusing on automating processes and running multi-cloud environments.
Top Skills:
AWSAzureDnsGCPGoHTTPLinuxPythonRubyTls
Reposted 3 Days AgoSaved
Easy Apply
Easy Apply
Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3
The Site Reliability Engineer will enhance CI/CD frameworks, automate cloud infrastructure, manage Kubernetes and AWS services, and ensure operational excellence.
Top Skills:
AnsibleAWSBashChefCi/CdDockerGitKubernetesPuppetPythonRubySaltTerraform
Big Data • Cloud • Software • Database
Seeking a Site Reliability Engineer with expertise in networking and distributed systems for building secure multi-cloud infrastructure. Responsibilities include maintaining network architecture and ensuring reliable service-to-service communication, involving a 24/7 on-call rotation.
Top Skills:
AWSAzureBgpDnsGCPIpv6KubernetesLoad BalancingMtlsService MeshTcp/IpTlsVpcsVpns
Artificial Intelligence • Other • Security • Software • Analytics • Big Data Analytics
The Lead Site Reliability Engineer will oversee the Infrastructure SRE team, focusing on system reliability, automation, and mentoring while collaborating with product engineering.
Top Skills:
Ci/CdDatadogDockerElk StackGitopsGoKubernetesLinux/UnixNew RelicNoSQLPrometheusPythonSQLStackdriverTerraform
Reposted 5 Days AgoSaved
Artificial Intelligence • Machine Learning • Natural Language Processing • Software • Conversational AI
The engineer will build and operate AI/ML infrastructure, managing services on AWS and bare metal, using tools like Kubernetes and Terraform.
Top Skills:
AWSBashGoKubernetesPythonSlurmTerraform
Digital Media • Gaming • Information Technology • Software • Sports • Esports • Big Data Analytics
Lead SRE responsible for architecting and automating fault-tolerant, scalable infrastructure across cloud and on-prem, driving deployment, monitoring, and performance tuning while mentoring engineers to improve reliability and SLAs.
Top Skills:
.NetAnsibleAWSAws GreengrassC#ChefDockerElixirGCPGitopsGoJavaKubernetesLinuxNutanixPythonRubyTerraformVsphere
Artificial Intelligence • Fintech • Hardware • Information Technology • Sales • Software • Transportation
Design, scale, and manage AWS services for IoT devices. Collaborate on infrastructure, optimize performance, and ensure high availability of services.
Top Skills:
AWSBashGoHelmKubernetesPythonRubyTerraform
Enterprise Web • Hardware • Internet of Things • Software
The Senior Site Reliability Engineer will mentor teams on observability practices, architect systems for growth, automate developer tasks, and debug production issues.
Top Skills:
GoKubernetesLgtm StackOpentelemetryPrometheusTypescript
Reposted 22 Days AgoSaved
Easy Apply
Easy Apply
Big Data • Cloud • Software • Database
This role involves building and maintaining observability services, ensuring service reliability, and collaborating with other teams on best practices.
Top Skills:
AWSFluentbitGCPJaegerKubernetesAzureQuickwitSplunkVectorVictoriametrics
Reposted 13 Days AgoSaved
Easy Apply
Easy Apply
Cloud • Information Technology • Security • Software • Cybersecurity
This internship role focuses on SRE skills, requiring collaboration and problem-solving in dynamic environments for Zscaler's Zero Trust Exchange team.
Top Skills:
AnsibleAws EcsKubernetesLinuxPythonTerraform
Information Technology • Web3
The Site Reliability Engineer manages AWS Kubernetes infrastructure, ensuring operational excellence, security, and scalability, while implementing reliability improvements and best practices.
Top Skills:
ArgocdAWSBashDatadogEksGoKafkaKubernetesPostgresPythonSysdigTerraform
AdTech • eCommerce • Food • Marketing Tech • Retail
The Senior Site Reliability Engineer is responsible for ensuring production system reliability, scalability, and performance through automation, monitoring, and infrastructure engineering. The role includes mentoring junior engineers and managing production environments, while collaborating with engineering teams to improve system resilience.
Top Skills:
AksArgocdAWSAzureBashDatadogDockerElkGCPGithub ActionsGoJavaKafkaKubernetesPrometheusPythonRedisSpring BootTerraformTomcat
New
Track Smarter, Apply Better.
Ditch the spreadsheets. Organize your job search with our freeApplication Tracker.
Use For Free
18 Days AgoSaved
Easy Apply
Easy Apply
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will lead security design and implementation for cloud infrastructures, mentor teams, and automate security solutions.
Top Skills:
AnsibleAWSAzureCloud Security ToolsCloudFormationGCPGoTerraform
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will develop and support distributed storage services, ensuring reliability and operational safety, with a focus on automation and efficiency.
Top Skills:
AWSAzureDnsGoGoogle Cloud PlatformKubernetesLinuxPythonTcp/IpTls
Artificial Intelligence • Information Technology • Software • Generative AI
The Site Reliability Engineer will ensure the reliability and performance of SaaS production systems, manage deployments and incident responses, and improve operational processes within a dynamic AI environment.
Top Skills:
AWSAzureBashDockerElkGCPGitGoGrafanaKubernetesPrometheusPulumiPythonTerraform
Information Technology • Legal Tech
The Senior Technology Site Reliability Engineer is responsible for maintaining and optimizing infrastructure and applications, ensuring reliability and performance while automating processes and collaborating with teams.
Top Skills:
AWSChefDatadogGoGrafanaJavaPrometheusPuppetPythonSaltTerraform
Fintech • Payments
Lead an SRE team to improve reliability, observability, and automation across Azure. Define SLOs/SLIs, manage error budgets and on-call, standardize IaC/GitOps, build self-healing systems, and collaborate on cost, governance, and operational strategy.
Top Skills:
AzureAzure MonitorAzure PoliciesBashBicepGithub ActionsGitopsGoGrafanaIacKqlLog AnalyticsPowershellPrometheusPythonTerraform
Artificial Intelligence • Cloud • Social Impact • Software • Wearables
As a Staff Site Reliability Engineer, you will design and implement Axon's core platforms, focusing on automation, security, and compliance, while collaborating across teams to enhance cloud operations.
Top Skills:
CdkCloudFormationGoJavaPythonTerraformTypescript
Cloud • Security • Software • Cybersecurity
As a Site Reliability Engineer, you will troubleshoot production issues, automate systems, define database requirements, and collaborate with Dev and QA teams for stability.
Top Skills:
AnsibleCassandraChefNoSQLPythonRedis
Information Technology • Productivity • Software • Infrastructure as a Service (IaaS)
The Senior Site Reliability Engineer will improve service availability, troubleshoot complex issues, and enhance automation using various tools while ensuring security and best practices.
Top Skills:
AnsibleAWSC++CloudFormationDatadogGoHelmJavaKotlinKubernetesLinuxNew RelicPostgresSplunkTerraform
Logistics • Software • Transportation
Lead and mentor teams in DevOps and SRE, architect scalable Azure Cloud infrastructure, implement CI/CD and IaC, ensure database reliability, and drive cross-functional collaboration.
Top Skills:
Azure CloudAzure DevopsCi/CdCosmosdbDockerElkGrafanaKubernetesMySQLPostgresPrometheusRedisSQL ServerTerraform
Software
The role involves managing compute infrastructure for decentralized applications, requiring critical thinking, documentation skills, and experience in Kubernetes and blockchain management.
Top Skills:
BlockchainGitopsInfrastructure-As-CodeKubernetesProgramming Languages
Healthtech • Software
Maintain reliability, performance, and scalability of cloud-hosted services and databases. Implement SRE best practices, define SLIs/SLOs, respond to incidents, build monitoring and automation, perform DBA tasks (backups, restores, tuning), support CI/CD and DB migrations, and document runbooks and procedures.
Top Skills:
Amazon RdsAzure Sql DatabaseBashEcs FargateFlywayGitlabJenkinsKubernetesLiquibaseOctopus DeployOraclePostgresPowershellPythonRedisSolarwinds DpaSQL Server
Software
Join the SRE team to improve monitoring, alerting, observability, and reliability of Fireblocks' production systems. Triage incidents, run RCA, create runbooks and automation (Python, Lambda, shell, Ansible, ArgoCD), collaborate with R&D/support, and participate in on-call rotation.
Top Skills:
AnsibleArgocdAWSAws LambdaAzureBashBitbucketC++ChefCoralogixDatadogDockerGerritGitGitlabGCPHelmJavaScriptKubernetesLinuxMySQLNew RelicNginxNode.jsPhabricatorPrometheusPuppetPythonShellSplunk
Artificial Intelligence • eCommerce • Retail
Lead the SRE and DevOps team, ensure infrastructure reliability, oversee cloud operations, drive automation, and collaborate cross-functionally.
Top Skills:
AzureBashCi/CdDatadogDockerElk StackGoGrafanaKubernetesPowershellPrometheusPythonTerraform
Top Boston Companies Hiring Senior Site Reliability Engineers
See AllPopular Job Searches
All Filters
Total selected ()
No Results
No Results

.png)
.png)



























