Top Senior Site Reliability Engineer Jobs in Boston, MA
Reposted 3 Days Ago
Easy Apply
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will develop and support distributed storage services, ensuring reliability and operational safety, with a focus on automation and efficiency.
Top Skills:
AWS, Azure, DNS, Go, Google Cloud Platform, Kubernetes, Linux, Python, TCP/IP, TLS
Reposted Yesterday
Artificial Intelligence • Machine Learning • Natural Language Processing • Software • Conversational AI
The engineer will build and operate AI/ML infrastructure, managing services on AWS and bare metal, using tools like Kubernetes and Terraform.
Top Skills:
AWS, Bash, Go, Kubernetes, Python, Slurm, Terraform
Artificial Intelligence • Fintech • Hardware • Information Technology • Sales • Software • Transportation
Design, scale, and manage AWS services for IoT devices. Collaborate on infrastructure, optimize performance, and ensure high availability of services.
Top Skills:
AWS, Bash, Go, Helm, Kubernetes, Python, Ruby, Terraform
Big Data • Cloud • Software • Database
Seeking a Site Reliability Engineer with expertise in networking and distributed systems for building secure multi-cloud infrastructure. Responsibilities include maintaining network architecture and ensuring reliable service-to-service communication, involving a 24/7 on-call rotation.
Top Skills:
AWS, Azure, BGP, DNS, GCP, IPv6, Kubernetes, Load Balancing, mTLS, Service Mesh, TCP/IP, TLS, VPCs, VPNs
Artificial Intelligence • Other • Security • Software • Analytics • Big Data Analytics
The Lead Site Reliability Engineer will oversee the Infrastructure SRE team, focusing on system reliability, automation, and mentoring while collaborating with product engineering.
Top Skills:
CI/CD, Datadog, Docker, ELK Stack, GitOps, Go, Kubernetes, Linux/Unix, New Relic, NoSQL, Prometheus, Python, SQL, Stackdriver, Terraform
Artificial Intelligence • Fintech • Machine Learning • Social Impact • Software
As a Principal Software Engineer on the SRE team, lead best practices adoption, mentor engineers, and improve system reliability and user experience through automation and collaboration.
Top Skills:
CDK, CloudFormation, Datadog, Go, JavaScript, Prometheus, Python, Terraform, TypeScript
Big Data • Cloud • Software • Database
As a Senior Site Reliability Engineer, you'll design and build complex systems, support Atlas platform operations, automate processes, and ensure high availability of services.
Top Skills:
AWS, Azure, DNS, GCP, Go, HTTP, Linux, Python, Ruby, TLS
Enterprise Web • Hardware • Internet of Things • Software
The Senior Site Reliability Engineer will mentor teams on observability practices, architect systems for growth, automate developer tasks, and debug production issues.
Top Skills:
Go, Kubernetes, LGTM Stack, OpenTelemetry, Prometheus, TypeScript
Reposted 12 Days Ago
Easy Apply
Cloud • Information Technology • Security • Software • Cybersecurity
This internship role focuses on SRE skills, requiring collaboration and problem-solving in dynamic environments for Zscaler's Zero Trust Exchange team.
Top Skills:
Ansible, AWS ECS, Kubernetes, Linux, Python, Terraform
Information Technology • Web3
The Site Reliability Engineer manages AWS Kubernetes infrastructure, ensuring operational excellence, security, and scalability, while implementing reliability improvements and best practices.
Top Skills:
Argo CD, AWS, Bash, Datadog, EKS, Go, Kafka, Kubernetes, Postgres, Python, Sysdig, Terraform
14 Days Ago
Easy Apply
Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3
The Staff Site Reliability Engineer will lead AI-driven innovations, automate cloud infrastructure, implement CI/CD frameworks, and maintain operational IT support at Coinbase.
Top Skills:
Ansible, AWS, Bash, Chef, CI/CD, Docker, Git, Go, Kubernetes, Puppet, Python, Ruby, Salt, Terraform
Artificial Intelligence • Cloud • Software • Infrastructure as a Service (IaaS)
As a Site Reliability Engineer, you will ensure the stability of Runpod's platform by defining reliability standards, enhancing observability, and automating processes to reduce operational toil.
Top Skills:
Bash, Go, Grafana, Linux, Networking, Prometheus, Python
Cloud • Insurance • Payments • Software • Business Intelligence • App development • Big Data Analytics
The Site Reliability Engineer will ensure system reliability and scalability, manage infrastructure, automate tasks, and collaborate cross-functionally while mentoring junior engineers and supporting production environments.
Top Skills:
Ansible, Argo CD, Bash, Datadog, GitHub Actions, GitLab, Go, HashiCorp Consul, Helm, Kubernetes, Packer, Postgres, PowerShell, Python, SQL Server, Terraform, TypeScript
Reposted 16 Days Ago
Easy Apply
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will lead security design and implementation for cloud infrastructures, mentor teams, and automate security solutions.
Top Skills:
Ansible, AWS, Azure, Cloud Security Tools, CloudFormation, GCP, Go, Terraform
Artificial Intelligence • Cloud • Social Impact • Software • Wearables
As a Staff Site Reliability Engineer, you will design and implement Axon's core platforms, focusing on automation, security, and compliance, while collaborating across teams to enhance cloud operations.
Top Skills:
CDK, CloudFormation, Go, Java, Python, Terraform, TypeScript
Artificial Intelligence • Insurance • Software • Automation
The Staff Site Reliability Engineer will build and scale infrastructure for Assured's platform, automate delivery, enhance observability, and lead mentoring initiatives.
Top Skills:
AWS, Kubernetes, Postgres, Terraform
Information Technology • Software
The Site Reliability Engineer will enhance continuous integration and delivery processes, manage multi-cloud environments, and mentor teams in deploying microservices solutions.
Top Skills:
Ansible, Argo CD, AWS, Azure, Bash Scripting, Chef, CloudFormation, Docker, Fleet, Flux CD, Git, Grafana, Helm, Istio, Kubernetes, MySQL, Oracle, Postgres, Prometheus, Puppet, SQL Server, Terraform, VMware
Software
The Lead SRE will define SRE strategy, architecture, and roadmap; design and operate containerized, compliant cloud environments; build observability, incident management, automation, and developer platform capabilities; and mentor the SRE team while collaborating with security, compliance, and product teams to ensure reliability at scale.
Top Skills:
AWS, AWS Marketplace, Azure, Azure Marketplace, GCP, Google Cloud Marketplace, Grafana, Kubernetes, Prometheus, Terraform
Software • Consulting
As a Senior Application Support Engineer, you will ensure application reliability, manage incidents, and collaborate with teams to enhance performance and support processes.
Top Skills:
AppDynamics, AWS, Datadog, Linux, MuleSoft, OpenTelemetry, Python, Splunk
Information Technology • Software
As a DevOps/Site Reliability Engineer, you will manage cloud infrastructure, CI/CD pipelines, and improve system reliability and performance while supporting AI data pipelines.
Top Skills:
AWS, Datadog, EC2, EKS, GitHub Actions, Go, Grafana, IAM, Kubernetes, Prometheus, Python, RDS, S3, Terraform
Healthtech • Other • Software
The role involves managing PostgreSQL services, ensuring high availability and performance, driving incident response, automating tasks, and improving observability for a 24x7 SaaS platform.
Top Skills:
Ansible, Bash, Datadog, Grafana, HAProxy, New Relic, pgBackRest, PgBouncer, Postgres, PowerShell, Prometheus, Python, repmgr, Terraform
Software • Analytics
The role involves automating and managing AWS infrastructure, ensuring reliability and scalability of stateful systems, and optimizing deployment processes. You'll also handle incident responses and improve operational tooling.
Top Skills:
AWS, Kubernetes, Terraform, Terragrunt
Big Data • Machine Learning • Software • Analytics
As a Lead Site Reliability Engineer, you will drive the reliability strategy, improve system health, lead incident management, and mentor engineers for a multi-region SaaS platform.
Top Skills:
Argo CD, C++, CI/CD, Cloud Platforms, Datadog, GitOps, Grafana, Infrastructure as Code, Java, JavaScript, Kubernetes, Python
Computer Vision • Information Technology • Machine Learning • Natural Language Processing • Real Estate • Software
The SRE will maintain infrastructure for SaaS products on AWS, support developers, manage platform components, and handle IT tasks.
Top Skills:
AWS, Computer Vision, IaC, Large Language Models, NLP, Terraform
Artificial Intelligence • Other • Sales • Software
The role involves designing and advancing infrastructure for the engineering team, ensuring the reliability of Kubernetes clusters, automating operations, and building machine learning infrastructure.
Top Skills:
Argo, AWS, Azure, CloudFormation, Flux, GitHub Actions, Go, GCP, Kubernetes, Postgres, Python, Terraform