Top Senior Site Reliability Engineer Jobs in Boston, MA
Artificial Intelligence • Fintech • Software • Financial Services
Seeking a seasoned SRE to lead reliability for a cloud-native platform, overseeing infrastructure, CI/CD pipelines, observability, and mentoring engineers.
Top Skills:
AWS, ClickHouse, Go, Java, Kafka, Kubernetes, Pulumi, Terraform
Fitness
The Staff Site Reliability Engineer will establish SRE best practices, drive observability strategy, implement software solutions, and mentor engineers. Responsibilities include improving platform resilience, managing risks, and participating in incident response processes.
Top Skills:
Ansible, AWS, Azure, Bash, CloudFormation, GCP, Go, Kubernetes, Pulumi, Python, Terraform
Software • Analytics
This SRE role involves deep ownership of production systems, focusing on improving AWS infrastructure, operational tooling, and automation for scaling ClickHouse installations at petabyte scale.
Top Skills:
Ansible, AWS, ClickHouse, EC2, Linux, Terraform
Information Technology • Legal Tech
The Senior Technology Site Reliability Engineer is responsible for maintaining and optimizing infrastructure and applications, ensuring reliability and performance while automating processes and collaborating with teams.
Top Skills:
AWS, Chef, Datadog, Go, Grafana, Java, Prometheus, Puppet, Python, Salt, Terraform
Cloud • Information Technology • Biotech
The Site Reliability Engineer will build and deploy Linux servers, research technologies, monitor system performance, and resolve technical incidents.
Top Skills:
Infrastructure-as-Code, Linux, Networking, Virtualization
Artificial Intelligence • Big Data • Machine Learning • Software
The Site Reliability Engineer will develop and maintain platform services using Go and Python, improve CI/CD pipelines, and manage applications on Kubernetes while collaborating on infrastructure automation and troubleshooting services.
Top Skills:
AWS, Azure, Docker, GCP, Git, Go, Jenkins, Kubernetes, Postgres, Python
Fintech • Analytics • Financial Services
The Site Reliability Engineer will enhance system reliability, implement observability tools, and collaborate with teams to improve SaaS applications.
Top Skills:
AWS, Azure, Azure DevOps, Bash, Datadog, Go, New Relic, PowerShell, Prometheus, Python, Terraform
Aerospace • Artificial Intelligence • Logistics • Machine Learning • Software • Transportation • Defense
Lead efforts to deliver the Flyways AI Platform, deploying and maintaining secure cloud services, coding software solutions, and collaborating with teams.
Top Skills:
AWS, Docker, Grafana, Helm, K8s, Postgres, Python, Terraform
Artificial Intelligence • Information Technology • Software • Generative AI
The Site Reliability Engineer will ensure the reliability and performance of SaaS production systems, manage deployments and incident responses, and improve operational processes within a dynamic AI environment.
Top Skills:
AWS, Azure, Bash, Docker, ELK, GCP, Git, Go, Grafana, Kubernetes, Prometheus, Pulumi, Python, Terraform
3D Printing • Artificial Intelligence • Software • Design
The role involves building reliable platforms for 3D/4D content delivery to AR/VR devices, monitoring system health, and improving operational practices in collaboration with the engineering team.
Top Skills:
AWS Fargate, CoreWeave, Grafana, Kubernetes, Prometheus, Terraform
Information Technology • Cryptocurrency
The Site Reliability Engineer will lead technical initiatives, architect solutions, troubleshoot issues, mentor team members, and improve observability practices.
Top Skills:
ArgoCD, Bash, ELK Stack, GCP, Go, Grafana, Helm, Kubernetes, Prometheus, Python, Terraform
Gaming • Mobile • Software
As an SRE Manager, you will lead a team to enhance infrastructure services, manage incidents, and contribute to technical decisions while ensuring high availability and scalability of systems.
Top Skills:
Amazon AWS, Ansible, Artifactory, Crossplane, Datadog, Elasticsearch, GitLab, Go, GCP, Jaeger, Jenkins, Kubernetes, Azure, MongoDB, Packer, Postgres, Python, Redis, Terraform, Vault
Blockchain • Software
As a Site Reliability Engineer at Offchain Labs, you will manage infrastructure in cloud environments, design CI/CD workflows, and enhance system reliability with a focus on blockchain technology.
Top Skills:
ArgoCD, AWS, Azure, CodeBuild, GCP, GitHub Actions, Go, Grafana, Kubernetes, Loki, Prometheus, Python, Terraform
Cloud • Information Technology
The Site Reliability Engineer will support IaaS services, monitor infrastructure health, perform root cause analysis, automate processes, and collaborate with teams for service reliability.
Top Skills:
Ansible, AWS, Azure, Bash, GitLab CI, Jenkins, Kubernetes, Linux, OpenShift, Python, Terraform, VMware vSphere
Fintech
As a Site Reliability Engineer, you will enhance system reliability through scalable infrastructure, observability practices, automation, and collaboration with engineering teams.
Top Skills:
AWS, Datadog, Go, Grafana, Java, Kubernetes, Node.js, Prometheus, Pulumi, Python, Terraform
Analytics
The Site Reliability Engineer will ensure the reliability and performance of IaaS services, perform incident resolution, and enhance system reliability through automation while supporting mobility across hybrid infrastructures and collaborating extensively with various teams.
Top Skills:
Ansible, AWS, Azure, Bash, GitLab CI, Jenkins, Kubernetes, Linux, OpenShift, Python, Terraform, VMware vSphere
Cloud • Software
The Site Reliability Engineer (SRE) will manage reliable, scalable systems, focusing on software development, infrastructure automation, and incident response. Responsibilities include monitoring, CI/CD pipeline management, security compliance, and cost optimization while collaborating with various teams.
Top Skills:
AWS, Azure, Docker, ELK Stack, GCP, Git, Grafana, Java, Kubernetes, PHP, Prometheus, Python, Shell, Terraform
Information Technology • Marketing Tech • Social Media
Lead platform engineering teams to design a secure and scalable hosting platform, integrating AI and automation while collaborating across departments.
Top Skills:
AI, Ansible, Automation, AWS, Docker, Grafana, ML, Nginx, Node.js, OpenStack, PHP, Prometheus, Terraform, WordPress
Artificial Intelligence • Fintech • Machine Learning • Natural Language Processing • Business Intelligence
The Senior Director of SRE leads and defines reliability and operational excellence across products, manages the SRE team, and scales reliability practices within the organization.
Top Skills:
AWS, Azure, Cloud-Native Networking, Distributed Systems, GCP, Kubernetes, Microservices, Site Reliability Engineering Principles
Cloud • Security • Software • Cybersecurity
The Principal Site Reliability Engineer will lead Veeam's global SRE efforts, focusing on architecture, reliability strategies, and mentorship while influencing cross-functional teams.
Top Skills:
Automation Tooling, Cloud Infrastructure, Cloud-Native Development, Distributed Systems
Payments
As a Principal Site Reliability Engineer, you'll architect scalable infrastructure, drive reliability, mentor engineers, and lead AI enablement efforts, ensuring high performance across systems.
Top Skills:
AWS, CI/CD, Datadog, Elasticsearch, Go, Grafana, Kubernetes, New Relic, Prometheus, Python, RDS (MySQL/Postgres), SQL-Based RDBMS, TypeScript
Aerospace • Manufacturing
As a Site Reliability Engineer, you'll build and manage observability platforms for satellite communications, define SLOs/SLIs, and collaborate on incident response and deployment automation.
Top Skills:
ArgoCD, AWS, ELK, GCP, Go, Grafana, Istio, Jaeger, Kubernetes, Linkerd, Loki, OpenTelemetry, Prometheus, Python, Tempo, Terraform
Aerospace • Manufacturing
The Staff Site Reliability Engineer will design and manage Aalyria's centralized observability platform, focus on metrics, logging, and tracing systems, implement SLOs and SLIs, automate deployments, and drive incident response strategies for enhanced reliability across satellite and cloud platforms.
Top Skills:
AWS, ELK, GCP, GitOps, Go, Grafana, Jaeger, Java, Kubernetes, Loki, OpenTelemetry, Prometheus, Python, Tempo, Terraform
Logistics • Software
Seeking a Staff Site Reliability Engineer to enhance infrastructure reliability and performance using advanced engineering principles, primarily on Google Cloud Platform.
Top Skills:
Ansible, Chef, CloudFormation, Datadog, Docker, ELK, Fluentd, GitHub Actions, GitLab CI, Go, Google Cloud Platform, Grafana, Java, Jenkins, Kafka, Kubernetes, MySQL, New Relic, Postgres, Prometheus, Pub/Sub, Pulumi, Puppet, Python, Redis, Splunk, Terraform
Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3
The role involves supporting network infrastructure, automating cloud services, deploying Kubernetes, managing CI/CD workflows, and ensuring cloud security best practices.
Top Skills:
Ansible, AWS, Bash, Chef, Docker, Git, Go, Kubernetes, Puppet, Python, Ruby, Salt, Terraform