Top Senior Site Reliability Engineer Jobs in Boston, MA
Cloud • Software
The Site Reliability Engineer (SRE) will manage reliable, scalable systems, focusing on software development, infrastructure automation, and incident response. Responsibilities include monitoring, CI/CD pipeline management, security compliance, and cost optimization while collaborating with various teams.
Top Skills:
AWS, Azure, Docker, ELK Stack, GCP, Git, Grafana, Java, Kubernetes, PHP, Prometheus, Python, Shell, Terraform
Information Technology • Marketing Tech • Social Media
Lead platform engineering teams to design a secure and scalable hosting platform, integrating AI and automation while collaborating across departments.
Top Skills:
AI, Ansible, Automation, AWS, Docker, Grafana, ML, Nginx, Node.js, OpenStack, PHP, Prometheus, Terraform, WordPress
Artificial Intelligence • Fintech • Machine Learning • Natural Language Processing • Business Intelligence
The Senior Director of SRE leads and defines reliability and operational excellence across products, manages the SRE team, and scales reliability practices within the organization.
Top Skills:
AWS, Azure, Cloud-Native Networking, Distributed Systems, GCP, Kubernetes, Microservices, Site Reliability Engineering Principles
Cloud • Security • Software • Cybersecurity
The Principal Site Reliability Engineer will lead Veeam's global SRE efforts, focusing on architecture, reliability strategies, and mentorship while influencing cross-functional teams.
Top Skills:
Automation Tooling, Cloud Infrastructure, Cloud-Native Development, Distributed Systems
Payments
As a Principal Site Reliability Engineer, you'll architect scalable infrastructure, drive reliability, mentor engineers, and lead AI enablement efforts, ensuring high performance across systems.
Top Skills:
AWS, CI/CD, Datadog, Elasticsearch, Go, Grafana, Kubernetes, New Relic, Prometheus, Python, RDS (MySQL/Postgres), SQL-Based RDBMS, TypeScript
Aerospace • Manufacturing
As a Site Reliability Engineer, you'll build and manage observability platforms for satellite communications, define SLOs/SLIs, and collaborate on incident response and deployment automation.
Top Skills:
Argo CD, AWS, ELK, GCP, Go, Grafana, Istio, Jaeger, Kubernetes, Linkerd, Loki, OpenTelemetry, Prometheus, Python, Tempo, Terraform
Aerospace • Manufacturing
The Staff Site Reliability Engineer will design and manage Aalyria's centralized observability platform, focus on metrics, logging, and tracing systems, implement SLOs and SLIs, automate deployments, and drive incident response strategies for enhanced reliability across satellite and cloud platforms.
Top Skills:
AWS, ELK, GCP, GitOps, Go, Grafana, Jaeger, Java, Kubernetes, Loki, OpenTelemetry, Prometheus, Python, Tempo, Terraform
Logistics • Software
Seeking a Staff Site Reliability Engineer to enhance infrastructure reliability and performance using advanced engineering principles, primarily on Google Cloud Platform.
Top Skills:
Ansible, Chef, CloudFormation, Datadog, Docker, ELK, Fluentd, GitHub Actions, GitLab CI, Go, Google Cloud Platform, Grafana, Java, Jenkins, Kafka, Kubernetes, MySQL, New Relic, Postgres, Prometheus, Pub/Sub, Pulumi, Puppet, Python, Redis, Splunk, Terraform
Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3
The role involves supporting network infrastructure, automating cloud services, deploying Kubernetes, managing CI/CD workflows, and ensuring cloud security best practices.
Top Skills:
Ansible, AWS, Bash, Chef, Docker, Git, Go, Kubernetes, Puppet, Python, Ruby, Salt, Terraform
Cloud • Fintech • Information Technology • Software • Business Intelligence
As a Site Reliability Engineer, you will ensure production system reliability, optimize performance, respond to incidents, and collaborate on infrastructure improvements.
Top Skills:
Ansible, AWS, Bash, Datadog, Docker, ELK, Git, Grafana, Kubernetes, New Relic, OpenTelemetry, Prometheus, Python, React, Ruby, Ruby on Rails, Terraform
Biotech
Seeking a Senior DevOps/SRE Engineer to enhance software release processes, coach teams on DevOps practices, and ensure system reliability and security across platforms.
Top Skills:
Amazon Web Services, CI/CD, Docker, Git, GitLab CI, Java, Jenkins, Kubernetes, MySQL, Postgres, Python, SQL, Terraform
Healthtech
The Site Reliability Engineer will enhance software delivery processes, ensure system availability, automate tasks, manage incidents, and perform capacity planning.
Top Skills:
AWS, Azure, Datadog, Docker, Grafana, JavaScript, New Relic, Prometheus, Python, Ruby, Splunk, Terraform
Software
The role involves managing compute infrastructure for decentralized applications, requiring critical thinking, documentation skills, and experience in Kubernetes and blockchain management.
Top Skills:
Blockchain, GitOps, Infrastructure-as-Code, Kubernetes, Programming Languages
Real Estate • Financial Services • PropTech
As a Site Reliability Engineer, you will support AWS Cloud products, optimize processes, enhance automation, and ensure system reliability and performance.
Top Skills:
Argo CD, AWS, Azure DevOps, Bash, CI/CD, CloudWatch, Docker, EKS, FluxCD, Git, Kubernetes, PowerShell, Python, SQL, Terraform
Artificial Intelligence • eCommerce • Retail
Lead the SRE and DevOps team, ensure infrastructure reliability, oversee cloud operations, drive automation, and collaborate cross-functionally.
Top Skills:
Azure, Bash, CI/CD, Datadog, Docker, ELK Stack, Go, Grafana, Kubernetes, PowerShell, Prometheus, Python, Terraform
Security • Software
The Staff Site Reliability Engineer will design and implement AWS architectures, lead automation for cloud infrastructures, and provide guidance on reliability and performance of SaaS environments.
Top Skills:
Ansible, AWS, C#, C++, CloudFormation, Datadog, Docker, EC2, EKS, ELK, Grafana, InfluxDB, Java, Kubernetes, Python, S3, Salt, Terraform, VPCs
Cloud • Software • Database
Lead and scale a Site Reliability Engineering team, ensuring system reliability and performance across cloud-native databases, while collaborating with multiple teams.
Top Skills:
Automation Tools, Cloud-Native Technologies, Infrastructure as Code, MySQL, Observation and Monitoring Tools, Postgres
Cloud • Security • Software
As a Staff Site Reliability Engineer, you will design and deliver solutions for cloud-based services, establish automated CI/CD pipelines, and support infrastructure with a focus on security and resiliency while participating in an on-call rotation.
Top Skills:
CI/CD, Cloud Platforms, Docker, Git, Go, Kubernetes
Computer Vision • Information Technology • Machine Learning • Natural Language Processing • Real Estate • Software
The SRE will maintain infrastructure for SaaS products on AWS, support developers, manage platform components, and handle IT tasks.
Top Skills:
AWS, Computer Vision, IaC, Large Language Models, NLP, Terraform
Logistics • Software • Transportation
Lead DevOps, SRE, and Database teams to build scalable Azure Cloud infrastructure, implement CI/CD pipelines, and drive automation and security practices.
Top Skills:
AI-Driven Tooling, Azure Cloud, Azure DevOps, Azure Monitor, CI/CD, Cosmos DB, Docker, ELK, GitHub Copilot, Grafana, Kubernetes, MySQL, Postgres, Prometheus, Redis, SQL Server, Terraform
Digital Media • Social Media • Software • Sports
Lead the technical architecture and execution of migration to AWS, drive developer enablement, and automate infrastructure using code-first principles.
Top Skills:
AWS EKS, Datadog, GitHub Actions, Go, Istio, k6, Kubernetes, Node.js, Terraform
Fintech • Mobile • Software
The Staff Site Reliability Engineer will design and manage AWS infrastructure, optimize Kubernetes operations, automate workflows, and troubleshoot systems for improved reliability and performance.
Top Skills:
AWS, CI/CD, Datadog, Docker, EKS, GitHub Actions, Go, Kafka, Kubernetes, Nginx, PrivateLink, Python, Terraform, Transit Gateway, VPC
Fintech
As a Site Reliability Engineer 2, you will enhance the reliability of the Brokerage-as-a-Service platform, manage technical challenges, and lead automation efforts while participating in on-call rotations for incident resolution.
Top Skills:
Apache Airflow, AWS, CloudFormation, Confluent Cloud, Kubernetes, OpenShift, Python, SQL, Terraform
News + Entertainment
As an Ads Reliability Engineer, you will ensure the reliability and scalability of Netflix's Ad Suite by designing infrastructure, automating operations, and collaborating across teams to maintain system health and performance.
Top Skills:
AWS, Azure, GCP, Go, Java, Kubernetes, Python, Terraform
Edtech
The Lead Software Engineer will lead the SRE team, focusing on reliability, performance optimization, security, and mentoring developers, while improving overall platform resilience.
Top Skills:
Active Job, Ansible, AWS, AWS CloudWatch, EC2, ECS, Elasticsearch, Git, GCP, Google Cloud Stackdriver, Jenkins, Jira, Kubernetes, Memcached, MongoDB, New Relic, Node.js, Postgres, Redis, Ruby on Rails, Sidekiq, Spinnaker, Terraform, Terragrunt