Get the job you really want.
Top Senior Site Reliability Engineer Jobs in Boston, MA
Artificial Intelligence • Enterprise Web • Information Technology • Machine Learning • Mobile • Software • Analytics
As a Site Reliability Engineer, you'll enhance system stability and performance, manage alert quality, and ensure operational security while collaborating on engineering initiatives.
Top Skills:
Cloud Technologies, GKE, Kubernetes, Nginx
Digital Media • Gaming • Information Technology • Software • Sports • Esports • Big Data Analytics
As a Lead Site Reliability Engineer, you will enhance infrastructure reliability, scalability, and efficiency by leading automation projects, mentoring engineers, and developing software-driven infrastructure solutions. You will shape deployment strategies and monitor performance to support the organization's rapid growth.
Top Skills:
.NET, Ansible, AWS, C#, Chef, Containerd, Docker, GCP, Go, Java, Kubernetes, Linux, Nutanix, Python, Terraform, vSphere
Artificial Intelligence • Big Data • Information Technology • Software
The Lead Site Reliability Engineer will oversee cloud infrastructure management, develop SRE processes, ensure FedRAMP compliance, and lead a team of engineers.
Top Skills:
Ansible, AWS, Azure, Bash, CloudFormation, Crossplane, Docker, GCP, Git, GitLab, Go, IDS/IPS, Jenkins, Kubernetes, Python, SIEM, Terraform
Fintech • Information Technology • Payments • Financial Services • Cryptocurrency
The Senior Site Reliability Engineer will manage the production environment for the FedNow Service, implementing monitoring tools and CI/CD automation, supporting technical operations, interfacing with internal stakeholders, and driving continuous improvement initiatives while ensuring system reliability and scalability.
3 Days Ago
Big Data • Fintech • Mobile • Payments • Financial Services
As a Senior Software Engineer in SRE, you will lead teams in building reliable backend systems, driving incident management, and fostering a culture of quality, while supporting product development and handling operational metrics.
Top Skills:
AWS, Kotlin, Kubernetes, MySQL, Python
Aerospace • Artificial Intelligence • Logistics • Machine Learning • Software • Transportation • Defense
Lead efforts to deliver the Flyways AI Platform through coding, deploying, and maintaining services in a secure cloud infrastructure, while managing complex systems and collaborating with teams.
Top Skills:
AWS, CircleCI, Docker, Grafana, Helm, Jenkins, K8s, Postgres, Python, Terraform
3 Days Ago
Big Data • Fintech • Mobile • Payments • Financial Services
As a Staff Software Engineer in SRE, you will design and enhance backend systems, ensuring reliability and operational excellence while developing a culture of quality and mentorship within the team.
Top Skills:
AWS, Kotlin, Kubernetes, MySQL, Python, Spark
Big Data • Cloud • Software • Database
Lead the Fabric team as a Site Reliability Engineer, focusing on building resilient infrastructure for secure service communication, while overseeing team direction and addressing technical issues.
Top Skills:
AWS, Azure, BGP, DNS, GCP, Kubernetes, TCP/IP, TLS/mTLS, VPCs
Featured Jobs
Hardware • Information Technology • Security • Software • Cybersecurity • Conversational AI
As a Lead Site Reliability Engineer, you will enhance cloud infrastructure, automate operations, and troubleshoot complex production issues in a secure environment.
Top Skills:
Ansible, AWS, Bash, Chef, Direct Connect, Docker, Go, Kubernetes, Puppet, Python, REST, Ruby, Scala, SOAP, TLS, Transit Gateway, Unix/Linux, VPC
Artificial Intelligence • Healthtech • Machine Learning • Natural Language Processing • Software
The SRE Cloud Architect will design and optimize AWS cloud infrastructure focusing on scalability, reliability, and cost efficiency, while mentoring teams and ensuring best practices in security and operational excellence.
Top Skills:
Ansible, API Gateway, AWS, AWS CDK, AWS CloudWatch, AWS GuardDuty, Bash, CloudFormation, CloudFront, CloudTrail, DocumentDB, EC2, EKS, GitLab, Grafana, Lambda, Loki, Mimir, Prometheus, Python, RDS, S3, Secrets Manager, Security Hub, SSM, Tempo, Terraform
Big Data • Cloud • Software • Database
The Lead Site Reliability Engineer will manage the Fabric team, ensuring secure communication infrastructure, guiding engineering practices, and participating in on-call support.
Top Skills:
AWS, Azure, BGP, DNS, GCP, Kubernetes, SDN, TCP/IP, TLS/mTLS
Artificial Intelligence • Fintech • Information Technology • Software • Data Privacy
The Principal Site Reliability Engineer keeps SaaS products fast and stable, focuses on automation and system monitoring, and collaborates with teams to improve product performance.
Top Skills:
C#, .NET, Java, Harness, Azure DevOps, Ansible, Jenkins, New Relic, Dynatrace, Datadog, AppDynamics, PowerShell, Python, Bash, Terraform, SQL, Cosmos, SolarWinds Database Performance Analyzer, Idera SQL Diagnostic Manager, Redgate SQL Monitor, Kubernetes, AKS, EKS
12 Days Ago
Hardware • Information Technology • Security • Software • Cybersecurity • Conversational AI
The Lead Site Reliability Engineer will design, develop, and operate observability systems, ensuring service reliability in large distributed environments. Responsibilities include scaling observability systems, writing monitoring libraries, and collaborating with engineering teams.
Top Skills:
Ansible, Bash, Elasticsearch, Go, Kafka, Prometheus, Python, Ruby, Scala, Terraform
Consumer Web • eCommerce • Marketing Tech • Retail • Software • Analytics • Generative AI
As a Senior Site Reliability Engineer, you will enhance productivity by building foundational services, ensuring reliable systems, and mentoring team members.
Top Skills:
Apache Pulsar, AWS, Bash, ClickHouse, Django, Go, Kubernetes, MySQL, Python, Redis, Terraform
Computer Vision • Healthtech • Information Technology • Logistics • Machine Learning • Software • Manufacturing
As a Senior Software Engineer II, you'll design scalable infrastructure to support Dandy's products, ensuring quality and performance in a collaborative environment.
Top Skills:
Chronosphere, GCP, GraphQL, Kubernetes, NestJS, Node.js, Postgres, Pulumi, React, Redux, Temporal, TypeScript
Cloud • Fintech • Cryptocurrency • NFT • Web3
The Staff Site Reliability Engineer at Coinbase will improve system reliability, mentor engineers, automate processes, and oversee software integrity, focusing on high-quality coding and performance tuning.
Top Skills:
AWS, Azure, Datadog, Docker, EC2, GCP, Go, Kibana, Kubernetes, Ruby, Terraform
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will design scalable systems, automate processes, and ensure high availability of the Atlas platform, collaborating with multiple teams.
Top Skills:
AWS, Azure, DNS, GCP, Go, HTTP, Linux, Python, Ruby, TLS
Fintech • Information Technology • Payments • Financial Services • Cryptocurrency
As a Principal Engineer in the SRE/Production Operations team for FedNow, you will oversee production environments, implement monitoring and tooling, and ensure reliable, scalable systems. Responsibilities include CI/CD automation design, capacity planning, and collaborating with internal teams to manage technical operations and continuous improvement initiatives.
Artificial Intelligence • Big Data • Information Technology • Software
The Senior Site Reliability Engineer will build and manage cloud infrastructure, ensure security compliance, and lead incident management efforts while collaborating with various teams on performance optimization.
Top Skills:
Ansible, AWS, Azure, Bash, CloudFormation, Crossplane, Docker, Firewalls, GCP, Git, GitLab, Go, IDS/IPS, Jenkins, Kubernetes, Python, SIEM, Terraform
Fintech • Payments • Financial Services
Lead the SRE and Production Operations team for FedNow, ensuring reliability of the production environment and managing CI/CD pipelines while collaborating with engineering teams.
Top Skills:
Ansible, Aurora, AWS, CloudWatch, Consul, Docker, Dynatrace, EBS, EC2, EKS, ELB, GitLab, Grafana, HashiCorp Terraform, IAM, Linux, OpenSearch, Prometheus, Python, RDS, Route 53, S3, Vault
Artificial Intelligence • Software • Generative AI
The Site Reliability Engineer will enhance and maintain Writer’s cloud infrastructure, ensuring reliability, scalability, and security while mentoring junior engineers.
Top Skills:
AWS, Azure, Docker, ELK Stack, GCP, Go, Grafana, Java, Kubernetes, Prometheus, Python, Terraform
Big Data • Cloud • Software • Database
Seeking a Senior Site Reliability Engineer to support and maintain the MongoDB Atlas platform, focusing on automation, system design, and operational excellence.
Top Skills:
AWS, Azure, DNS, GCP, Go, HTTP, Linux, Python, Ruby, TLS
Cloud • Greentech • Other • Energy
As a Site Reliability Engineer II on the Observability team, you'll manage and improve observability stacks, support engineering teams with monitoring, develop new tools, and analyze system performance for enhanced reliability.
Top Skills:
Ansible, CircleCI, CloudFormation, Docker, GitHub Actions, GitLab CI/CD, Go, Kubernetes, Python, Terraform
Legal Tech • Software
As a Site Reliability Engineer, you will develop autonomous systems, improve CI/CD processes, mentor junior engineers, and ensure reliable software operations.
Top Skills:
Artificial Intelligence, CI/CD, Cloud-Based Workflow Tools, Internet Scale Applications, Machine Learning
Artificial Intelligence • Enterprise Web • Machine Learning • Natural Language Processing • Software • Conversational AI • Automation
As a Site Reliability Engineer, you'll enhance infrastructure security, automate deployments, optimize CI/CD processes, and drive engineering best practices while ensuring compliance and observability.
Top Skills:
AWS Cloud, Elasticsearch, Go, JavaScript, MongoDB, Node.js, React, Redis, Terraform