Top Senior Site Reliability Engineer Jobs in Boston, MA
Database
As a Site Reliability Engineer, you will manage Postgres infrastructure, ensuring reliability, implementing high availability, and enhancing performance through observability and automation. You will also design CI/CD systems and oversee on-call responsibilities.
Top Skills:
AWS, Barman, Go, Infrastructure as Code, Patroni, pgBackRest, Postgres, Stolon
Cloud • Security • Software
Design and deliver solutions for cloud-based services, ensuring automated deployments, infrastructure security, and high-quality software in a collaborative team environment.
Top Skills:
Docker, Go, Kubernetes
15 Days Ago
Cloud • Security • Software • Cybersecurity • Automation
The Senior Site Reliability Engineer is responsible for maintaining user-facing services, managing database operations, and optimizing cloud infrastructure at GitLab. Key responsibilities include designing and maintaining ClickHouse and PostgreSQL clusters, implementing monitoring systems, and ensuring security compliance. The role requires strong technical skills in database management and cloud automation, along with leadership and communication abilities.
Top Skills:
Ansible, Chef, ClickHouse, Go, Grafana, Helm, Kubernetes, Linux, Postgres, Prometheus, Python, Ruby, Terraform
Cloud
Lead multiple engineering teams to design and automate cloud-native platforms, ensuring system reliability and managing technical challenges in SaaS operations.
Top Skills:
AWS, C#, Docker, ECS, Go, JavaScript, Kubernetes, Python, Terraform
Blockchain • Web3
As a Site Reliability Engineer at Syndica, you will maintain blockchain infrastructure, ensure reliability and performance, and utilize monitoring tools. You’ll work with teams to enhance system security and automate processes.
Top Skills:
Ansible, AWS, Azure, Chef, Datadog, Docker, ELK, GCP, Go, Grafana, JMeter, k6, Kubernetes, Locust, New Relic, Prometheus, Python, Rust, Shell, Terraform, TypeScript
Cloud • Information Technology
As a Staff Site Reliability Engineer, you will manage core infrastructure, improve reliability, automate operations, and support engineering teams in a remote environment.
Top Skills:
ELK, Envoy, Go, Grafana, gRPC, HAProxy, HashiCorp Nomad, Honeycomb, Jenkins, Kafka, Linux, MySQL, Node.js, Postgres, Puppet, Redis
Information Technology • Consulting
As a Site Reliability Engineer, you will manage customer systems, troubleshoot issues, and enhance system performance while ensuring effective communication with clients and teams.
Top Skills:
Ansible, Apache, AWS, Azure, Chef, Docker, GCP, Git, HAProxy, Kubernetes, Linux, Postgres, Terraform
Other • Social Impact
The Staff SRE will design and maintain ML infrastructure, improve scalability and reliability, optimize system performance, and mentor team members.
Top Skills:
Ansible, Argo CD, Docker, ELK Stack, GPU, Grafana, Helm, Kubernetes, Machine Learning, Prometheus, Python, PyTorch, Scikit-Learn, TensorFlow, Terraform
Featured Jobs
Artificial Intelligence • Marketing Tech • Mobile • Software
Design and implement systems to enhance platform reliability and scalability. Lead initiatives, mentor team members, and collaborate across teams to drive impactful projects and establish best practices.
Top Skills:
Airflow, AWS, Cloudflare, Datadog, DynamoDB, EKS, esbuild, Gradle, GraphQL, Helm, Hugging Face, Istio, Java, Kinesis, Kubernetes, Metaflow, Pandas, PlanetScale, Playwright, Postgres, Python, PyTorch, Radix UI, React, Redis, Spring Boot, Storybook, TensorFlow, Terraform, TypeScript, Vite
Fitness • Healthtech
As a Site Reliability Engineer, you will enhance systems, engage with stakeholders, mentor the team, and optimize development processes.
Top Skills:
Django, GitLab CI, Google Cloud Platform, Kubernetes, Nuxt.js, Postgres, RabbitMQ, React, Redis, Vue
Artificial Intelligence • Software • Generative AI
Lead the design and management of cloud infrastructure, ensuring reliability and performance. Mentor junior engineers and automate cloud operations.
Top Skills:
AWS, Azure, Docker, ELK Stack, GCP, Go, Grafana, Java, Kubernetes, Prometheus, Python, Terraform
Fintech
Ensure the reliability of critical services, collaborating with engineering to set SLOs and implement best practices while driving incident management and mentoring team members.
Top Skills:
AppDynamics, Datadog, Go, Grafana, Prometheus, Python, Splunk, TypeScript
Software
The Site Reliability Engineer will ensure system availability and scalability while collaborating with various teams on infrastructure strategies, incident response, and performance optimization.
Top Skills:
Ansible, AWS, Azure, CircleCI, Docker, GitHub CI, Java, JavaScript, Jenkins, Kubernetes, MongoDB, Python, SQL, Terraform
Generative AI
The Senior Site Reliability Engineer at Stability AI will enhance and manage cloud infrastructure, enforce SRE best practices, architect scalable systems, and drive incident management. Responsibilities include collaborating with development teams, implementing infrastructure as code, and mentoring junior team members.
Travel
The Lead Site Reliability Engineer is responsible for the architecture, implementation, and maintenance of scalable systems, collaborating with teams to ensure high availability and security in operations.
Top Skills:
AWS, Backbone, Chef, Datadog, Git, Java, JavaScript, jQuery, MongoDB, NoSQL, Prometheus, React, RequireJS, Terraform
Artificial Intelligence • Computer Vision
As a Site Reliability Engineer, you will build and maintain scalable infrastructure, develop automation solutions, deploy applications, manage observability, and optimize Linux systems while collaborating with engineering teams.
Top Skills:
Ansible, AWS, Azure, Docker, GCP, Linux, Prometheus, Python, Terraform
Aerospace • Artificial Intelligence • Machine Learning • Robotics • Software
Optimize cloud deployments, manage internal Hivemind instances, enable scalability, and support customer deployments as a Cloud SRE/DevOps Engineer.
Top Skills:
Argo CD, AWS, Azure, Bash, CloudFormation, Flux CD, GCP, Go, Grafana, Helm, Kubernetes, Postgres, Prometheus, Python, Terraform
eCommerce • Software
As a Staff Site Reliability Engineer at Netlify, you will drive strategies for the reliability and scalability of its infrastructure, lead cross-organizational initiatives, mentor engineers, and develop frameworks for operational excellence. You'll be the technical authority during major incidents and work closely with stakeholders to integrate reliability considerations across the organization.
Hardware • Logistics • Software
As a Senior Site Reliability Engineer, you'll develop reliable warehouse automation applications, lead observability practices, and collaborate on cloud infrastructure decisions.
Top Skills:
Argo, Azure, Datadog, Flux, Grafana, Helm, Kubernetes, New Relic, Prometheus, Sentry, Terraform
Information Technology • Software • Consulting
As a Senior Site Reliability Engineer, you'll manage cloud infrastructure, ensuring uptime and optimizing performance through automation and reliability practices. You'll collaborate with engineering teams on microservices, develop CI/CD solutions, and educate staff on best practices, while participating in a 24x7 on-call rotation.
Top Skills:
Amazon Web Services (AWS), Ansible, Bash, Consul, Git, GitLab, Go, Google Cloud Platform (GCP), Groovy, Helm, Jenkins, Kubernetes, Packer, Python, Salt, Terraform, Vault
Software
This role involves designing scalable systems, automating infrastructure, managing incidents, optimizing performance, ensuring security compliance, and mentoring junior engineers within an SRE team.
Top Skills:
Ansible, AWS, Azure, containerd, CRI-O, Docker, GCP, Kubernetes, Terraform
Software • Cybersecurity
As a Senior Site Reliability Engineer at Bitwarden, you'll manage cloud infrastructure, optimize reliability and security, and lead incident response efforts while mentoring peers.
Top Skills:
C#, Git, GitOps, Go, Kubernetes, Pulumi, Python, Terraform
Artificial Intelligence • Software • Generative AI
This role involves designing, implementing, and maintaining cloud infrastructure, ensuring system reliability, mentoring junior engineers, and optimizing system performance and security.
Top Skills:
AWS, Azure, Docker, ELK Stack, GCP, Go, Grafana, Java, Kubernetes, Prometheus, Python, Terraform
Edtech
The Senior DevOps Engineer will design and implement infrastructure automation, ensure site reliability, and support application deployment while mentoring junior staff.
Top Skills:
Ansible, AWS, Bash, CentOS, Docker, Go, Groovy, Java, Jenkins, Linux, LXC, Node.js, PHP, Python, Ruby, Terraform
Security • Software
The Senior Site Reliability Engineer ensures system reliability by implementing automation, monitoring system availability, and collaborating with development teams on service objectives.
Top Skills:
AWS, CircleCI, GCP, GitHub Actions, Go, Grafana, Kubernetes, Prometheus, Python, Terraform