Maximum of 25 job preferences reached.
Top Senior Site Reliability Engineer Jobs in Boston, MA
Digital Media • Social Media • Software • Sports
Lead the technical architecture and execution of migration to AWS, drive developer enablement, and automate infrastructure using code-first principles.
Top Skills:
Aws EksDatadogGithub ActionsGoIstioK6KubernetesNode.jsTerraform
Software
As a Site Reliability Engineer, you'll enhance system reliability, collaborate on production readiness, define SLIs/SLOs, and improve incident response.
Top Skills:
AWSDatadogGrafanaKubernetesOpentelemetryPrometheusTypescript
Edtech
The Lead Software Engineer will lead the SRE team, focusing on reliability, performance optimization, security, and mentoring developers, while improving overall platform resilience.
Top Skills:
ActivejobAnsibleAWSAws CloudwatchEc2EcsElasticsearchGitGCPGoogle Cloud StackdriverJenkinsJIRAKubernetesMemcachedMongoDBNew RelicNode.jsPostgresRedisRuby On RailsSidekiqSpinnakerTerraformTerragrunt
Artificial Intelligence • Consumer Web • Digital Media • Information Technology • Social Impact • Software
The Senior Site Reliability Engineer will manage system incidents, improve monitoring and logging, optimize database infrastructure, and collaborate on scaling systems efficiently.
Top Skills:
AWSClickhouseKubernetesMySQLPostgresRedis
Healthtech
Develop and implement processes to ensure high availability and reliability of services. Responsibilities include incident management, automation, capacity planning, and risk mitigation.
Top Skills:
AWSAzureDatadogDockerGrafanaJavaScriptNew RelicPrometheusPythonRubySplunkTerraform
Internet of Things • Cybersecurity
The Site Reliability Engineer will manage AWS GovCloud infrastructure, ensuring compliance and high availability while driving automation, security, and incident response best practices.
Top Skills:
AnsibleAws GovcloudBashDockerElk StackGitlab Ci/CdGrafanaJenkinsKubernetesPrometheusPythonTerraform
Healthtech • Software
The SRE Technical Project Manager will lead project delivery, incident management, automation processes, and uptime communication, partnering with SRE and development teams to ensure system stability and scalability.
Top Skills:
Ai BotsDatadogJIRAJira Service ManagementMs TeamsOpsgeniePagerduty
Artificial Intelligence • Healthtech • Software
Design, build, and maintain secure and scalable infrastructure for critical healthcare applications, lead incident responses, and support engineering teams.
Top Skills:
BashGCPGoGrafanaHelmKubernetesPrometheusPythonTerraform
Software • Cryptocurrency
Manage and scale Kubernetes clusters, automate infrastructure, optimize performance, maintain blockchain nodes, and improve system reliability while collaborating with product teams.
Top Skills:
Aws (Ec2Aws EksDatadogDockerIam)KubernetesOpentelemetryPulumiRdsS3Terraform
Cloud • Information Technology
As a Staff Site Reliability Engineer, you will enhance cloud product lines, ensuring real-time scalability, collaborating with teams, and automating builds.
Top Skills:
AnsibleAWSAzureBashDnsDockerEnvoyGCPGitGoGrafanaHaproxyHTTPJenkinsKafkaKubernetesLinuxMySQLOciOpentelemetryPostgresPrometheusPuppetPythonRedisTcp/IpTelegrafTerraformTls
Software
As a Site Reliability Engineer, you will enhance system reliability, manage cloud services, respond to incidents, and support network systems.
Top Skills:
AutomationCisco RoutingCloud ServicesF5 Load BalancingFortinet FirewallsInfrastructure AutomationMonitoringNetworking
Cloud • Security • Cybersecurity
As a Junior Site Reliability Engineer, you will support cloud operations, implement automation for cloud infrastructure, and ensure system reliability and security.
Top Skills:
AnsibleAWSAzureBashElastic StackGCPJIRAPowershellPythonServicenowSplunkTerraform
New
Cut your apply time in half.
Use ourAI Assistantto automatically fill your job applications.
Use For Free
Healthtech • Information Technology • Telehealth
Lead Site Reliability Engineer responsible for ensuring cloud services reliability, automation, and performance while mentoring a team and collaborating cross-functionally. Drive initiatives to enhance incident management and enforce security compliance.
Top Skills:
AnsibleAWSAws CloudformationAzureBashDatadogDockerElk StackGoGCPGrafanaKubernetesPrometheusPuppetPythonTerraform
Big Data • Cloud • Software • Analytics
As a Site Reliability Engineering Intern, you'll monitor cloud services, assist in incident management, support automation, and collaborate with engineers to improve system reliability.
Top Skills:
AWSAzureDatabasesGCPGitGrafanaKafkaLinux/Unix SystemsPrometheusPythonShell ScriptingTerraform
Cloud • Information Technology
As a Staff Platform Engineer, you'll develop and maintain infrastructure components using Go and Node.js, improve service reliability, mentor juniors, and manage data ecosystems.
Top Skills:
EnvoyExpressGoJenkinsKafkaMySQLNode.jsPostgresPuppetPythonReactRedis
Cloud • Insurance • Payments • Software • Business Intelligence • App development • Big Data Analytics
As a Senior Site Reliability Engineer, you will ensure software reliability and scalability, manage IAC, CI/CD, monitor systems, and mentor junior engineers while collaborating across teams.
Top Skills:
AnsibleArgocdBashDatadogGithub ActionsGitlabGoHashicorp ConsulHelmKubernetesPackerPostgresPowershellPythonSQL ServerTerraformTypescript
AdTech • Marketing Tech
The Senior Software Engineer for Core Services SRE will maintain infrastructure, develop reliable systems, lead technical initiatives, and conduct security reviews.
Top Skills:
AerospikeAWSBoundaryConsulElasticsearchEnvoyGoGrafanaKafkaNginxNomadPackerPrometheusRdsRedisScylladbTerraformVagrantVaultWaypoint
Cloud • Software • Analytics
Join Arista Networks as a Site Reliability Engineer to manage CloudVision service reliability, scalability, and stability in a FedRAMP environment, focusing on areas like architecture, security, and performance optimization.
Top Skills:
AnsibleBashGCPGkeGoKubernetesPulumiPython
Software
As a Senior Site Reliability Engineer at Regrello, you'll shape the developer platform, collaborate with customers, and ensure the reliability and security of infrastructure and applications.
Top Skills:
AWSAzureCircleCIGCPGithub ActionsGitlab CiGoKubernetesTerraform
Artificial Intelligence • Fintech • Machine Learning • Natural Language Processing • Business Intelligence
Lead architecture and build reliability platforms, drive AIOps automation, champion SRE practices, lead incident response and postmortems, advance observability, and mentor engineers to improve system reliability and performance.
Top Skills:
AiopsAWSAzureContinuous ProfilingDatadogDnsElkGCPGoGrafanaHttp/SKubernetesLoad BalancingOpentelemetryPrometheusPythonTcp/Ip
Information Technology • Other • Software • Consulting
The Site Reliability Engineer at CardioOne will enhance the reliability and performance of production systems, implement automation, and lead incident response efforts while collaborating with development teams.
Top Skills:
AnsibleAWSAzureDatadogDockerEcsJavaKubernetesPythonTerraformTerragrunt
Fitness
The Staff Site Reliability Engineer will establish SRE best practices, drive observability strategy, implement software solutions, and mentor engineers. Responsibilities include improving platform resilience, managing risks, and participating in incident response processes.
Top Skills:
AnsibleAWSAzureBashCloudFormationGCPGoKubernetesPulumiPythonTerraform
Artificial Intelligence • Information Technology • Software • Generative AI • Automation
The Senior Site Reliability Engineer at Blitzy will ensure system reliability, scalability, and operational excellence for an AI software platform, engaging in infrastructure design, CI/CD pipeline management, and observability tools.
Top Skills:
AWSBashDatadogGoGrafanaKubernetesOpentelemetryPrometheusPythonTerraform
HR Tech • Software
The Site Reliability Engineer will architect and manage AWS infrastructure, implement CI/CD pipelines, lead incident responses, and mentor junior engineers to maintain reliability and security for a B2B SaaS platform.
Top Skills:
AlbAWSBashCloudFormationDatadogEcsGitGoGuarddutyJenkinsLambdaPythonS3Terraform
Information Technology • Cryptocurrency
The Site Reliability Engineer will lead technical initiatives, architect solutions, troubleshoot issues, mentor team members, and improve observability practices.
Top Skills:
ArgocdBashElk StackGCPGoGrafanaHelmKubernetesPrometheusPythonTerraform
Let Your Resume Do The Work
Upload your resume to be matched with jobs you're a great fit for.
Success! We'll use this to further personalize your experience.
Popular Job Searches
All Filters
Total selected ()
No Results
No Results











.png)

.png)
_0.png)

















