Get the job you really want.
Top Senior Site Reliability Engineer Jobs in Boston, MA
AdTech • eCommerce • Food • Marketing Tech • Retail
The Principal Site Reliability Engineer will design and lead site reliability practices, ensuring system resiliency and operational excellence, while mentoring junior staff and driving large-scale reliability initiatives.
Top Skills:
AKS, Argo CD, Datadog, Docker, GitHub Actions, Java, Kubernetes, Python, Redis, Spring Boot, Terraform, Tomcat
Reposted yesterday
Big Data • Cloud • Software • Database
As a Staff Engineer in the InfraSec team, you'll lead the design and deployment of security solutions for cloud platforms, automate monitoring, and manage security tooling while mentoring a small team of SREs.
Top Skills:
Ansible, AWS, Azure, CloudFormation, GCP, Go, Terraform
Reposted 10 days ago
Big Data • Cloud • Software • Database
This role involves building and maintaining observability services, ensuring service reliability, and collaborating with other teams on best practices.
Top Skills:
AWS, Fluent Bit, GCP, Jaeger, Kubernetes, Azure, Quickwit, Splunk, Vector, VictoriaMetrics
Reposted 12 days ago
Big Data • Cloud • Software • Database
As a Staff Site Reliability Engineer, you will empower developers by optimizing MongoDB Atlas, ensuring seamless performance across multiple cloud platforms while fostering a supportive culture.
Top Skills:
AWS, GCP, Azure, MongoDB
Sales • Software • Automation
Join the Infrastructure Team to build and maintain critical systems, automating database lifecycles and enhancing disaster recovery with a focus on resilience and simplicity.
Top Skills:
Ansible, Argo CD, AWS, ClickHouse, Docker, Elasticsearch, Flask, GitHub Actions, Grafana, Kubernetes, MongoDB, Postgres, Python, Redis, Terraform
Artificial Intelligence • Healthtech • Machine Learning • Natural Language Processing • Software
The AWS Cloud Architect will design, build, and optimize cloud infrastructure, ensuring scalability and security while mentoring junior SREs and defining cloud strategy.
Top Skills:
Ansible, AWS API Gateway, AWS CloudFront, AWS CloudTrail, AWS CloudWatch, AWS DocumentDB, AWS EC2, AWS EKS, AWS Lambda, AWS RDS, AWS S3, AWS Secrets Manager, AWS SSM, Docker, Grafana, HashiCorp Consul, HashiCorp Terraform, HashiCorp Vault, Kubernetes, New Relic, Prometheus
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will support, maintain and grow the Atlas platform, focusing on automating processes and running multi-cloud environments.
Top Skills:
AWS, Azure, DNS, GCP, Go, HTTP, Linux, Python, Ruby, TLS
Reposted 23 days ago
Big Data • Cloud • Software • Database
The role involves maintaining and improving CI/CD infrastructure using Argo Workflows and Kubernetes, ensuring effective deployment for engineering teams.
Top Skills:
AWS, Azure, Go, GCP, Kubernetes, Python
eCommerce • Legal Tech • Professional Services • Software • Data Privacy
The Site Reliability Engineer will ensure systems run smoothly, work with automation tools, resolve issues, and drive operational improvements.
Top Skills:
AWS, Azure, CloudFormation, Docker, GCP, Grafana, Kubernetes, Memcached, New Relic, OpenTelemetry, Postgres, Prometheus, Pulumi, Redis, Sentry, Terraform
Artificial Intelligence • Fintech • Information Technology • Software • Data Privacy
The Principal Site Reliability Engineer ensures SaaS products are fast and stable, optimizes performance, automates deployments, and champions best practices for system operations.
Top Skills:
.NET, AKS, Ansible, AppDynamics, Azure, Azure DevOps, Bash, C#, Cloud Networking, Cosmos, Datadog, Dynatrace, EKS, Firewall, Harness, Idera SQL Diagnostic Manager, Java, Jenkins, Kubernetes, Load Balancing, New Relic, PowerShell, Python, Redgate SQL Monitor, SolarWinds Database Performance Analyzer, SQL, Terraform
Fintech • Machine Learning • Payments • Software • Financial Services
Lead diverse technology projects in a fast-paced environment while improving performance and reliability of services using distributed microservices. Collaborate on cloud-based solutions and mentor other engineers.
Top Skills:
AWS, Cassandra, Docker, Kafka, Node.js, OpenSearch, Postgres
AdTech • eCommerce • Food • Marketing Tech • Retail
The Senior Site Reliability Engineer ensures system reliability and performance through automation and operational processes in a cloud-native environment, mentoring junior engineers and collaborating with cross-functional teams.
Top Skills:
AKS, Argo CD, AWS, Azure, Bash, Datadog, Docker, ELK, GCP, GitHub Actions, Go, Java, Kafka, Kubernetes, Prometheus, Python, Redis, Spring Boot, Terraform, Tomcat
Yesterday
Travel
The Senior Site Reliability Engineer will enhance platform tooling, drive automation of infrastructure components, and support teams by ensuring reliable and scalable cloud infrastructure on Google Cloud.
Top Skills:
Bash, Datadog, Google Cloud Platform, Helm, Istio, Kubernetes, Kustomize, Python, Terraform
Artificial Intelligence • Computer Vision • HR Tech • Machine Learning • Software
The Site Reliability Engineer II will manage and ensure the reliability and efficiency of SaaS application platforms, leveraging tools for automation, monitoring, and incident response while collaborating with various teams.
Top Skills:
Ansible, Argo CD, AWS, Azure, CIS, Docker, Elasticsearch, FIPS 140-2, FIPS 140-3, GCP, Go, Grafana, Helm, iptables, Java, Jenkins, Kubernetes, Linux, MongoDB, MSSQL, MySQL, Postgres, Prometheus, Python, SELinux, Solr, STIG, Terraform
Retail
The Lead Site Reliability Engineer will improve digital platform applications, manage projects, ensure system reliability, and handle production incidents.
Top Skills:
CI/CD, Istio, Kubernetes, Linux, Prometheus, Redis, Terraform
Artificial Intelligence • Machine Learning • Robotics • Automation
The role involves leading root cause analysis, troubleshooting production systems, improving system reliability, and collaborating across engineering and operations teams.
Top Skills:
Elastic, GitLab, Grafana, ITSM Tools, Jira, Kubernetes, LogicMonitor, Power BI, Prometheus, Tableau, VMware
Cloud • Software
As a Site Reliability Engineer, you'll manage technical escalations, ensure system reliability, collaborate with engineering teams, and participate in on-call rotations.
Top Skills:
Ansible, Azure, Bash, C#, Chef, ELK, Git, GitHub Actions, GitLab, Grafana, Jenkins, Linux/Unix, Prometheus, Pulumi, Python, Splunk, SVN, TCP/IP, Terraform
Artificial Intelligence • Blockchain • Internet of Things • Machine Learning • Software • App development • Automation
Join the Gigster Talent Network as an SRE Support Engineer, providing support for scalable applications and cloud services, including troubleshooting and improving internal tools.
Top Skills:
Ansible, AWS, Bash, Datadog, Docker, GCP, Grafana, Kafka, Kubernetes, Prometheus, Puppet, Python, Spark, Splunk, Terraform
News + Entertainment
The role involves designing scalable infrastructure, collaborating for reliability, automating monitoring and response tools, managing incidents, and promoting reliability culture at Netflix.
Top Skills:
AWS, Azure, GCP, Go, Java, Kubernetes, Python, Terraform
Information Technology • Security • Cybersecurity
The Staff/Principal Site Reliability Engineer leads infrastructure initiatives, architects solutions for cloud and SaaS, and collaborates cross-functionally to enhance reliability and innovation.
Top Skills:
AWS, Bash, Bazel, Cuelang, Datadog, GitOps, Go, Grafana, Helm, Kubernetes, Linux, Prometheus, Python, Terraform
Software • Energy
As a Lead Site Reliability Engineer, you'll manage the Product Reliability team, ensuring product performance, scalability, and availability while delivering technical improvements and mentoring team members.
Top Skills:
AWS, Docker, Kubernetes, Linux, Postgres, Python, RabbitMQ, Terraform
Blockchain • Software
As a Senior Engineer, SRE/DevOps, you will enhance blockchain infrastructure reliability, automate deployment, and collaborate on CI/CD practices while ensuring security and performance optimization.
Top Skills:
Ansible, AWS, Bash, CloudTrail, CloudWatch, Cosmos, Docker, ELK Stack, Ethereum, GCP, K8s, Kubernetes, Opsgenie, Pingdom, Python, Terraform
Reposted yesterday
Blockchain • Fintech • Financial Services • Cryptocurrency
The VP of Site Reliability Engineering will design and maintain large-scale Linux infrastructure, develop automation scripts, manage network devices, ensure security compliance, and support critical infrastructure.
Top Skills:
Ansible, AWS, Azure, Bash, BGP, Docker, GCP, Kubernetes, Linux, OSPF, Perl, Puppet, Python, SaltStack, VLANs
Financial Services
The Senior Cluster Site Reliability Engineer will enhance the research compute cluster's uptime, reliability, and performance through engineering and operational improvements, ensuring high availability for researchers working on machine learning problems.
Top Skills:
Ansible, AWS, Ceph, Docker, ELK, GCP, Grafana, Horovod, HPC, InfiniBand, Kubeflow, Kueue, Loki, Lustre, MLflow, OpenTelemetry, Podman, Prometheus, Python, RDMA, Ruby, S3, Singularity, Slurm, Terraform
Cloud • Security • Software • Analytics
As an SRE, you'll ensure scalability and reliability for Arista's CloudVision service, focusing on automation, performance, and safety in production environments.
Top Skills:
Ansible, Bash, Go, Google Cloud Platform, Google Kubernetes Engine, Kubernetes, Pulumi, Python