Top Remote Senior Site Reliability Engineer Jobs in Boston, MA

Reposted 6 Days AgoSaved
Remote
2 Locations
Senior level
Senior level
Big Data • Cloud • Information Technology
The Site Reliability Engineer at Iron Mountain will troubleshoot escalated tickets, manage Windows Server builds, perform security patching, and collaborate with customers and vendors to resolve issues and maintain systems.
Top Skills: CloudComputeHyper-Converged InfrastructureLinuxMicrosoft Endpoint Configuration ManagerNetworkNutanixPowershellRubrikStorageVirtualizationWindows Server
6 Days AgoSaved
Remote
US
Senior level
Senior level
Information Technology • Software • Cybersecurity • Automation
Design, build, and operate an agentic platform to automate vulnerability remediation and incident response while ensuring reliability in security operations.
Top Skills: DatadogGitGrafanaLinearLlmsOpentelemetryPrometheusSlack
7 Days AgoSaved
Remote
USA
180K-210K Annually
Senior level
180K-210K Annually
Senior level
Artificial Intelligence • Insurance • Software • Automation
The Staff Site Reliability Engineer will build and scale infrastructure for Assured's platform, automate delivery, enhance observability, and lead mentoring initiatives.
Top Skills: AWSKubernetesPostgresTerraform
Reposted 7 Days AgoSaved
Remote
United States
170K-200K Annually
Senior level
170K-200K Annually
Senior level
Software
Lead SRE to define SRE strategy, architecture, and roadmap; design and operate containerized, compliant cloud environments; build observability, incident management, automation, and developer platform capabilities; mentor SRE team and collaborate with security, compliance, and product teams to ensure reliability at scale.
Top Skills: AWSAws MarketplaceAzureAzure MarketplaceGCPGoogle Cloud MarketplaceGrafanaKubernetesPrometheusTerraform
8 Days AgoSaved
Remote
USA
Mid level
Mid level
Information Technology • Software
As a DevOps/Site Reliability Engineer, you will manage cloud infrastructure, CI/CD pipelines, and improve system reliability and performance while supporting AI data pipelines.
Top Skills: AWSDatadogEc2EksGithub ActionsGoGrafanaIamKubernetesPrometheusPythonRdsS3Terraform
8 Days AgoSaved
Remote
United States
120K-160K Annually
Senior level
120K-160K Annually
Senior level
Healthtech • Other • Software
The role involves managing PostgreSQL services, ensuring high availability and performance, driving incident response, automating tasks, and improving observability for a 24x7 SaaS platform.
Top Skills: AnsibleBashDatadogGrafanaHaproxyNew RelicPgbackrestPgbouncerPostgresPowershellPrometheusPythonRepmgrTerraform
Reposted 8 Days AgoSaved
Remote
USA
Mid level
Mid level
Software • Analytics
The role involves automating and managing AWS infrastructure, ensuring reliability and scalability of stateful systems, and optimizing deployment processes. You'll also handle incident responses and improve operational tooling.
Top Skills: AWSKubernetesTerraformTerragrunt
Reposted 8 Days AgoSaved
Remote
US
136K-177K Annually
Senior level
136K-177K Annually
Senior level
Big Data • Machine Learning • Software • Analytics
As a Lead Site Reliability Engineer, you will drive the reliability strategy, improve system health, lead incident management, and mentor engineers for a multi-region SaaS platform.
Top Skills: ArgocdC++Ci/CdCloud PlatformsDatadogGitopsGrafanaInfrastructure As CodeJavaJavaScriptKubernetesPython
Reposted 8 Days AgoSaved
Remote
United States
Junior
Junior
Computer Vision • Information Technology • Machine Learning • Natural Language Processing • Real Estate • Software
The SRE will maintain infrastructure for SaaS products on AWS, support developers, manage platform components, and handle IT tasks.
Top Skills: AWSComputer VisionIacLarge Language ModelsNlpTerraform
Reposted 8 Days AgoSaved
Remote
United States
205K-270K Annually
Senior level
205K-270K Annually
Senior level
Artificial Intelligence • Other • Sales • Software
The role involves designing and advancing infrastructure for the engineering team, ensuring the reliability of Kubernetes clusters, automating operations, and building machine learning infrastructure.
Top Skills: ArgoAWSAzureCloudFormationFluxGithub ActionsGoGCPKubernetesPostgresPythonTerraform
9 Days AgoSaved
Remote
United States
66K-88K Annually
Mid level
66K-88K Annually
Mid level
Cloud • Information Technology
The Site Reliability Engineer I is responsible for supporting Backblaze’s infrastructure stability by addressing customer issues, monitoring system health, and improving operational processes through documentation and automation.
Top Skills: AnsibleLinuxZabbix
9 Days AgoSaved
Remote or Hybrid
United States
165K-190K Annually
Mid level
165K-190K Annually
Mid level
Artificial Intelligence • Healthtech • Information Technology • Software
As the first Site Reliability Engineer in the US, you'll ensure platform stability and oversee incident responses during PST hours, bridging infrastructure and code, while improving operability and compliance in a medical-device environment.
Top Skills: AWSElixirKubernetesTerraform
New

Track Smarter, Apply Better.

Ditch the spreadsheets. Organize your job search with our freeApplication Tracker.

Use For Free
Application Tracker Preview
9 Days AgoSaved
Remote
6 Locations
320K-489K Annually
Expert/Leader
320K-489K Annually
Expert/Leader
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
Lead the design and operation of large scale Kubernetes clusters, ensuring high availability and performance while supporting system lifecycle and reliability improvements.
Top Skills: ContainersGoKubernetesLinuxNetworkingOpenstackPerlPythonRuby
Reposted 18 Days AgoSaved
In-Office or Remote
2 Locations
95K-171K Annually
Junior
95K-171K Annually
Junior
Cloud • Security • Software • Cybersecurity
As a Site Reliability Engineer II, you'll automate tasks, monitor AI workloads, enhance dashboards, support CI/CD processes, and collaborate with engineering teams on complex issues while participating in on-call rotations.
Top Skills: GoGrafanaKubernetesLinuxPrometheusPythonSaltstackTerraform
Reposted 10 Days AgoSaved
Remote
USA
156K-288K Annually
Mid level
156K-288K Annually
Mid level
Computer Vision • Machine Learning • Software
As a Site Reliability Engineer, ensure the reliability, performance, and scalability of Ditto's cloud infrastructure by developing observability solutions, leading incident management, and collaborating with product engineering teams.
Top Skills: AWSAzureCDatadogGCPGoGrafanaHelmJavaKubernetesPrometheusRustTerraform
Reposted 10 Days AgoSaved
Remote
United States
Senior level
Senior level
Digital Media • Social Media • Software • Sports
Lead the technical architecture and execution of migration to AWS, drive developer enablement, and automate infrastructure using code-first principles.
Top Skills: Aws EksDatadogGithub ActionsGoIstioK6KubernetesNode.jsTerraform
Reposted 10 Days AgoSaved
Remote
United States
175K-275K Annually
Mid level
175K-275K Annually
Mid level
Software
As a Site Reliability Engineer, you'll enhance system reliability, collaborate on production readiness, define SLIs/SLOs, and improve incident response.
Top Skills: AWSDatadogGrafanaKubernetesOpentelemetryPrometheusTypescript
Reposted 10 Days AgoSaved
Remote
United States
Senior level
Senior level
Edtech
The Lead Software Engineer will lead the SRE team, focusing on reliability, performance optimization, security, and mentoring developers, while improving overall platform resilience.
Top Skills: ActivejobAnsibleAWSAws CloudwatchEc2EcsElasticsearchGitGCPGoogle Cloud StackdriverJenkinsJIRAKubernetesMemcachedMongoDBNew RelicNode.jsPostgresRedisRuby On RailsSidekiqSpinnakerTerraformTerragrunt
Reposted 16 Days AgoSaved
Easy Apply
Remote
United States
Easy Apply
130K-140K Annually
Senior level
130K-140K Annually
Senior level
Artificial Intelligence • Consumer Web • Digital Media • Information Technology • Social Impact • Software
The Senior Site Reliability Engineer will manage system incidents, improve monitoring and logging, optimize database infrastructure, and collaborate on scaling systems efficiently.
Top Skills: AWSClickhouseKubernetesMySQLPostgresRedis
Reposted 11 Days AgoSaved
Remote
United States
Senior level
Senior level
Healthtech
Develop and implement processes to ensure high availability and reliability of services. Responsibilities include incident management, automation, capacity planning, and risk mitigation.
Top Skills: AWSAzureDatadogDockerGrafanaJavaScriptNew RelicPrometheusPythonRubySplunkTerraform
Reposted 11 Days AgoSaved
Remote
United States
190K-215K Annually
Senior level
190K-215K Annually
Senior level
Internet of Things • Cybersecurity
The Site Reliability Engineer will manage AWS GovCloud infrastructure, ensuring compliance and high availability while driving automation, security, and incident response best practices.
Top Skills: AnsibleAws GovcloudBashDockerElk StackGitlab Ci/CdGrafanaJenkinsKubernetesPrometheusPythonTerraform
14 Days AgoSaved
Remote
United States
100K-110K Annually
Mid level
100K-110K Annually
Mid level
Healthtech • Software
The SRE Technical Project Manager will lead project delivery, incident management, automation processes, and uptime communication, partnering with SRE and development teams to ensure system stability and scalability.
Top Skills: Ai BotsDatadogJIRAJira Service ManagementMs TeamsOpsgeniePagerduty
Reposted 14 Days AgoSaved
Remote
USA
200K-250K Annually
Senior level
200K-250K Annually
Senior level
Software • Cryptocurrency
Manage and scale Kubernetes clusters, automate infrastructure, optimize performance, maintain blockchain nodes, and improve system reliability while collaborating with product teams.
Top Skills: Aws (Ec2Aws EksDatadogDockerIam)KubernetesOpentelemetryPulumiRdsS3Terraform
Reposted 14 Days AgoSaved
Remote
United States
165K-200K Annually
Senior level
165K-200K Annually
Senior level
Cloud • Information Technology
As a Staff Site Reliability Engineer, you will enhance cloud product lines, ensuring real-time scalability, collaborating with teams, and automating builds.
Top Skills: AnsibleAWSAzureBashDnsDockerEnvoyGCPGitGoGrafanaHaproxyHTTPJenkinsKafkaKubernetesLinuxMySQLOciOpentelemetryPostgresPrometheusPuppetPythonRedisTcp/IpTelegrafTerraformTls
Reposted 14 Days AgoSaved
Remote
United States
Senior level
Senior level
Software
As a Site Reliability Engineer, you will enhance system reliability, manage cloud services, respond to incidents, and support network systems.
Top Skills: AutomationCisco RoutingCloud ServicesF5 Load BalancingFortinet FirewallsInfrastructure AutomationMonitoringNetworking
All Filters
JobType
New Jobs
Job Category
Experience
Industry
Company Name
Company Size

Sign up now Access later

Create Free Account