SentinelOne

Senior Manager, Site Reliability Engineering

Posted 4 Days Ago
Remote
Hiring Remotely in United States
202K-278K Annually
Senior level
Lead the Site Reliability Engineering team to ensure product reliability and scalability. Collaborate with various teams to address performance and incident management, while promoting best practices and processes for SRE.
The summary above was generated by AI
About Us

At SentinelOne, we’re redefining cybersecurity by pushing the limits of what’s possible—leveraging AI-powered, data-driven innovation to stay ahead of tomorrow’s threats.

From building industry-leading products to cultivating an exceptional company culture, our core values guide everything we do. We’re looking for passionate individuals who thrive in collaborative environments and are eager to drive impact. If you’re excited about solving complex challenges in bold, innovative ways, we’d love to connect with you.

What are we looking for?

We are seeking an experienced Senior Manager with both engineering and operational depth to lead our Site Reliability Engineering (SRE) team at SentinelOne. As the Senior Manager of SRE, you will manage a team of SRE professionals responsible for ensuring the reliability and scalability of our products and production services, focusing on the experience our customers have in production every day. You will work closely with other engineering teams to identify and address availability, performance, and capacity issues, and you’ll be a key partner for our externally facing teams, including Support, Customer Success, and Sales Engineering. This is a highly visible role within S1, with frequent opportunities to communicate with executives, and a great opportunity to do good work with good people all around the world.

As a team we value

  • Thinking from first principles, understanding second order impacts 
  • Curiosity to understand new systems, their operating principles and limitations 
  • Strong operational ownership and a desire to reduce toil via automation
  • A drive to learn, especially from prior failures
  • Courage to take risks and make things happen 
  • Empathy and humility to collaborate effectively with peers and across teams

What will you do?

  • Grow and lead a team of SRE professionals, including setting performance goals and measuring deliverables against key metrics, while evolving those metrics as S1 grows and needs develop
  • Invest in data-driven deep triage on recurring issues, collaborating with other engineering teams to identify and address issues related to reliability, performance, and capacity
  • Develop, improve, and implement processes for the full incident lifecycle, including incident management, post-incident analysis, and learning from incidents
  • Lead incident response efforts, including coordinating with other teams to investigate and resolve customer-impacting incidents
  • Design a support model for SRE covering service maturity and service ownership, including monitoring and alerting improvements and SLI/SLO design and implementation
  • Analyze production metrics and signals to identify areas for improvement and take proactive steps to mitigate issues
  • Develop and implement best practices and standards for Site Reliability Engineering, from day-to-day operations to hiring and planning
  • Communicate effectively with cross-functional teams to ensure alignment on objectives and priorities. Deliver outcomes, not just stories and tasks. 

What skills and knowledge should you bring?

  • 8+ years of engineering experience, with at least 4 years in a management role
  • Demonstrated experience leading technical and operational teams at various stages of maturity 
  • Excellent analytical and problem-solving skills
  • Familiarity with modern software development methodologies, tools, and techniques including CI/CD
  • Experience working with cloud-native applications and large-scale distributed systems, including a working knowledge of technologies such as Kubernetes and Terraform/IaC, and cloud providers such as AWS or GCP
  • Experience with various monitoring and alerting techniques and tools, including frameworks and concepts such as SLOs, OTel and Golden Signals as well as tooling such as Prometheus and Grafana 
  • Extensive experience with incident response and management at various layers of the stack across different business needs and applications, including both hands-on experience leading incidents and post-incident analysis and experience driving broader incident management initiatives
  • Ability to thrive in a fast-paced, dynamic environment
  • Driven by curiosity and humility - complex distributed systems are complex, so ask the “silly” question and seek out answers

Why us?

You will be joining a cutting-edge company where you will tackle extraordinary challenges and work with the very best in the industry.

  • Medical, Vision, Dental, 401(k), Commuter, Health and Dependent FSA
  • Unlimited PTO
  • Industry-leading gender-neutral parental leave
  • Paid Company Holidays
  • Paid Sick Time
  • Employee stock purchase program
  • Disability and life insurance
  • Employee assistance program
  • Gym membership reimbursement
  • Cell phone reimbursement
  • Numerous company-sponsored events, including regular happy hours and team-building events

This U.S. role has a base pay range that will vary based on the location of the candidate. For some locations, a different pay range may apply.  If so, this range will be provided to you during the recruiting process. You can also reach out to the recruiter with any questions.

Base Salary Range
$202,400 - $278,300 USD

SentinelOne is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.

SentinelOne participates in the E-Verify Program for all U.S.-based roles.

Top Skills

AWS
CI/CD
GCP
Grafana
Kubernetes
OTel
Prometheus
Terraform
