Team Lead, Engineering - Site Reliability
About Datadog:
We're on a mission to build the best platform in the world for engineers to understand and scale their systems, applications, and teams. We operate at high scale—trillions of data points per day—allowing for seamless collaboration and problem-solving among Dev, Ops and Security teams globally for tens of thousands of companies. Our engineering culture values pragmatism, honesty, and simplicity to solve hard problems the right way.
The Team:
The Site Reliability teams at Datadog keep our high-volume, low-latency environments performing around the clock. These teams collaborate closely with our product engineers so that Datadog can monitor millions of servers and containers and our customers always have dependable, actionable data at their fingertips. You’ll be responsible for shaping the infrastructure of our data-intensive, real-time services as we continue to grow at petabyte scale.
The Opportunity:
As an Engineering Team Lead for the SRE team, you will manage a team of engineers, own significant chunks of our architecture, design and build systems at scale, and shape product decisions. You'll work on challenging projects, make an impact, and grow as an engineer and a lead.
You Will:
- Solve a scaling bottleneck in a critical service
- Mentor other engineers on your team
- Design a new service and write an architecture RFC
- Deploy a new feature to production, progressively rolling it out with feature flags
- Investigate and fix a production issue from a service your team owns
- Plan the most important projects to work on next
You Are:
- You have been building applications for 4+ years and know the systems you’ve worked on from top to bottom
- You have significant backend programming experience
- You have managed a team of software engineers
- You have architected, built, and operated distributed systems to solve problems at high scale
- You have a BS/MS/PhD in a scientific field or equivalent experience
- You want to work in a fast-paced, high-growth startup environment that respects its engineers and customers
Bonus Points:
- You've shipped complex projects with teams of engineers
- You've worked at high scale with systems like Redis, Cassandra, and Kafka
- You have significant experience with Go, C, or Python
Is this you? Let's chat!
Equal Opportunity at Datadog:
Datadog is an Affirmative Action and Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements.
Your Privacy:
Any information you submit to Datadog as part of your application will be processed in accordance with Datadog’s Applicant and Candidate Privacy Notice.