We're on a mission to build the best platform in the world for engineers to understand and scale their systems, applications, and teams. We operate at high scale (trillions of data points per day), providing always-on alerting, metrics visualization, logs, and application tracing for tens of thousands of companies. Our engineering culture values pragmatism, honesty, and simplicity to solve hard problems the right way.
As an engineer on our Compute team, you will build the next-generation, scalable infrastructure platform that powers our products around the world. This team gives our service a competitive advantage in engineering velocity and resiliency.
You will work with product teams to meet the evolving needs of our high-throughput, low-latency distributed systems. The decisions you make will have a significant impact not only on Datadog’s infrastructure but also on that of our customers.
We're looking for engineers with strong backgrounds in modern infrastructure tech, cloud environments, and distributed systems development. Yes, this includes containers, orchestrators, and service mesh technologies.
- Design and implement our core infrastructure so that it remains scalable as we grow
- Maintain and debug day-to-day infra operations while assisting engineering teams who rely on it
- Work with our tooling teams to create abstractions that support resilient, high-velocity global deployment models across clouds without sacrificing availability
- Prototype and assess technologies in the evolving cloud-native landscape
- Research and experiment with new cloud computing technologies and vendors
- Focus on the observability of our infrastructure and deploy pipelines at scale
- Collaborate with a globally distributed engineering team
- Passion and demonstrated experience (at least 3 years) designing, building, and managing resilient applications and infrastructure at scale
- Experience beyond basic usage of orchestration platforms and container runtimes
- A mindset that treats resiliency and availability as a cornerstone of every change
- Strong focus and commitment to the evolving needs of engineering teams
- Good understanding of OS and Linux internals
- Significant programming experience in core project languages (Golang / Python / Ruby)
- SRE experience is a plus