Sr. Engineer, Monitoring & Observability
We are a rapidly growing company that’s revolutionizing the way the restaurant industry does business by pairing technology with an unrivaled commitment to customer success. We help restaurants streamline operations, increase revenue, and deliver amazing guest experiences through our platform that combines restaurant point of sale, guest-facing technology, and award-winning customer support. As a Toaster, you will be challenged to take on meaningful projects that will help shape the future of the company. Join us as we empower the restaurant community to delight guests, do what they love, and thrive.
The Performance team is looking for a self-motivated individual who loves monitoring distributed systems. Toast engineering teams are pushing the boundaries of Android performance and building a highly reliable and scalable AWS-hosted platform that supports our fast-growing customer base. The team’s mission is to drive architectural decisions through observation, measurement, and validation. We build performance testing and observability frameworks that make it easy for engineers to quickly get self-service performance and scalability feedback about their proposed code and infrastructure changes. Join the Performance Engineering team to champion performance, deliver fast applications, and drive our platform to architectural excellence.
Recent projects include:
- Building out an observability framework that monitors the health and performance of our fleet of tens of thousands of devices in production.
- Building a simulation of a high-volume customer with Espresso, JMeter, and the ELK stack, which we use to run experiments.
- Deploying a synthetic monitoring solution in production that tracks, trends, and alerts on the performance of our critical transactions.
As a senior engineer for monitoring and observability, you will:
- Take ownership of the existing monitoring tools and observability practices
- Design and build systems to provide real-time operational insight to Toast engineering teams
- Use and build on top of best-in-class SaaS tools (Datadog, New Relic, Sumo Logic) when it makes sense, and build your own when it doesn't
- Help define the direction of monitoring systems across our Android devices and cloud-hosted infrastructure
- Partner with product and engineering teams to promote best practices and provide advice on how to implement features that are instrumented and observable
- Configure and instrument devices, applications, and servers, such as Android devices, Java-based services, and AWS resources, to report metrics into monitoring tools
- Generate, manage, and report on the application performance data captured by the monitoring tools, and proactively work with engineering teams to resolve performance issues
- Create and maintain operational dashboards for both real-time and historical trend views
- Mentor and train teams on how to use our monitoring tools
Do you have the right ingredients?
- Experience with designing and implementing monitoring infrastructure at a high-scale SaaS company
- Solid understanding of systems monitoring, alerting, and analytics (Splunk, Sumo Logic, New Relic, Dynatrace, Datadog, Librato, Graphite, ELK stack)
- Proficiency in production monitoring concepts, including synthetic, real-user, application performance, system, and log monitoring, as well as distributed tracing and dashboards