Sr Site Reliability Engineer - DBE Automation and Observability at Grubhub (Greater Boston Area, MA or Remote)
More About the Role
Grubhub is looking for an experienced SRE who specializes in managing large, critical data persistence platforms, including Cassandra and Elasticsearch, on AWS. The Grubhub platform supports high-volume applications in a container-based microservice architecture running in multiple AWS regions in fully Active/Active mode. The entire platform is powered by a very large multi-datacenter Cassandra infrastructure for persistence, and by Elasticsearch for indexing and scaling the search and content experience. You will work with a team of passionate and skilled engineers responsible for the automation, scaling, tuning, and troubleshooting of Elasticsearch and Cassandra databases. You will also collaborate with a diverse group of engineers across the organization to design and engineer solutions.
We're all about connecting hungry diners with our network of over 300,000 restaurants nationwide. Innovative technology, user-friendly platforms and streamlined delivery capabilities set us apart and make us an industry leader in the world of online food ordering. When you join our team, you become part of a community that works together to innovate, solve problems, grow, work hard and have a ton of fun in the process!
Why Work For Us
Grubhub is a place where authentically fun culture meets innovation and teamwork. We believe in empowering people and opening doors for new opportunities. If you're looking for a place that values strong relationships and embraces diverse ideas, all while having fun together, Grubhub is the place for you!
The Impact You Will Make
- Manage large, critical Cassandra and Elasticsearch clusters supporting millions of transactions per day
- Build systems to automate all build and maintenance tasks using Ansible and Python
- Develop self-service tools that allow engineers to manage and provision resources in line with Grubhub best practices and standards
- Monitor cluster availability, read/write latencies, and other key performance metrics to proactively identify SLO misses and help mitigate issues
- Evaluate new technologies, tools, and software versions. Test, plan and develop roadmaps
- Tune Cassandra and Elasticsearch databases to optimize throughput and read/write latencies
- Participate in a 24x7 on-call rotation with the rest of the team for rapid incident response
- Implement DR strategies, including backup and recovery techniques with minimal downtime
- Work with other engineers to manage the integration and performance of our data persistence layer with the Grubhub platform
- Proactively monitor and scale Elasticsearch/Cassandra clusters to handle growth in traffic
What You Bring to the Table
- Experience developing backend applications in Python or Java
- Experience managing, working with, or developing on large Elasticsearch clusters in highly available 24x7 production environments
- Experience automating the maintenance of infrastructure using Python and Ansible or similar tools
- Strong experience managing automated cloud infrastructure on AWS or other major cloud providers
- Experience managing large Cassandra clusters in production is a strong plus
- Experience working with Docker is a plus
- Ability to quickly learn new concepts and technologies and adapt to changing needs
Additional Content
- How Grubhub uses Elasticsearch
- How Grubhub guarantees critical microservice actions
And Of Course, Perks!
- Flexible PTO. Grubhub employees enjoy a generous amount of time to recharge.
- Health and Wellness. Excellent medical benefits, employee network groups and paid parental leave are just a few of our programs to support your overall well-being.
- Competitive Pay. You'll receive a competitive base salary with eligibility for generous incentives, bonuses, commission or RSUs (role-specific).
- Learning and Career Growth. Your personal and professional development is a priority at Grubhub. We empower you to be a leader and grow your career through training, coaching and mentorship opportunities.
- MealPerks. Get meals on us! Our employees get a weekly Grubhub credit to enjoy and support local restaurants.
- Fun. Every Grubhub office has an employee-led Culture Crew that connects people through fun, meaningful events and initiatives like Wellness Wednesdays, Slack competitions and virtual happy hours!
- Social Impact. At Grubhub, we believe in giving back through programs like the Grubhub Community Relief Fund and our 2020 donation of $1 million to the Equal Justice Initiative. Employees are also given paid time off each year to support the causes that are important to them.
- COVID-19 Response. All of our employees are currently working from home and will be for the foreseeable future. We look forward to seeing everyone in-office when it's safe to return.
Grubhub is an equal opportunity employer. We welcome diversity and encourage a workplace that is just as diverse as the customers we serve. We evaluate qualified applicants without regard to race, color, religion, age, sex, sexual orientation, gender identity, national origin, disability, veteran status, and other legally protected characteristics. If you're applying for a job in the U.S. and need a reasonable accommodation for any part of the employment process, please send an email to [email protected] and let us know the nature of your request and contact information. Please note that only those inquiries concerning a request for reasonable accommodation will be responded to from this email address.