2022 Site Reliability Engineer I at Chewy
Chewy seeks passionate, tech-minded undergraduate students to join our offices in Boston, MA as Site Reliability Engineers. Site Reliability Engineering is an engineering discipline that combines software and systems engineering practices to design, build and maintain large-scale production systems with high efficiency and availability. It is a highly specialized discipline that demands knowledge across systems, networking, coding, databases, capacity planning, and continuous delivery and deployment.
As part of a dynamic team at a high-growth company, you will work in a fast-paced environment where your work will have an immediate impact on our business. Our collaborative code environment will give you the opportunity to learn from and work closely with our passionate engineers.
What You’ll Do:
As a Site Reliability Engineer, you will define requirements for, design, develop, and maintain solutions for the core systems that run chewy.com.
- Design, implement and maintain large-scale clusters with monitoring, logging and alerting
- Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation and refinement
- Maintain services once they are live by measuring and monitoring availability, latency and overall system health
- Troubleshoot production issues, practice sustainable incident response, conduct postmortems and create runbooks
- Build and manage systems, infrastructure and applications through automation
- Write code using programming and/or scripting languages
- Communicate with Operations & Product team members
We are looking to bring in the next generation of engineers who will help us in our mission to become the most trusted and convenient destination for pet parents. Think you have what it takes to join our pack?
Possible start dates for this role are between January 2022 and June 2022.
What You’ll Need:
- Currently enrolled in a BS/BA degree program in Computer Science or a related field, or completed one within the last 6 months
- Solid foundation in a programming, scripting or infrastructure-as-code language such as Bash, Python, Java, Go or Terraform
- Basic knowledge of cloud technologies, e.g., Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) offerings on Amazon Web Services (AWS), Microsoft Azure or Google Cloud Platform (GCP)
- Demonstrated ability to quickly and accurately troubleshoot issues
- Interest in designing, analyzing and troubleshooting large-scale distributed systems
- Ability to debug and optimize code and automate routine tasks
- Systematic problem-solving approach coupled with strong communication skills and a sense of ownership and drive
- Strong analytical, structured problem solving, and decision-making abilities
- Excellent oral and written communication skills
- Position may require travel
- Work authorization without current or future employer sponsorship required
- Basic knowledge of configuration management systems like Chef, Puppet or Ansible
- Previous code contributions to open source projects or code samples on GitHub
If you have a disability under the Americans with Disabilities Act or similar law, or you require a religious accommodation, and you wish to discuss potential accommodations related to applying for employment at Chewy, please contact [email protected]