Senior Site Reliability Engineer at PathAI
At PathAI, we're applying Computer Vision and Machine Learning in amazing ways to pathology, healthcare, and the detection of cancer and other diseases. We're applying our work to drug development and the clinical space, and pursuing it as a global health initiative. Since it's the early days for us, there's a lot of work to do and a lot of great products to build, and a well-engineered experience is critical to the success of everything we aspire to do.
We're looking for a skilled Site Reliability Engineer focused on designing, building, and operating our hybrid cloud/on-prem environment. This position will focus on our on-prem AI compute center, which will do the heavy lifting for our growing ML teams.
If you're the right candidate, you'll be exercising all the skills you have and building new ones along the way:
- Designing, building, and operating our new data center for our rapidly growing Machine Learning team
- Integrating our new data center with our existing cloud infrastructure to create a seamless hybrid cloud environment
- Using your knowledge of networking, storage, and Linux to create a robust and scalable environment for PathAI
- Improving the capacity of our infrastructure through capacity planning, budgeting and forecasting, and implementation
- Improving the reliability and resilience of our infrastructure through root-cause analysis and reviewing gaps in designs and implementations of our infrastructure
Our employees' skills come in all shapes and sizes, but to be successful in this role with us, you'll at least need:
- Engineering skills. You’re a generalist in the tech-ops space with knowledge of enterprise-grade hardware such as routers, firewalls, switches, load balancers, and storage arrays, as well as Linux systems.
- 7+ years of relevant experience
- Automation. You work hard to eliminate toil by automating everything through scripting, configuration management tools (Puppet/Chef/Ansible), code, and proper tooling.
- Operations experience. You’ve managed critical production infrastructure and are familiar with incident response, scaling, and the challenges that come with rapid growth.
- You’ve written tooling for SRE teams to use in their day-to-day work.
- Some experience and opinions on virtualization, containerization, or container orchestration platforms.
- A bachelor's degree in Computer Science or equivalent experience
- An insatiable intellectual curiosity and the ability to learn quickly in a complex space
For the right candidate, we'll offer a competitive salary plus equity. Your compensation is rounded out by a strong benefits package:
- Flexible work hours, with work-from-home options available for many roles
- Three weeks of paid leave per year, an additional two weeks of sick time, plus extended holidays and team-approved leave
- Ten days of 100% subsidized childcare per year
- Healthcare, vision, and dental insurance plans (HMO or PPO), with voluntary add-ons available for dependent care, life, and accident coverage
- Commuter benefit available for public transit or parking
- Convertible sit-stand desks
- Weekly in-office yoga classes
- Free in-office lunch on Tuesdays and Fridays
- Snacks and drinks in the office, which currently include a mountain of Milano cookies, endless Fruit Snacks, and cold brew coffee and kombucha on tap, among many other options. Our in-house Snackologist is also happy to take your requests!
Most importantly, you'll be doing meaningful work with a team of people you'll genuinely enjoy spending the day with.
PathAI is an equal opportunity employer, dedicated to creating a workplace that is free of harassment and discrimination. We base our employment decisions on business needs, job requirements, and qualifications — that's all. We do not discriminate based on race, gender, religion, health, personal beliefs, age, family or parental status, or any other status. We don't tolerate any kind of discrimination or bias, and we are looking for teammates who feel the same way.
PathAI does not accept unsolicited submissions from third parties.