Boulevard

Staff Reliability Engineer

Posted 3 Hours Ago
Remote
2 Locations
181K-259K Annually
Expert/Leader
As a Staff Site Reliability Engineer, you'll lead reliability strategies, improve system resilience, and mentor teams on best practices in a hands-on role.
The summary above was generated by AI

Who is Boulevard?

Boulevard provides the first and only client experience platform for appointment-based, self-care businesses. We empower our customers to give their clients more of the magical moments that matter most.

Before launching in 2016, our founders spent months interviewing salon managers and working behind front desks to understand their pain points so we could design a modern, user-friendly platform that meets the unique needs of their business. Our roots may be in hair salons, but we are built for the broader self-care industry, including many types of salons, spas, medspas, barbershops, and more. Our technology helps our customers not only survive but thrive. Take a look at how we (and YOU) can make that happen.

We have an insatiable curiosity and embrace experimentation. We believe that simple solutions require the most sophistication, and we design each and every detail to maximize potential, power, and impact. Do our values match? Read through our story and what we value the most.

Our team values and celebrates our diverse backgrounds. Being open about who we are and what we do allows us to do the best work of our lives. We believe in equal opportunity for all, and you should too.

Come do the best work of your life at Boulevard.

We’re hiring a Staff Site Reliability Engineer to shape the foundation of Site Reliability Engineering at Boulevard.

Here you will not just build infrastructure or tooling, but improve systems at scale, influence reliability across engineering, and drive a reliability strategy. You’ll help teams establish SLOs and build repeatable practices for how teams observe, debug, and improve their services.

Reporting to the Director of Cloud & Reliability, this hands-on technical leadership role will up-level reliability practices and build resilient approaches. You’ll help teams adopt best practices, define what “good” looks like, and partner with teams to get there.

The Cloud & Reliability group operates on four foundational principles.

  1. Reliable Infrastructure - a foundation of stability and security.
  2. Developer Productivity - empowering builders to do the right things.
  3. Clear Ownership - accountability aligned with ownership. Collaboration, not silos.
  4. Long-term Focus - we engineer for tomorrow.
Key Projects & Initiatives
  • Golden Paths to Production: Establish and evolve paved paths that make production-readiness the default for every service at Boulevard. Build shared tooling, templates, and deployment workflows that encode best practices for observability, testing, and resilience.
  • Shared Systems & Production Tooling: Develop core libraries, shared services, and self-service tooling that improve reliability, resilience, and developer efficiency.
  • Reliability & Fault Tolerance Improvements: Lead initiatives that make the platform more robust, fault-tolerant, and self-healing.
  • Observability & Operational Insight: Enhance Boulevard’s observability stack to turn data into action and insight into reliability. Expand metrics, logging, and tracing coverage across critical systems, ensuring full visibility into production health.
  • Platform Performance Optimization: Drive continuous improvement in system and application performance, ensuring services remain fast, reliable, and cost-efficient. Use observability data to identify bottlenecks and improve service efficiency across compute, network, and storage layers.
What You’ll Do Here
  • Define Boulevard’s Reliability Strategy: Lead the development and evolution of our reliability vision — establishing SLOs, SLIs, and error budgets that balance reliability, performance, and delivery speed. Partner with engineering and product teams to embed reliability as a measurable, shared responsibility.
  • Architect and Scale Resilient Systems: Partner with engineering teams to design, build, and operate scalable, fault-tolerant, and secure distributed systems that power Boulevard’s continued growth and customer trust.
  • Develop Production Tooling and Shared Systems: Create and maintain production-grade tooling, shared libraries, and services that improve system resilience and developer productivity. Build the foundations that make our platform more robust — and make reliability the default for every service.
  • Drive Observability and Operational Excellence: Elevate our observability stack — enhancing metrics, logging, tracing, and alerting — to enable actionable insights, faster incident resolution, and proactive reliability improvements.
  • Establish Golden Paths to Production: Define and maintain paved paths and best practices that enable developers to ship with quality, observability, and resilience built in. Reduce friction, eliminate toil, and make “doing it right” the easiest way to deliver software.
  • Optimize System and Application Performance: Leverage deep observability data to identify, prioritize, and remediate performance bottlenecks across services and infrastructure, ensuring consistently fast, reliable experiences.
  • Automate Everything: Champion automation to eliminate manual toil, streamline operational workflows, and build self-service tooling that empowers developers and embeds reliability into daily development practices.
  • Collaborate Cross-Functionally: Work closely with Product, Platform, and Security to integrate reliability principles into the software development lifecycle (SDLC), from design reviews to production operations.
  • Mentor and Influence Across Engineering: Act as a technical leader and mentor, guiding engineers in scalable system design, capacity planning, and operational excellence — fostering a culture where reliability is everyone’s responsibility.
What You’ll Need to Thrive
  • Deep Systems Expertise: 8–10+ years of experience in systems, infrastructure, or backend engineering, with a track record of building and operating distributed systems at scale. You have a deep understanding of reliability, scalability, and performance in complex, production-grade environments.
  • Reliability Engineering Mindset: Proven experience defining and delivering reliability outcomes through SLOs, SLIs, error budgets, and mature observability practices. You approach reliability as an engineering discipline, not an afterthought.
  • Automation-First Philosophy: Strong background in infrastructure-as-code, scripting, and automation (e.g., Terraform, Python, Go, or similar). You believe in eliminating manual toil and codifying operational excellence into reusable tools and systems.
  • Incident Management Mastery: Experienced in detecting, diagnosing, and mitigating production incidents in high-availability systems. You drive blameless postmortems and translate lessons learned into sustainable reliability improvements.
  • Collaboration & Influence: Exceptional communication and stakeholder management skills. You’re adept at aligning diverse teams, advocating for reliability practices, and influencing without authority — raising the operational bar across engineering.
  • Technical Leadership & Mentorship: Demonstrated ability to mentor engineers, set technical standards, and scale your impact through influence. You thrive on enabling others and shaping a reliability-first culture across the organization.
  • Comfort with Ambiguity: You thrive in dynamic, fast-moving environments. You excel at navigating uncertainty, setting direction where none exists, and iterating quickly toward meaningful impact.

Bonus:

  • Experience with Elixir, Ruby, or Rails.
  • Hands-on experience identifying and improving database performance.

How we’ll take care of you:  

Your starting total cash compensation for this role is between $181,125 and $258,750 depending on your current skills, experience, training, and overall market demands. This salary range is subject to change, and there is always room for growth and advancement.

In addition to the wonderful people you’ll get to work with and challenging projects that’ll push you, Boulevard is here to make sure you’re always at the top of your game emotionally, mentally, and physically.

  • ✨ We’ve got you covered with a 401(k) match plus dental, medical, vision, and life insurance. 

  • 🏝 Take a break whenever you need with our flexible vacation day policy. 

  • 🖥 Fully remote so you can choose where you want to work. You’ll receive a work from home stipend every month. 

  • 💚 Family planning resources and specialized support programs. 

  • 🔮 Equity: get ahead on the ground floor and grow with Boulevard. 

  • 💅 Boulevard Bucks Learning and Development program allows employees to explore businesses in the market we serve.


📲 We recommend following our official LinkedIn page to stay up to date on all things Boulevard life!

Boulevard Labs, Inc. is an Equal Opportunity Employer committed to hiring a diverse workforce and sustaining an inclusive culture. All employment decisions at Boulevard Labs, Inc. are based on business needs, job requirements, and individual qualifications, without regard to race, color, religion, marital status, age, national origin, ancestry, physical or mental disability, medical condition, pregnancy, gender, sexual orientation, gender identity or expression, veteran status, or any other status protected under federal, state, or local law.

Top Skills

Elixir
Go
Python
Ruby on Rails
Ruby
Terraform
