Supabase

Site Reliability Engineer

Posted 17 Hours Ago
Remote
Hiring Remotely in USA
Senior level
Embed with service teams to define SLIs/SLOs and error budgets, run Operational Readiness Reviews, improve incident-to-improvement pipelines, advise on resilience and architecture, reduce operational toil through automation, and shape org-wide on-call practices and operational maturity.
The summary above was generated by AI
About Supabase

Supabase is the Postgres development platform, built by developers for developers. We provide a complete backend solution including Database, Auth, Storage, Edge Functions, Realtime, and Vector Search. All services are deeply integrated and designed for growth.

About the Role

Supabase manages millions of Postgres instances and is growing. We have strong teams across observability, release engineering, and incident management — and we're concentrating our reliability efforts into a dedicated SRE practice that ties the discipline together across the platform.

You'll be embedded within Service Operations, and your primary job is to make every engineering team more reliable — not by owning their infrastructure, but by establishing the practices, frameworks, and feedback loops that let them own reliability themselves. You'll work across the org: sometimes setting the standard, sometimes pair-programming a fix, sometimes helping a team define their error budget, sometimes telling them it's exhausted.

This role is ideal for someone who has a strong vision for how SRE should work and thrives in async, fast-paced environments where influence matters more than authority.

What You'll Own
  • Partner with service teams to define meaningful SLIs and SLOs grounded in customer experience, and build the error budget policies that turn them into engineering decisions

  • Own and evolve the Operational Readiness Review (ORR) process — conducting reviews for new services and major changes across observability, alerting, runbooks, capacity, and graceful degradation

  • Strengthen the incident-to-improvement pipeline: connecting postmortem findings to operational readiness gaps, identifying repeat failure patterns, and driving systemic fixes

  • Act as the reliability expert teams pull in for architecture reviews, failure mode analysis, dependency mapping, and resilience design

  • Identify and quantify operational toil across the org, and build or advocate for automation that eliminates it

  • Help teams design sustainable on-call practices: alert quality, escalation paths, runbook coverage, and noise reduction

  • Track and report on org-wide operational maturity, surfacing systemic gaps and driving remediation

You Might Be a Good Fit If You
  • Have 7+ years of experience in SRE, production engineering, or reliability-focused roles, including experience shaping SRE practices and driving adoption across engineering teams

  • Have a software engineering mindset — you write code and build tools, not just configure them

  • Have hands-on experience defining and operationalizing SLOs/SLIs at scale, including error budget policies that actually influenced engineering decisions

  • Have deep experience with incident response, postmortem facilitation, and turning incident learnings into systemic improvements

  • Have worked with large-scale multi-tenant systems (bonus: managed database platforms or Postgres)

  • Are proficient with cloud infrastructure (AWS preferred) and infrastructure-as-code (Pulumi preferred, Terraform/CDK also acceptable)

  • Communicate clearly and persuasively — this role requires influencing without authority across a distributed org

  • Have experience in async or globally distributed teams

  • Are energized by making other teams more effective rather than being the one who fixes everything

Nice to Have
  • Experience with Kubernetes-based platform operations

  • Familiarity with OpenTelemetry, VictoriaMetrics, Grafana, or similar observability tooling

  • Experience building developer-facing reliability tooling (SLO dashboards, ORR frameworks, toil tracking, DORA metrics)

What We Offer
  • Fully Remote

    We hire globally. We believe you can do your best work from anywhere. There are no Supabase offices, but we provide a WeWork membership or co-working allowance you can use anywhere in the world.

  • ESOP

    Every team member receives equity ownership in the company through our ESOP. We want everyone to share in the upside of what we’re building together.

  • Tech Allowance

    Use this budget to set up your ideal work environment—laptop, monitor, headphones, or whatever helps you do your best work.

  • Health Benefits

    Supabase covers 100% of health insurance for employees and 80% for dependents, wherever you are. Your wellbeing and your family’s health are important to us.

  • Annual Off-Sites

    Once a year, the entire company gathers in a new city for a week of connection, collaboration, and fun. It’s a highlight of our year.

  • Flexible Work

    We operate asynchronously and trust you to manage your own time. You know what needs to be done and when.

  • Professional Development

    Every team member receives an annual education allowance to spend on learning—courses, books, conferences, or anything that supports your growth.

About the Team

Supabase was born-remote and open-source-first. We believe our globally distributed team is our secret weapon in building tools developers love.

  • 280+ team members

  • 55+ countries

  • 20+ languages spoken

  • $500M raised

  • 500,000+ community members

We move fast, build in public, and use what we ship. If it’s in your project, we probably use it in ours too. We believe deeply in the open-source ecosystem and strive to support—not replace—existing tools and communities.

Hiring Process

We keep things simple, async-friendly, and respectful of your time:

  1. Apply – Our team will review your application.

  2. Intro Call – A short video chat to get to know each other.

  3. Interviews – Up to four calls with:

    • Team Leads

    • Future teammates

    • Someone cross-functional from product, growth, or engineering (depending on the role)

    • Someone from our leadership/founding team

  4. Decision – We may follow up with a final question or go straight to offer.

All communication is remote and we aim to move fast.
