Bespoke Labs

DevOps / Site Reliability Engineer

Posted 2 Days Ago
Remote
Hiring Remotely in USA
Mid level
As a DevOps/Site Reliability Engineer, you will manage cloud infrastructure and CI/CD pipelines, and improve system reliability and performance while supporting AI data pipelines.
The summary above was generated by AI

About Bespoke Labs

Bespoke Labs is an AI research and data company building the datasets, benchmarks, and evaluation infrastructure that power frontier AI models. We're backed by leading investors and trusted by top AI labs, and our research has been accepted at venues like ICLR 2026. Our team is small, moves fast, and has an outsized impact on how the next generation of AI is built.

The Role

We're looking for a mid-level DevOps / Site Reliability Engineer to own and scale our cloud infrastructure. You'll work closely with engineering and ML teams to keep our systems reliable, observable, and fast — directly supporting the infrastructure that powers AI data pipelines at scale.

What You'll Do

  • Own cloud infrastructure on AWS — EC2, EKS, RDS, S3, IAM, VPC

  • Manage Kubernetes clusters and container orchestration end-to-end

  • Build and maintain CI/CD pipelines using GitHub Actions or similar

  • Implement monitoring, alerting, and observability stacks (Prometheus, Grafana, or Datadog)

  • Improve reliability, performance, and security of production systems

  • Automate infrastructure with Terraform or similar IaC tools

  • Debug and resolve issues across complex, distributed systems

  • Participate in design reviews and help raise the infrastructure bar

What We're Looking For

  • 3–5 years in DevOps, SRE, or infrastructure engineering

  • Strong AWS experience — EKS, EC2, RDS, S3, IAM

  • Kubernetes — deployment, scaling, troubleshooting in production

  • CI/CD pipelines — GitHub Actions, ArgoCD, or similar

  • Infrastructure as Code — Terraform, Pulumi, or CDK

  • Python or Go scripting

  • Experience working in production environments with real users

  • Comfort with ambiguity and ability to operate autonomously

Nice to Have

  • Experience supporting ML training workloads or GPU clusters

  • Familiarity with distributed computing or large-scale data pipelines

  • Prior work at an AI, ML, or data company

  • Open-source contributions or published technical writing

What We Offer

  • Competitive compensation and meaningful equity

  • Direct impact on frontier AI model training and evaluation infrastructure

  • Flexible, remote-friendly environment with low bureaucracy

  • A small, high-caliber team with deep AI research expertise

  • Health, wellness, and learning & development benefits
