
Deep Origin

Lead Technical DevOps / Infrastructure Engineer

Posted 6 Days Ago
In-Office
Yerevan
Senior level

Deep Origin is a biotech startup building an operating system for science that transforms how life science research is conducted. Led by Michael Antonov, co-founder of Oculus, and backed by Formic Ventures, we are redefining the infrastructure behind modern drug discovery. As we scale our AI-driven platform and strategic programs, exceptional talent is a critical lever in accelerating our mission to dramatically reduce disease and extend human healthspan.

About the role

We are looking for a Lead Technical DevOps / Infrastructure Engineer to join our existing DevOps team. This is a senior IC role with a broad technical scope: you will own complex initiatives end-to-end, drive collaboration across engineering and science teams, and set a high bar for how we build and operate infrastructure. A significant part of this role is supporting our R&D teams by running and evolving the compute clusters that power bioinformatics pipelines, ML training, and other HPC workloads.

This is a highly autonomous position: you will operate with minimal guidance, prioritize work independently, and take full ownership of infrastructure decisions and outcomes.


Requirements

Must-Have
  • 10+ years of infrastructure and DevOps engineering experience, with a proven track record in senior or lead IC roles
  • Ability to take end-to-end ownership of complex, multi-team initiatives and drive them from design through to production
  • Hands-on experience running HPC or research compute clusters: bare-metal provisioning, Slurm (or equivalent), GPU infrastructure, and shared storage (NFS, Lustre, or similar)
  • Comfortable operating in environments with a mix of cloud, VPS, and bare-metal systems, including legacy or non-standard setups
  • Experience supporting scientific or R&D teams with mixed workloads: long-running CPU batch jobs, GPU training jobs, and interactive compute
  • Deep, hands-on AWS expertise: EKS/Kubernetes, IAM, VPC networking, S3, RDS, and cost management
  • Solid Terraform skills and a principled approach to infrastructure-as-code
  • Strong Linux fundamentals and experience managing multi-node environments at scale
  • Experience owning and improving production observability systems (Prometheus/Grafana, OpenTelemetry, ELK, or similar)
  • Strong security fundamentals: threat modeling, least-privilege access design, vulnerability management, and compliance frameworks
  • Experience owning incident management end-to-end, including process design and continuous improvement
  • Excellent communication skills; able to work directly with researchers and scientists as well as with engineering and leadership
  • Fluent English
Nice-to-Have
  • Background in biotech, bioinformatics, or scientific computing environments
  • SOC 2 Type II audit experience
  • Monorepo tooling and developer platform engineering

Key Responsibilities 
  • Own our cloud infrastructure across AWS and third-party hosting and compute providers; ensure it is reliable, scalable, and cost-efficient
  • Own and operate bare-metal compute clusters: node provisioning, configuration management, networking, secure access, and ongoing reliability
  • Build and maintain configuration management using Ansible (or similar), ensuring reproducible and scalable server provisioning
  • Set up and maintain Slurm for job scheduling across CPU and GPU node pools; ensure researchers can submit, monitor, and manage jobs without DevOps involvement
  • Design and manage cluster networking: management and storage networks, inter-node communication, DNS, and secure perimeter access, including bastion/jump host setup
  • Manage Linux-based infrastructure hands-on, including networking, firewalls, VPNs, and performance tuning in distributed environments
  • Own disaster recovery and business continuity: define RTO/RPO targets, maintain runbooks, and run regular tests
  • Manage and optimize infrastructure spend through capacity planning, right-sizing, and intelligent use of reserved and spot capacity
  • Manage Kubernetes clusters, networking, and workload scheduling across cloud and on-premise environments
  • Enable infrastructure-as-code practices in Terraform; drive consistency, modularity, and auditability across the codebase
  • Evolve our observability platform: improve coverage, reduce alert noise, and ensure engineering teams have the visibility they need to detect and resolve issues quickly
  • Own security posture across the platform: IAM policies, secrets management, network segmentation, vulnerability management, and SOC 2 compliance
  • Lead incident management: on-call processes, escalation policies, runbooks, and blameless post-mortems
  • Drive CI/CD improvements and developer workflow initiatives that meaningfully increase engineering throughput
  • Evolve internal tooling and CLI infrastructure that engineering teams depend on daily
Values & Working Style
  • Ownership mindset: you take responsibility from A to Z
  • Comfortable navigating ambiguity in a fast-moving startup environment
  • Clear communicator who can collaborate across technical and non-technical teams
  • Pragmatic problem solver focused on impact

Why This Role Matters Now

As we scale our AI platform and expand into new initiatives, engineering velocity and platform reliability directly impact research outcomes and product milestones. This role plays a key part in strengthening our technical foundation during this phase of growth.
