NVIDIA

Senior HPC Dev Ops Engineer

Posted 19 Days Ago
In-Office
Westford, MA
208K-397K Annually
Expert/Leader
Transform next-generation HPC and quantum infrastructure into scalable architectures, oversee Linux administration, job scheduling, and workload optimization, and mentor teams.
The summary above was generated by AI

As the HPC Operations Engineer for the new Accelerated Quantum Center located in the Boston area, you’ll be the linchpin in transforming next-generation HPC + Quantum infrastructure into robust, scalable architectures, balancing technical feasibility with customer needs. Engage with a team recognized for its industry leadership, advanced GPU supercomputers, and dedication to supporting our engineers, developers, and customers. You will help develop an innovative hybrid HPC + Quantum computing infrastructure featuring a substantial NVIDIA GB200 NVL72 GPU cluster (572 GPUs) that will be integrated with diverse quantum computing platforms. The lead HPC Engineer will provide technical mentorship and system administration, optimize performance, and coordinate GPU compute and quantum system workloads.

What you'll be doing:

  • Build and operate a brand new hybrid compute environment spanning HPC and quantum systems.

  • Lead Linux provisioning, configuration management, and system tuning across hundreds of GPU nodes and supporting infrastructure.

  • Coordinate and optimize Slurm job scheduling — define policies, handle QoS, tune workloads, and help users translate research requirements into efficient batch workflows.

  • Coordinate data center tasks, partner with data center operations teams, and liaise with the quantum lab.

  • Integrate and sustain container orchestration (e.g., Singularity, Docker, or Kubernetes for HPC) to back simulation workloads and quantum job processing.

  • Run a storage environment consisting of Lustre, NFS, and cloud storage.

  • Work closely with quantum engineering teams to integrate quantum control nodes and orchestration gateways, and to facilitate data exchange between HPC and quantum systems.

  • Diagnose and improve performance for complex hybrid workloads, including CUDA, MPI, and CUDA-Q applications.

  • Develop and automate operational workflows with Ansible, GitHub Actions, and CI/CD pipelines.

  • Support researchers and developers with environment setup, debugging, and performance profiling on NVIDIA hardware and quantum simulators, and serve as the primary systems administrator and reliability owner for the GB200 GPU infrastructure.

What We Need to See:

  • Proven experience (12+ years) in HPC systems engineering or administration within large-scale Linux-based GPU environments.

  • Extensive expertise in Slurm, Linux systems administration, and proficiency in configuration management tools like Ansible or Base Command (previously known as Bright Computing).

  • Practical familiarity with NVIDIA GPU technologies, InfiniBand, RDMA, and high-speed networking configurations.

  • Proficiency in containerization and orchestration tools like Singularity, Docker, or Kubernetes.

  • Knowledge of data center operations, including rack power, cooling methods (such as liquid-cooled systems), and network management.

  • Ability to automate, script, install/compile applications, and optimize performance.

  • Bachelor’s degree in Computer Science, Electrical/Computer Engineering, Physics, or equivalent experience.

  • Outstanding problem-solving and diagnostic skills, and the ability to operate in a multidisciplinary, high-performance environment.

Ways to Stand Out from the Crowd:

  • Exposure to or familiarity with quantum computing systems (neutral-atom, trapped-ion, or superconducting).

  • Programming experience in Python, C/C++, or Shell Scripting for automation or performance tuning.

  • Knowledge of CUDA, MPI, and parallel programming paradigms.

  • Familiarity with NVIDIA HPC SDK, cuQuantum, or CUDA-Q toolchains.

  • Background supporting scientific or R&D computing environments.

This is a career-defining opportunity to work at the frontier of quantum-classical hybrid computing. You’ll help architect and operate a flagship facility that fuses GPU-accelerated HPC with quantum processors — advancing discovery in materials, life sciences, and innovative research. This is your chance to create a significant impact and push the boundaries of what's possible!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 208,000 USD - 333,500 USD for Level 5, and 248,000 USD - 396,750 USD for Level 6.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until November 4, 2025.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Top Skills

Ansible
C/C++
Cloud Storage
CUDA
Docker
HPC
Kubernetes
Linux
Lustre
MPI
NFS
NVIDIA GB200 NVL72 GPU Cluster
Python
Quantum Computing
Shell Scripting
Slurm
