NVIDIA

Senior Solutions Architect, Continuous Bringup and Optimization - NVIS

Reposted 25 Days Ago
Remote
Hiring Remotely in Japan
Senior level
Lead and optimize GPU-accelerated systems and AI workloads, engage with customers to drive infrastructure initiatives, and ensure deployment success.
The summary above was generated by AI

NVIDIA is the world leader in computer graphics, artificial intelligence, and accelerated computing. For over 25 years, we have been at the forefront of research and engineering around the greatest advances in technology. Our history of innovation drives us to solve the world's hardest problems.

NVIDIA is looking for a Senior Industry SA/Customer Success/Partnership Solutions Architect to join its NVIDIA Infrastructure Specialist Team. Academic and commercial groups around the world are using NVIDIA products to redefine deep learning and data analytics, and to power data centers. We are building many of the largest and fastest AI/HPC systems in the world! We are looking for someone who can work on a dynamic, customer-focused team that requires excellent social skills. In this role, you will interact with customers, partners, and internal teams to analyze, define, and implement large-scale networking projects. The scope of these efforts includes a combination of networking, system design, and automation, as well as serving as the face of NVIDIA to the customer!

What you'll be doing:

  • Lead the hands-on analysis, optimization, and performance tuning of complex GPU-accelerated systems and AI workloads, ensuring high availability and efficiency across customer data centers.

  • Engage with NVIDIA strategic customers to drive AI infrastructure initiatives, support deployment success, and influence long-term platform adoption.

  • Serve as a senior technical authority on NVIDIA GPU, DPU, and networking technologies, contributing to architecture reviews and guiding infrastructure decisions at scale.

  • Collaborate with internal Engineering, Product, and Sales teams to align customer deployments with NVIDIA’s technology roadmap and business objectives.

  • Establish and refine monitoring and optimization methodologies using analytics, telemetry, and automation to detect bottlenecks and improve infrastructure resiliency.

  • Participate in post-deployment reviews, incident retrospectives, and strategic planning sessions to shape the customer experience and feed insights into NVIDIA’s infrastructure strategy.

  • Complete and lead complex technical projects from initial design through implementation and continuous improvement, ensuring alignment to SLAs and mitigation of technical risks.

  • Support business growth by identifying AI infrastructure opportunities in cloud and enterprise environments and driving technical initiatives that showcase NVIDIA’s leadership in this space.

What we need to see:

  • 10+ years of experience in large-scale data center service operations with a focus on infrastructure performance, backed by a Bachelor’s, Master’s, or PhD in Computer Science, Engineering, or a related field.

  • Strong analytical, problem-solving, and decision-making skills, with the ability to identify root causes, drive continuous improvement, and deliver resilient technical solutions.

  • Strong communication, time management, and organizational skills, with the ability to lead complex projects, guide technical teams, and meet important metrics.

  • Preferred certifications in data center, server, or networking technologies, and a willingness to travel up to 25% for customer engagements and team collaboration.

  • Proficiency in system-level aspects, encompassing Operating Systems, Linux kernel drivers, GPUs, NICs, and hardware architecture.

  • Demonstrated expertise in cloud orchestration software and job schedulers, including platforms like Kubernetes, Docker Swarm, and HPC-specific schedulers such as Slurm.

  • Familiarity with cloud-native technologies and their integration with traditional infrastructure is crucial.

  • Proficiency in both Japanese and English, with the ability to communicate complex technical topics clearly across multicultural teams and with customers.

Ways to stand out from the crowd:

  • Deep familiarity with AI infrastructure and workflows, including training/inference pipelines, MLOps/DevOps tools, containerization (Docker, Kubernetes), and large-scale system deployments.

  • Knowledge of data center infrastructure operations, including safety, security, environmental controls, and standard operating procedures.

  • Proven expertise in scaling complex systems, with deep experience in automation, orchestration, and performance optimization across compute, storage, and networking layers.

  • Good interpersonal and collaboration skills, with the ability to lead discussions, influence outcomes, and build positive relationships with both internal and external collaborators.

Top Skills

Docker
DPU
GPU
Kubernetes
Linux
Networking Technologies
Operating Systems
Slurm
