Senior HPC Software Engineer

Overview

  • Join our team as a Senior HPC Software Engineer.
  • At NVIDIA, you'll be part of the team shaping the future of computing and guaranteeing the smooth operation of our brand-new technologies.
  • Our mission is to leverage AI's power to build outstanding and pioneering solutions that have a significant impact on the world.

What you'll be doing:

  • Own the solutions you build, collaborating with cross-functional teams to successfully implement them.
  • Collaborate with various teams in a fast-paced environment to ensure seamless project completion.
  • Continuously improve solution provisioning and management through automation.
  • Detect performance issues and recommend solutions to maintain world-class service quality.
  • Conduct capacity management and planning to meet ongoing operational needs.
  • Participate in incident reviews, assist in root cause identification, and write RCA reports.
  • Deliver SRE solutions in a globally distributed, multi-cloud hybrid environment (AWS, GCP, and on-prem).
  • Participate in the team's on-call rotation.

What we need to see:

  • B.S. degree in Computer Science or a related technical field (or equivalent experience).
  • 8+ years of experience building and supporting critical services.
  • 5+ years of coding/scripting experience in at least two high-level programming languages such as Python, Go, Ruby, or Groovy.
  • Proficiency in Kubernetes administration, modern CI/CD techniques and Infrastructure as Code (IaC).
  • Full-stack AI experience with deep expertise in MCP ecosystems, Carpenter, n8n orchestration, and AI-assisted development via Cursor.
  • Expertise with at least one major cloud service provider (AWS, GCP, or Azure).
  • Demonstrated proficiency with end-to-end SRE capabilities and observability.
  • Proficient in monitoring, metrics gathering, APM, container management, and log collection tools.
  • Creative problem solver with excellent debugging skills and great communication and documentation abilities.

Ways to stand out from the crowd:

  • Linux certification from a well-known vendor (Red Hat, Oracle, etc.).
  • Prior experience managing large-scale Kubernetes deployments in production.
  • Strong skills in modern container networking and storage architecture.
  • Hands-on background working with FlexLM and license management systems.
  • Hands-on experience working with Slurm/LSF environments.

Sourced directly from NVIDIA’s career page

Your application goes straight to NVIDIA.

NVIDIA

Israel, Yokneam

Open roles at NVIDIA: 2000 positions
Job ID: /job/Israel-Yokneam/Senior-HPC-Software-Engineer_JR2013652
