Senior Systems Software Engineer - GPU Performance at Scale


Overview

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

We are looking for a dedicated engineer for the Senior Systems Software Engineer role, focusing on GPU Performance at Scale. At NVIDIA, this role is uniquely positioned to drive innovation in AI and GPU computing. You will contribute to world-class computing hardware and software, fueling groundbreaking advancements in artificial intelligence. You will provide insights on large-scale system composition and tuning mechanisms for high-performance compute runs. You will collaborate with researchers, developers, and customers to craft improved workflows and develop new, leading solutions, and engage with HPC, OS, CPU, GPU compute, and systems specialists to architect, build, and optimize large-scale performance platforms.

What you'll be doing:

  • Lead the implementation of performance practices in large-scale GPU infrastructure, delivering powerful tools, methodologies, and flows to validate and improve multiple datacenter products concurrently.
  • Align next-generation AI workloads with next-generation datacenter builds for NVIDIA GPUs, CPUs, and networking hardware.
  • Engage early with HW/FW/SW/platform internal and customer teams.
  • Develop engineering solutions that provide continuous insights into the performance of AI workloads in evolving environments, generating swift insights into improvements and regressions.
  • Decompose high-complexity performance or stability issues into minimal reproduction cases, working towards identifying the root cause.
  • Participate in collaborations with various SW and FW teams (BMC/SBIOS/OS/drivers, etc.) to develop outstanding methods and tools.
  • Analyze, debug, and resolve critical firmware and software issues to achieve the highest AI workload performance at scale.

What we need to see:

  • Proven understanding of accelerated computing software stacks (CUDA).
  • Experience with modern cloud and container-based enterprise computing architectures, with Slurm preferred.
  • Strong programming and scripting experience in C/C++/Python/Bash.
  • Deep expertise in systems architecture and the impact of various components on performance.
  • Experience with container technology and Linux-based OSes, with Docker preferred.
  • Experience supporting high-performance computing or deep learning in engineering or academic research communities.
  • Strong teamwork and communication skills, coupled with results-focused analytical abilities.
  • BS in Engineering, Mathematics, Physics, or Computer Science (or equivalent experience), with 10+ years of applicable experience; MS or PhD desirable.

Ways to Stand Out From the Crowd:

  • End-to-end GPU performance engineering from the profiler to systems analysis.
  • Linux systems programming and optimization experience.
  • Exposure to virtualization techniques and cloud platform solutions.
  • Experience with scheduling and resource management systems.
  • Experience with large-scale HPC environments.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. For Poland, the base salary range is 292,500 PLN - 507,000 PLN for Level 4, and 375,000 PLN - 650,000 PLN for Level 5.



Sourced directly from NVIDIA’s career page

Your application goes straight to NVIDIA.


NVIDIA

2 Locations

Job ID
/job/Switzerland-Remote/Senior-Systems-Software-Engineer---GPU-Performance-at-Scale_JR2018155-1
