Senior Software Engineer, Data Center Workloads – Infrastructure

Overview

At NVIDIA, we have been pioneers in innovation for over 25 years, transforming computer graphics, PC gaming, and accelerated computing. Our team is driven by powerful technology and outstanding people who expand the limits of what’s achievable. Now, we are unlocking the potential of AI to usher in the next era of computing.

As part of our engineering organization, you will play a key hands-on role in developing and executing software-driven characterization workflows on NVIDIA rack-scale systems. This role is focused on running AI workloads across the full stack to analyze, characterize, and optimize power, performance, and drive behavior at the system level. This is an opportunity to work at the intersection of software, infrastructure, silicon, and large-scale AI platforms, with direct impact on next-generation NVIDIA systems.

What you’ll be doing:

  • Develop and run software tools, automation, and workloads to characterize power, performance, and drive behavior across NVIDIA rack-scale systems.
  • Execute AI and system-level workloads to stress and evaluate behavior across the stack, including GPUs, CPUs, networking, storage, firmware, drivers, and system software.
  • Build automated frameworks for data collection, telemetry, validation, correlation, and analysis of characterization results.
  • Investigate system behavior under different workloads and operating conditions to identify bottlenecks, anomalies, and optimization opportunities.
  • Work closely with hardware, firmware, driver, system software, performance, and validation teams to define characterization methodologies and debug cross-stack issues.
  • Support bring-up, validation, and readiness activities for new rack-scale platforms and AI infrastructure.
  • Create clear documentation, test flows, and repeatable processes to improve coverage, efficiency, and reproducibility.

What we need to see:

  • B.Sc. in Computer Science, Electrical Engineering, or a related field.
  • 5+ years of software engineering experience, preferably in system software, infrastructure, validation, or performance-focused environments.
  • Strong programming skills in Python and at least one system-level language such as C/C++.
  • Experience developing automation and test infrastructure for complex hardware/software systems.
  • Hands-on experience running, debugging, or optimizing AI, HPC, or large-scale system workloads.
  • Good understanding of system-level architecture, including interactions across hardware, firmware, drivers, operating systems, and application layers.
  • Experience working in Linux environments and with scripting, telemetry, logging, and data analysis tools.
  • Strong debugging and problem-solving skills, with the ability to work across multiple engineering disciplines.
  • Good communication skills and the ability to drive technical work in a fast-paced, cross-functional environment.

Ways to stand out from the crowd:

  • Experience with NVIDIA platforms, GPU systems, or rack-scale AI infrastructure.
  • Background in power, thermal, performance, or storage/drive characterization.
  • Experience with workload automation, cluster orchestration, or lab infrastructure.
  • Familiarity with AI benchmarks, training/inference workloads, and system stress methodologies.
  • Experience in post-silicon validation, production testing, or system bring-up.

Sourced directly from NVIDIA’s career page

Your application goes straight to NVIDIA.

NVIDIA

Israel, Yokneam

Job ID
/job/Israel-Yokneam/Senior-Software-Engineer--Data-Center-Workloads---Infrastructure_JR2017132
