Opens nvidia.wd5.myworkdayjobs.com in a new tab
Overview
- NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years.
- It’s a unique legacy of innovation that’s fueled by great technology—and amazing people.
- We are now seeking a highly motivated Infrastructure, Tools & AI Engineering Manager to join our Ethernet Switching group, working on SONiC Network OS.
- In this role, you will own and drive the engineering infrastructure that powers the full product development lifecycle — from development environments and CI pipelines through regression, code coverage, and test efficiency.
- You will apply cutting-edge AI and LLM capabilities to transform how we analyze failures, generate test coverage, and accelerate product quality.
What you’ll be doing:
- Design, build, and maintain scalable infrastructure for development, integration, and test environments supporting SONiC OS.
- Architect and deliver LLM-based tools for intelligent regression analysis: failure classification, root cause clustering, anomaly detection, and test flakiness prediction.
- Lead efforts to reduce regression runtime through parallelization, smart test selection, and dependency-aware scheduling.
- Develop deep technical knowledge of SONiC Network OS internals, including its subsystem architecture, SAI/ASIC abstraction layer, and management plane.
- Lead and mentor a team of infrastructure and tooling engineers; set technical direction, define priorities, and grow team capabilities.
What we need to see:
- B.Sc. degree or higher in Computer Science, Software Engineering, or a related field, or equivalent experience.
- 8+ overall years of software engineering experience, with at least 3 years in an infrastructure, DevOps, or tooling leadership role.
- Strong Python programming skills; experience building production-quality automation frameworks and tooling.
- Demonstrated experience designing and operating CI/CD systems at scale (Jenkins, GitLab CI, GitHub Actions, or equivalent).
- Hands-on experience with LLMs or AI-assisted developer tooling: building, integrating, or productizing AI capabilities in an engineering workflow.
- Proven ability to lead technical teams: hiring, mentoring, technical roadmapping, and cross-team influence.
- Strong analytical and problem-solving skills with a bias toward measurable outcomes and data-driven decisions.
Ways to stand out from the crowd:
- Deep Linux expertise: system internals, networking stack, process management, and scripting.
- Prior experience building LLM-powered test analysis pipelines or AI-enhanced DevOps tooling in a real production environment.
- Knowledge of networking protocols and hardware: Ethernet switching, L2/L3 protocols, QoS, VLANs, high-performance data center networking.
- Experience with code coverage instrumentation in large-scale C/Python codebases and using coverage data for test prioritization.
- Track record of measurably improving regression runtime, test reliability, or CI throughput in a complex embedded or systems software environment.
Sourced directly from NVIDIA’s career page
Job ID
/job/Israel-Raanana/Manager--Infra-Tools-AI_JR2016759