What You'll Do
- The role will include using cutting-edge automation technologies and AI-based solutions to develop infrastructure capabilities spanning our internal cloud grid, provisioning pipelines, kernel and driver verification environments, and high-performance networking setups. We are looking for a motivated teammate who isn't afraid to learn new technologies, tackle complicated debugging problems, work closely with internal R&D teams, and develop modern tools that continually improve our server fleet and automation infrastructure.
What you'll be doing:
- Join our infrastructure team and develop best-in-class automation solutions for bare-metal server provisioning and network driver verification.
- Build and maintain Ansible playbooks and roles for full server lifecycle management — from OS installation and kernel configuration to OFED driver setup and production readiness.
- Develop Python solutions for hardware introspection, REST API integration, inventory management, and resource allocation across the server fleet.
- Develop virtualization and system capabilities (KVM, QEMU, libvirt, Vagrant, Docker, and Kubernetes) across a variety of operating systems and hardware architectures (x86_64, aarch64, ppc64le).
- Build and maintain Jenkins CI/CD pipelines (Groovy/Jenkinsfile) that orchestrate the full provisioning workflow from BIOS configuration through Ansible provisioning to automated validation.
- Be a part of an experienced team with a great atmosphere.
- Collaborate with multiple cross-domain teams — verification engineers, hardware teams, and cloud engineers — to provide the best infrastructure solutions to our customers.
What we need to see:
- B.Sc. (or equivalent experience) in Computer Engineering, Computer Science, or a related technical field.
- 5+ years of experience in Linux systems administration, infrastructure automation, or DevOps.
- Background in designing, implementing, and debugging automation software.
- Strong debugging and analytical skills.
- Experience in Python — scripting, REST API clients, subprocess management, and pip package management.
- Solid understanding of Linux — systemd, package management (dnf/yum, apt, zypper), kernel parameters, GRUB, sysctl tuning, NFS, and service management.
- Agility and multitasking.
- Strong collaboration and communication skills with peers and internal customers.
Ways to stand out from the crowd:
- Experience with Ansible (playbooks, roles, tags, idempotency) and infrastructure-as-code principles, as well as a background in Kubernetes, Vagrant (vagrant-libvirt), Docker, and KVM/QEMU/libvirt virtualization stacks.
- Familiarity with NVIDIA/Mellanox hardware — ConnectX NIC series, BlueField DPUs, MFT (Mellanox Firmware Tools), and RSHIM driver configuration.
- Hands-on experience with hardware management APIs such as Redfish (Dell iDRAC / HP iLO) and IPMI for automated BIOS and BMC configuration.
- Experience with performance tuning (hugepages, DPDK, NUMA, CPU pinning) for virtualization and high-performance networking workloads.
Sourced directly from NVIDIA’s career page
Your application goes straight to NVIDIA.
Job ID
/job/Israel-Yokneam/Senior-Linux-Infrastructure-and-Automation-Engineer_JR2015329