What You'll Do
- Build AI/HPC infrastructure for new and existing customers.
- Support operational and reliability aspects of large-scale AI clusters, focusing on performance at scale, real-time monitoring, logging, and alerting.
- Engage in and improve the whole lifecycle of services—from inception and design through deployment, operation, and refinement.
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
- Provide feedback to internal teams, for example by opening bugs, documenting workarounds, and suggesting improvements.
What We Need To See
- BS/MS/PhD or equivalent experience in Computer Science, Electrical/Computer Engineering, Physics, Mathematics, or related fields.
- At least 5 years of professional experience in networking fundamentals, the TCP/IP stack, and data center architecture.
- Proficiency in configuring, testing, validating, and resolving issues in LAN networks, especially in medium to large-scale HPC/AI environments.
- Advanced knowledge of the EVPN, BGP, OSPF, and VXLAN protocols.
- Hands-on experience with network switch/router platforms like Cumulus Linux, SONiC, IOS, Junos OS, and EOS.
- Extensive experience delivering automated network provisioning solutions using tools like Ansible, Salt, and Python.
- Ability to develop CI/CD pipelines for network operations.
- Strong focus on customer needs and satisfaction.
- Self-motivated with leadership skills to work collaboratively with customers and internal teams.
- Strong written, verbal, and listening skills in English are essential.
Ways To Stand Out From The Crowd
- Familiarity with cloud networks (AWS, GCP, Azure) is a plus.
- Linux or Networking Certifications.
- Experience with high-performance computing architectures.
- Understanding of how job schedulers (Slurm, PBS) work.
- Knowledge of cluster management technologies (bonus credit for BCM, Base Command Manager).
- Experience with GPU (Graphics Processing Unit) focused hardware/software.
Sourced directly from NVIDIA’s career page
Your application goes straight to NVIDIA.
Job ID
/job/India-Mumbai/Senior-Solutions-Architect-Networking_JR2017796