What You'll Do
- Build AI/HPC infrastructure for new and existing customers.
- Support operational and reliability aspects of large-scale AI clusters, focusing on performance at scale, real-time monitoring, logging, and alerting.
- Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
- Provide feedback to internal teams by opening bugs, documenting workarounds, and suggesting improvements.
What We Need to See
- BS/MS/PhD or equivalent experience in Computer Science, Electrical/Computer Engineering, Physics, Mathematics, or related fields.
- At least 8 years of professional experience in networking fundamentals, the TCP/IP stack, and data center architecture.
- Proficiency in configuring, testing, validating, and resolving issues in LAN and InfiniBand networks, especially in medium to large-scale HPC/AI environments.
- Advanced knowledge of the EVPN, BGP, OSPF, and VXLAN protocols.
- Hands-on experience with network switch/router platforms such as Cumulus Linux, SONiC, IOS, Junos OS, and EOS.
- Extensive experience delivering automated network provisioning solutions using tools like Ansible, Salt, and Python.
- Ability to develop CI/CD pipelines for network operations.
- Strong focus on customer needs and satisfaction.
- Self-motivated with leadership skills to work collaboratively with customers and internal teams.
- Strong written, verbal, and listening skills in English are essential.
Ways to Stand Out from the Crowd
- Familiarity with cloud networks (AWS, GCP, Azure) is a plus.
- Linux or networking certifications.
- Experience with high-performance computing architectures.
- Understanding of how job schedulers (Slurm, PBS) work.
- Knowledge of cluster management technologies (bonus credit for BCM (Base Command Manager)).
- Experience with GPU (Graphics Processing Unit) focused hardware/software.
With competitive salaries and a generous benefits package, NVIDIA is widely considered to be one of the most desirable employers in the world. We have some of the most brilliant and talented people in the world working for us. If you are creative, autonomous and love a challenge, we want to hear from you.

We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
Sourced directly from NVIDIA’s career page
Job ID
/job/Taiwan-Taipei/Senior-Solutions-Architect--Infiniband-and-Networking-Ethernet_JR1998378