Opens nvidia.wd5.myworkdayjobs.com in a new tab
Overview
- NVIDIA is looking for a Senior Solutions Architect, Customer Success to join its NVIDIA Infrastructure Specialist Team.
- Academic and commercial organizations around the world are using NVIDIA products to redefine deep learning and data analytics, and to power next-generation data centers.
- Join the team building and advising on many of the largest and fastest AI/HPC systems in the world! We are looking for someone who blends deep technical expertise with a consultative, collaborative approach.
- This role will engage directly with customers, partners, and multi-functional internal teams to assess infrastructure needs, architect scalable solutions, and guide the implementation of large-scale networking and AI infrastructure projects.
- The scope spans networking, system design, and automation—serving as a trusted strategic advisor and the technical face of NVIDIA to key accounts.
What You’ll Be Doing
- Serve as a senior technical authority and trusted consultant on NVIDIA technologies, contributing to architecture reviews, guiding infrastructure decisions at scale, and providing strategic recommendations aligned with each customer’s business objectives.
- Establish and refine monitoring and optimization methodologies using analytics, telemetry, and automation to proactively detect bottlenecks, improve infrastructure resiliency, and drive continuous operational maturity.
- Lead and advise on the analysis, optimization, and performance tuning of complex GPU-accelerated systems and AI workloads, ensuring high availability and efficiency across customer data centers.
- Facilitate post-deployment reviews, incident retrospectives, and strategy sessions to shape the customer experience and deliver actionable insights into NVIDIA’s infrastructure roadmap.
- Own and lead complex technical projects end-to-end—from initial discovery and solution design through implementation, knowledge transfer, and continuous improvement—ensuring alignment to SLAs and proactive mitigation of technical risks.
- Support business growth by identifying AI infrastructure opportunities in cloud and enterprise environments, crafting compelling technical proposals, and driving initiatives that showcase NVIDIA’s leadership in this space.
What We Need to See
- Education & Experience: BS/MS/PhD or equivalent experience in Computer Science, Electrical/Computer Engineering, Physics, Mathematics, or related fields, with 10+ years of professional experience in large-scale data center service operations with a focus on infrastructure.
- NVIDIA GPU Expertise: Demonstrated hands-on experience deploying, configuring, and optimizing NVIDIA GPU-accelerated infrastructure, including driver and firmware management, CUDA toolkit integration, and GPU workload profiling and troubleshooting.
- Customer Engagement: Track record of building long-term customer relationships and driving adoption through consultative engagement.
- Analytical & Problem-Solving Skills: Strong analytical and decision-making capabilities, with a demonstrable ability to identify root causes, drive continuous improvement, and deliver resilient technical solutions.
- System & Infrastructure Proficiency: Expertise in end-to-end data center architecture, spanning operating systems, Linux kernel drivers, GPU and NIC hardware, high-speed networking (InfiniBand, Ethernet, RDMA), and storage systems (Lustre, GPFS, NFS).
- Leadership & Communication: Good communication, time management, and organizational skills, with the ability to lead complex multi-functional projects, guide technical teams, and present to executive partners.
- Travel: Willingness to travel up to 25% for customer engagements.
Ways to Stand Out from the Crowd
- Experience with Kubernetes for container orchestration, resource scheduling, and integration with GPU-accelerated workloads.
- Familiarity with observability stacks (Grafana, Prometheus, Loki) for monitoring, alerting, and building fault-tolerant systems.
- Experience with multi-tenant GPU cluster management and workload scheduling frameworks.
- Experience with NVIDIA Base Command Manager (BCM) for provisioning, managing, and monitoring GPU clusters at scale.
- Background with RDMA-based fabrics (InfiniBand or RoCE) in HPC or AI environments.
- Knowledge of CI/CD pipelines, Infrastructure-as-Code (Terraform, Ansible), and GitOps workflows for infrastructure automation.
Sourced directly from NVIDIA’s career page
Job ID
/job/UAE-Dubai/Senior-Solutions-Architect--Customer-Success_JR2016438-1