Senior Solutions Architect, Customer Success

Overview

  • NVIDIA is looking for a Senior Solutions Architect, Customer Success to join its NVIDIA Infrastructure Specialist Team.
  • Academic and commercial organizations around the world are using NVIDIA products to redefine deep learning and data analytics, and to power next-generation data centers.
  • Join the team building and advising on many of the largest and fastest AI/HPC systems in the world! We are looking for someone who blends deep technical expertise with a consultative, collaborative approach.
  • This role will engage directly with customers, partners, and multi-functional internal teams to assess infrastructure needs, architect scalable solutions, and guide the implementation of large-scale networking and AI infrastructure projects.
  • The scope spans networking, system design, and automation—serving as a trusted strategic advisor and the technical face of NVIDIA to key accounts.

What You’ll Be Doing

  • Serve as a senior technical authority and trusted consultant on NVIDIA technologies, contributing to architecture reviews, guiding infrastructure decisions at scale, and providing strategic recommendations aligned with each customer’s business objectives.
  • Establish and refine monitoring and optimization methodologies using analytics, telemetry, and automation to proactively detect bottlenecks, improve infrastructure resiliency, and drive continuous operational maturity.
  • Lead and advise on the analysis, optimization, and performance tuning of complex GPU-accelerated systems and AI workloads, ensuring high availability and efficiency across customer data centers.
  • Facilitate post-deployment reviews, incident retrospectives, and strategy sessions to shape the customer experience and deliver actionable insights into NVIDIA’s infrastructure roadmap.
  • Own and lead complex technical projects end-to-end—from initial discovery and solution design through implementation, knowledge transfer, and continuous improvement—ensuring alignment to SLAs and proactive mitigation of technical risks.
  • Support business growth by identifying AI infrastructure opportunities in cloud and enterprise environments, crafting compelling technical proposals, and driving initiatives that showcase NVIDIA’s leadership in this space.

What We Need to See

  • Education & Experience: BS/MS/PhD or equivalent experience in Computer Science, Electrical/Computer Engineering, Physics, Mathematics, or related fields, with 10+ years of professional experience in large-scale data center service operations with a focus on infrastructure.
  • NVIDIA GPU Expertise: Demonstrated hands-on experience deploying, configuring, and optimizing NVIDIA GPU-accelerated infrastructure, including driver and firmware management, CUDA toolkit integration, and GPU workload profiling and troubleshooting.
  • Customer Engagement: Track record of building long-term customer relationships and driving adoption through consultative engagement.
  • Analytical & Problem-Solving Skills: Strong analytical and decision-making capabilities, with a demonstrable ability to identify root causes, drive continuous improvement, and deliver resilient technical solutions.
  • System & Infrastructure Proficiency: Expertise in end-to-end data center architecture, spanning operating systems, Linux kernel drivers, GPU and NIC hardware, high-speed networking (InfiniBand, Ethernet, RDMA), and storage systems (Lustre, GPFS, NFS).
  • Leadership & Communication: Good communication, time management, and organizational skills, with the ability to lead complex multi-functional projects, guide technical teams, and present to executive partners.
  • Travel: Willingness to travel up to 25% for customer engagements.

Ways to Stand Out from the Crowd

  • Experience with Kubernetes for container orchestration, resource scheduling, and integration with GPU-accelerated workloads.
  • Familiarity with observability stacks (Grafana, Prometheus, Loki) for monitoring, alerting, and building fault-tolerant systems.
  • Experience with multi-tenant GPU cluster management and workload scheduling frameworks.
  • Experience with NVIDIA Base Command Manager (BCM) for provisioning, managing, and monitoring GPU clusters at scale.
  • Background with RDMA-based fabrics (InfiniBand or RoCE) in HPC or AI environments.
  • Knowledge of CI/CD pipelines, Infrastructure-as-Code (Terraform, Ansible), and GitOps workflows for infrastructure automation.

Sourced directly from NVIDIA’s career page

NVIDIA

2 Locations

Job ID
/job/UAE-Dubai/Senior-Solutions-Architect--Customer-Success_JR2016438-1
