Deep Learning Solution Architect


Benefits

We are widely considered to be one of the world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us, and, due to outstanding growth, our best-in-class engineering teams are rapidly expanding. If you're a creative and autonomous person with a real passion for technology, we want to hear from you.
What You Will Be Doing

  • Architect end-to-end solutions focused on LLM pretraining, fine-tuning, high-performance inference, RAG workflows, and agentic inference orchestration using NVIDIA’s hardware and software platforms.
  • Collaborate with customers to understand their LLM-related business challenges and design tailored solutions aligned with the NVIDIA ecosystem.
  • Lead LLM training, distributed optimization, and performance tuning to achieve optimal throughput, latency, and memory efficiency.
  • Design and integrate RAG workflows and agentic inference pipelines into customer systems; provide technical guidance on best practices.
  • Collaborate with NVIDIA engineering teams to provide feedback and support pre-sales technical activities (workshops, demos).
What We Need to See

  • Master’s or Ph.D. in Computer Science, Artificial Intelligence, or equivalent experience.
  • 4+ years of hands-on experience in AI, focusing on open-source LLM training, fine-tuning, and production inference optimization.
  • Deep understanding of mainstream LLM architectures and proficiency in LLM customization using PyTorch and Hugging Face Transformers.
  • Solid knowledge of GPU computing, cluster architecture, and distributed parallel training/inference for LLMs.
  • Competency in agentic inference design and in applying AI agents to solve business challenges.
  • Strong communication skills, with the ability to articulate complex technical concepts to both technical and non-technical stakeholders.
Ways to Stand Out from the Crowd

  • Hands-on experience with NVIDIA’s generative AI ecosystem (TRT-LLM, Megatron-LM, NVIDIA NeMo).
  • Advanced skills in LLM optimization (quantization, KV cache tuning, memory-footprint reduction).
  • Experience with Docker and Kubernetes for on-prem deployment of containerized LLM and agent workflows.
  • In-depth knowledge of multi-GPU parallelism and large-scale GPU cluster management.

Sourced directly from NVIDIA’s career page

Your application goes straight to NVIDIA.


NVIDIA

2 Locations

Job ID
/job/China-Beijing/Deep-Learning-Solution-Architect_JR2015520-1
