Deep Learning Algorithms Engineer - ACOT


Overview

  • NVIDIA is seeking a motivated AI Acceleration & Optimization Engineer to join our Acceleration Computing, Optimization and Tools (ACOT) team.
  • In this role, you will help improve the performance, scalability, and efficiency of modern AI models across NVIDIA GPU platforms.
  • You will work with engineers across algorithms, systems, and hardware to support high-performance model deployment and development for real-world AI workloads.
  • As part of ACOT, you will collaborate with architecture, research, CUDA, compiler, and framework teams to help bring next-generation AI workloads from research to production with strong performance and reliability.
What you will be doing

  • Assist in optimizing AI models such as LLMs, VLMs, diffusion models, and multimodal models for inference and training on NVIDIA GPUs.
  • Profile workloads and help identify performance bottlenecks across GPU compute, memory, networking, and storage.
  • Support the development and integration of optimization techniques such as quantization, kernel fusion, parallelism, and memory efficiency improvements.
  • Use tools including CUDA, TensorRT, Nsight, and NVIDIA acceleration libraries to analyze and improve model performance.
  • Work with deep learning frameworks including PyTorch, JAX, and TensorFlow, as well as open-source inference frameworks like vLLM and SGLang.
  • Contribute to performance benchmarking, testing, and internal tooling to improve optimization workflows.
  • Partner with senior engineers and cross-functional teams to evaluate workload behavior and support future performance improvements.
What we want to see

  • Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, Computer Engineering, or a related field (or equivalent experience).
  • 2–4 years of experience, or strong academic/project experience, in deep learning, performance engineering, systems, or high-performance computing.
  • Good understanding of deep learning fundamentals and modern AI model architectures, especially transformers.
  • Familiarity with GPU architecture and parallel computing concepts such as CUDA, kernels, memory hierarchy, and streams.
  • Exposure to profiling and performance analysis tools.
  • Programming skills in Python.
  • Experience with at least one major ML framework such as PyTorch, TensorFlow, or JAX.
Ways to stand out from the crowd

  • Internship, research, or project experience optimizing AI/ML workloads on GPUs.
  • Hands-on experience with TensorRT, TensorRT-LLM, vLLM, SGLang, or similar inference/runtime frameworks.
  • Familiarity with quantization, sparsity, or mixed-precision techniques.
  • Experience with distributed training or inference concepts.
  • Contributions to open-source ML systems, performance tools, or infrastructure projects.
  • Proficiency in C++, strong debugging skills, and an interest in low-level performance optimization.



NVIDIA

2 Locations

Job ID
/job/Vietnam-Ho-Chi-Minh-City/Deep-Learning-Algorithms-Engineer---ACOT_JR2015256
