Solution Architecture Intern, AI Infra - 2026

Overview

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years.
It’s a unique legacy of innovation that’s fueled by great technology—and amazing people.
Today, we’re tapping into the unlimited potential of AI to define the next era of computing.
An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world.
Doing what’s never been done before takes vision, innovation, and the world’s best talent.
As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work.
Come join the team and see how you can make a lasting impact on the world.
Join NVIDIA, a groundbreaking leader in AI computing and visual technologies, at the forefront of innovation.
As an AI in Industry Solution Architecture Intern, you'll be integral to our mission of redefining industries through AI and HPC.
Our Solution Architect team builds innovative AI computing platforms, analyzes applications, and delivers outstanding value to our customers.
This role offers a remarkable opportunity to harness NVIDIA's newest technologies to optimize large models, develop sophisticated AI workflows, and empower our clients with advanced AI solutions.
What you will be doing:
Develop an in‑depth understanding of open‑source inference frameworks such as SGLang and vLLM; collaborate with the community on new features and operators, performance optimization, and model enablement.
Design and implement CUDA kernels/operators (e.g., GEMM, attention and related primitives) for efficient and scalable LLM inference and training.
Develop and optimize KV‑cache offloading frameworks for LLM scenarios, enabling multi‑level KV‑cache offloading and reuse on CPU/SSD/remote storage to accelerate inference (team project: https://github.com/taco-project/FlexKV).
Take ownership of R&D work related to compute performance in distributed training, continuously exploring methods and techniques for performance optimization.
Conduct in‑depth research on computational problems in machine learning, summarize common computational patterns and requirements, and develop sample code, acceleration libraries, or framework components.
What we need to see:
Pursuing a Bachelor's, Master's, or PhD degree in Electrical Engineering, Automation, Computer Science, Computational Mathematics, or a related field.
Strong interest in accelerated computing, parallel computing, and heterogeneous computing, and willingness to dive deep into these areas.
Solid programming skills; good understanding of data structures and general concepts of computer systems.
Strong ability to learn and adapt, with good skills in analyzing and formulating problems and exploring solutions independently.
Ways to stand out from the crowd:
Familiarity with heterogeneous computing, distributed training, parallel computing, or other high‑performance computing areas.
Experience in performance analysis, performance modeling, or performance optimization, and contributions to open‑source frameworks.
Strong capability in defining new problems and exploring solution spaces; this is critical for the role.
Proficiency with AI‑assisted programming tools.
