Overview
- We are looking for DL engineers passionate about building deep learning frameworks for large language model (LLM) and vision-language model (VLM) compression that push the boundaries of AI efficiency.
- In this role, you’ll collaborate with world-class teams across NVIDIA to advance both the software and hardware stack that powers modern AI.
- Join the team building software used by the entire world.
- Work with world-class engineers and researchers to build next-generation deep learning frameworks for compressing LLMs and VLMs through pruning, distillation, and neural architecture search (NAS).
- Work on the most powerful enterprise-grade GPU clusters, capable of hundreds of petaFLOPS, and on unreleased hardware before anyone else in the world.
- Are you ready for this challenge?

What you’ll be doing:
- Design and implement a deep learning framework for compressing large language and vision-language models to deliver highly optimized, high-performance AI systems used worldwide.
- Develop and integrate new algorithms for pruning, NAS, and distillation in collaboration with NVIDIA researchers and engineers.
- Experiment with compressing the latest LLMs and VLMs, analyzing their performance and behavior across diverse workloads.
- Collaborate with researchers and engineers across NVIDIA, providing guidance on improving the design, usability and performance of workloads.
- Lead best practices for building, testing, and releasing DL software.

What we need to see:
- 8+ years of experience in Deep Learning and SW Development.
- BSc, MS, or PhD degree in Computer Science, Computer Architecture, or a related technical field.
- Hands-on experience with LLM or VLM model training or inference.
- Excellent Python programming skills.
- Extensive knowledge of at least one DL framework (PyTorch, TensorFlow, JAX, MXNet), with practical experience in PyTorch required.
- Strong problem-solving and analytical skills.
- Solid grasp of algorithms and DL fundamentals.

Ways to stand out from the crowd:
- Experience applying and implementing model compression techniques such as pruning, NAS, distillation, and quantization.
- Experience building deep learning frameworks for training, inference, model compression, or related topics.
- GPU programming experience (CUDA or OpenCL) is a plus but not required.
- First-author publication in a top-tier deep learning or AI conference.
- NVIDIA is widely considered to be one of the technology world’s most desirable employers.
- We have some of the most brilliant and forward-thinking people in the world working for us.
- We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
- Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.
- For Poland: The base salary range is 292,500 PLN - 507,000 PLN for Level 4, and 375,000 PLN - 650,000 PLN for Level 5.


Sourced directly from NVIDIA’s career page
Job ID
/job/Poland-Warsaw/Deep-Learning-Engineer---LLM-and-VLM-Model-Compression_JR2014941