Overview
NVIDIA’s accelerated computing platform is the foundation of modern HPC and AI. At the core of this platform are the CUDA Core Libraries: C++ and Python libraries that enable developers to write fast, reliable, and scalable GPU-accelerated software! We are hiring a full-time Software Engineer to work on the CUDA Core Libraries that power GPU computing for both C++ and Python developers. This includes projects such as CCCL (Thrust, CUB, libcudacxx), cuda-python, and numba-cuda. You will join the team building the foundational libraries, algorithms, and language/runtime infrastructure that make CUDA a speed-of-light experience for developers across deep learning, scientific computing, and data analytics!
What you’ll be doing:
- Develop and implement CUDA Core Libraries in C++ and/or Python, including parallel algorithms and idiomatic language bindings for core CUDA functionality.
- Compose, optimize, and evolve GPU algorithms and APIs, from high-level interfaces down to low-level performance tuning involving memory, parallelism, and synchronization.
- Own features end-to-end: design, implementation, testing, benchmarking, documentation, and long-term maintenance.
- Improve developer experience across the stack: CI, tests, benchmarks, packaging, examples, and docs.
- Collaborate with senior CUDA engineers in design reviews, code reviews, and open-source-style workflows.
- Engage with real users through issues, performance investigations, and API feedback.
What we need to see:
- BS, MS, or PhD in Computer Science, Computer Engineering, or a related field, or equivalent experience.
- 8+ years of related development experience.
- Strong programming skills in C++, Python, or both, with proven interest in systems-level software (performance, memory, concurrency, API design).
- Solid understanding of modern C++ (templates, generics, standard library) and/or Python library development and packaging.
- Practical experience with parallel or heterogeneous programming (CUDA, OpenMP, GPU-accelerated Python, or similar).
- Experience contributing to production software or open-source libraries, including testing, profiling, and code review.
- Ability to work independently, scope problems, and drive projects to completion.
- Clear written communication for technical design and documentation.
- Comfort navigating large, multi-language codebases (C++, Python, CMake, Pixi, CI systems).
Ways to stand out from the crowd:
- Strong understanding of CPU/GPU architecture and how hardware details affect performance.
- Hands-on experience with CUDA C++, CUDA Python, PyTorch, JAX, Numba, CuPy, or similar GPU-accelerated stacks.
- Familiarity with Thrust, CUB, libcudacxx, or other modern C++/GPU libraries.
- Experience with compiler infrastructure or tooling (LLVM, Clang tooling, MLIR).
- Demonstrated interest in developer tools, library design, and making other developers faster.
If you care deeply about performance, enjoy working at the C++/Python boundary, and want to shape the core CUDA libraries relied on by thousands of developers, this role is a direct fit.
Sourced directly from NVIDIA’s career page
Your application goes straight to NVIDIA.
Job ID
/job/Germany-Remote/Senior-Software-Engineer--CUDA-Core-Libraries_JR2014754