Senior Software Engineer, AI Inference

Benefits

  • both your customers and the broader community.
  • Build internal tools, benchmarking harnesses, and automation pipelines that raise the productivity of your teammates and customers alike — with a multiplier attitude that makes everyone around you more effective.
  • Document architectures, findings, and recommendations with clarity for technical audiences, and contribute improvements back to vLLM and related open-source projects where appropriate.

What We Need to See

  • Bachelor's, Master's, or PhD in Computer Science, Computer Engineering, or equivalent experience.
  • 5+ years of industry experience building and operating complex, production-grade software systems, with strong instincts for how systems behave at scale.
  • Hands-on experience deploying and operating LLM inference workloads — particularly with vLLM — including configuration, optimization, and debugging in real-world environments.
  • Proficiency with container orchestration (Kubernetes) and HPC scheduling (Slurm) for running GPU-accelerated workloads.
  • Solid understanding of LLM serving fundamentals: batching strategies (continuous batching, chunked prefill), KV cache management, and tensor/pipeline parallelism.
  • Familiarity with GPU performance analysis: memory hierarchy, utilization, roofline modeling, and profiling with Nsight Systems or Nsight Compute.
  • Strong written and verbal communication skills, with the ability to present technical findings clearly to both engineering teams and leadership — and to navigate ambiguous, open-ended customer problems.

Ways to Stand Out from the Crowd

  • Experience with NVIDIA Dynamo or other disaggregated inference serving frameworks.
  • Contributions to open-source inference or ML systems projects, particularly vLLM or SGLang — please include links to relevant pull requests or artifacts.
  • Background with ML compilers or GPU kernel development (Triton, CUTLASS, TorchInductor).
  • Experience building developer tools or internal platforms that meaningfully improved team productivity.
  • Prior experience in a customer-facing or forward-deployed engineering capacity within a technical product organization.
  • Widely considered to be one of the technology world's most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package.
  • As you plan your future, see what we can offer to you and your family at www.nvidiabenefits.com/.
  • Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.
  • The base salary range is 135,000 CAD - 185,000 CAD for Level 3, and 170,000 CAD - 220,000 CAD for Level 4.
  • You will also be eligible for equity and benefits.
  • Applications for this job will be accepted at least until April 14, 2026.
  • This posting is for an existing vacancy.
  • NVIDIA uses AI tools in its recruiting processes.

Sourced directly from NVIDIA’s career page

NVIDIA

Canada, Toronto

Job ID
/job/Canada-Toronto/Senior-Software-Engineer--AI-Inference_JR2016014
