Senior Quantization Engineer - Edge AI

Hyderabad Other

What You'll Do

We at NXP have an environment that fosters innovation.
Our team has technology experts who understand the big picture and mentors who coach passionate professionals to work on the most exciting challenges.
We share responsibilities in everything we do, where every point of view is valued. Join us!

Job Summary

We are seeking a highly skilled Edge AI Engineer/Scientist with a strong theoretical foundation in AI and solid software engineering expertise to contribute to our Edge AI Model Optimization program.
While the primary focus of this role is on model quantization, the scope also includes complementary optimization strategies such as speculative decoding, pruning, and other methods for ensuring highly efficient on-device deployment.
You will work at the forefront of innovation, bridging the gap between research and practice. The focus is on optimizing CNNs, Large Language Models (LLMs), and Vision Language Models (VLMs) to bring advanced capabilities to NXP’s Ara2 family of NPUs, directly supporting the future of on‑device intelligence.
If you want to shape the future of efficient on‑device AI, this is the place to be.

Job Responsibilities

Research: Actively survey the latest research (NeurIPS, ICLR, CVPR) on model optimization/compression, focusing particularly on neural network quantization, but also including other techniques like speculative decoding, pruning, etc.
Prototyping: Develop and adapt state-of-the-art methods to NXP’s hardware constraints, building POCs to showcase the effectiveness of these techniques on NXP HW.
Production Implementation: Translate research prototypes into robust, optimized production code (C++/Python), ensuring strict memory and compute efficiency standards.
Systems Integration: Document algorithmic tradeoffs, derive deployment recipes, and mentor the engineering team on numerical methods and optimization.
Cross-Functional Leadership: Act as the technical bridge between AI Research, Hardware Engineering and other teams, providing quantified guidance on how choices impact model accuracy and performance.
IP Generation: Contribute to NXP’s intellectual property portfolio through patents and technical publications.

Job Qualifications

Required Background

Education: MSc in Computer Science, Electrical Engineering, or Mathematics with a focus on Machine Learning or Deep Learning; a Ph.D. is a plus.
AI Expertise: Proven practical experience in AI/ML with a deep understanding of CNN architectures and Generative AI (Transformers, LLMs, VLMs, etc.).
Technical Stack: Strong hands-on experience with PyTorch, ONNX, and model conversion/optimization pipelines.
Software Engineering: Proficiency in Python and C++ and in software development best practices.
Embedded Mindset: Familiarity with the constraints of embedded systems (latency, power, memory bandwidth) and how code interacts with underlying hardware.

Preferred

Advanced AI: Experience with state-of-the-art quantization techniques for discriminative and generative AI (e.g., GPTQ, SpinQuant).
Hardware Acceleration: Experience with NPUs, device-level profiling, and diagnosing memory bottlenecks.
Kernel Development: Experience with custom kernel development is a plus.
Compilers: Knowledge of MLIR or TVM is a significant plus.
More information about NXP in India... #LI-29f4

Tools & Skills

Languages

Python, C++
