What You'll Do
- We at NXP have an environment that fosters innovation.
- Our team has technology experts who understand the big picture and mentors who coach passionate professionals to work on the most exciting challenges.
- We share responsibilities in everything we do, where every point of view is valued.
- Join us!
Job Summary
- We are seeking a highly skilled Edge AI Engineer/Scientist with a strong theoretical foundation in AI and solid software engineering expertise to contribute to our Edge AI Model Optimization program.
- While the primary focus of this role is on model quantization, the scope also includes complementary optimization strategies such as speculative decoding, pruning, and other methods for ensuring highly efficient on-device deployment.
- You will work at the forefront of innovation, bridging the gap between research and practice, with a focus on optimizing CNNs, Large Language Models (LLMs), and Vision Language Models (VLMs) to bring advanced capabilities to NXP’s Ara2 family of NPUs, directly supporting the future of on-device intelligence.
- If you want to shape the future of efficient on-device AI, this is the place to be.
Job Responsibilities
- Research: Actively survey the latest research (NeurIPS, ICLR, CVPR) on model optimization and compression, focusing particularly on neural network quantization but also covering other techniques such as speculative decoding and pruning.
- Prototyping: Develop and adapt state-of-the-art methods to NXP’s hardware constraints, building proofs of concept (POCs) to showcase the effectiveness of these techniques on NXP hardware.
- Production Implementation: Translate research prototypes into robust, optimized production code (C++/Python), ensuring strict memory and compute efficiency standards.
- Systems Integration: Document algorithmic tradeoffs, derive deployment recipes, and mentor the engineering team on numerical methods and optimization.
- Cross-Functional Leadership: Act as the technical bridge between AI Research, Hardware Engineering and other teams, providing quantified guidance on how choices impact model accuracy and performance.
- IP Generation: Contribute to NXP’s intellectual property portfolio through patents and technical publications.
Job Qualifications
Required Background
- Education: MSc (a Ph.D. is a plus) in Computer Science, Electrical Engineering, or Mathematics with a focus on Machine Learning or Deep Learning.
- AI Expertise: Proven practical experience in AI/ML with a deep understanding of CNN architectures and Generative AI (Transformers, LLMs, VLMs, etc.).
- Technical Stack: Strong hands-on experience with PyTorch, ONNX, and model conversion/optimization pipelines.
- Software Engineering: Proficient in Python and C++, and familiar with development best practices.
- Embedded Mindset: Familiarity with the constraints of embedded systems (latency, power, memory bandwidth) and how code interacts with underlying hardware.
Preferred
- Advanced AI: Experience with state-of-the-art quantization techniques for discriminative and generative AI (e.g., GPTQ, SpinQuant).
- Hardware Acceleration: Experience with NPUs, device-level profiling, and diagnosing memory bottlenecks.
- Kernel Development: Experience with custom kernel development is a plus.
- Compilers: Knowledge of MLIR or TVM is a significant plus.
- More information about NXP in India...
Sourced directly from NXP Semiconductors’s career page
Job ID
/job/Hyderabad/Senior-Quantization-Engineer---Edge-AI_R-10063467