Principal Engineer, MLE, SMAI

What You'll Do

  • Architect and complete large-scale custom model training and fine-tuning jobs (SFT, RLHF) on multi-node, multi-GPU clusters.
  • Optimize training throughput and memory efficiency using distributed training strategies (FSDP, DeepSpeed, Megatron-LM) and mixed-precision techniques (FP16/BF16).
  • Build and develop autonomous AI Agents capable of multi-step reasoning, planning, and tool execution to automate complex manufacturing workflows.
  • Implement Agentic frameworks (e.g., LangChain, LangGraph, CrewAI) to orchestrate LLM interactions with internal APIs, databases, and software tools.
  • Profile and debug GPU performance bottlenecks using tools like Nsight Systems or PyTorch Profiler to improve hardware utilization.
  • Develop and sustain data/solution pipelines that support machine learning models and GenAI applications.
  • Build and optimize data structures in data management systems (Snowflake and Google Cloud platforms) to enable AI/ML and Agentic solutions.
  • Build and maintain CI/CD pipelines for machine learning and AI Agent solutions in the cloud.

Requirements

  • 10+ years of experience with deep expertise in GPU architecture (memory hierarchy, tensor cores, NVLink) and GPU resource management across cloud and on‑prem environments. 5+ years in performance optimization, parallel computing, and low-level systems.
  • Strong C++ skills and experience with GPGPU frameworks (CUDA preferred; HIP, OpenCL, or Metal also acceptable).
  • Hands-on experience building end-to-end ML systems, including distributed training techniques (DDP, FSDP, model parallelism) and automated pipelines for training, testing, and deployment.
  • Strong proficiency in LLMs, including prompt engineering, fine-tuning (LoRA/QLoRA), inference optimization (vLLM, TensorRT-LLM), and development of GenAI applications/agents using LangChain, LlamaIndex, AutoGen, and PyTorch.
  • Proficiency in Python (preferred) or Java is required, along with experience in CI/CD and cloud-native tools such as Git, Jenkins, Docker, and Kubernetes.
  • Candidates should have strong communication abilities and perform well in dynamic settings.
  • A Bachelor’s or Master’s degree or equivalent experience in Computer Science, Statistics, or a related field is expected.

Nice to Have

  • A Ph.D. in Computer Science or Statistics, or comparable experience, is highly desired.
  • Experience with HPC job schedulers (e.g., Slurm) and managing large-scale GPU workloads on Kubernetes using tools like Ray and Kubeflow.
  • Knowledge of CUDA programming, Triton kernels, and building custom C++ extensions for PyTorch to accelerate workloads.
  • Experience designing and orchestrating collaboration between specialized agents in multi-agent architectures.
  • Deep knowledge of mathematics, probability, statistics, and algorithms.
  • Proven track record of evolving data science prototypes into production systems, with knowledge of computer vision and/or signal processing techniques for classification and feature extraction.

Benefits and Additional Information

  • As a world leader in the semiconductor industry, Micron is dedicated to your personal wellbeing and professional growth.
  • Micron benefits are designed to help you stay well, provide peace of mind and help you prepare for the future.
  • We offer a choice of medical, dental and vision plans in all locations enabling team members to select the plans that best meet their family healthcare needs and budget.
  • Micron also provides benefit programs that help protect your income if you are unable to work due to illness or injury, and paid family leave.
  • Additionally, Micron benefits include a robust paid time-off program and paid holidays.
  • For additional information regarding the benefit programs available, please see the Benefits Guide posted on micron.com/careers/benefits.
  • Micron is proud to be an equal opportunity workplace and is an affirmative action employer.
  • To learn about your right to work click here.
  • To learn more about Micron, please visit micron.com/careers.
  • For US Sites Only: To request assistance with the application process and/or for reasonable accommodations, please contact Micron’s People Organization at hrsupport_na@micron.com or 1-800-336-8918 (select option #3).
  • Micron prohibits the use of child labor and complies with all applicable laws, rules, regulations, and other international and industry labor standards.
  • Micron does not charge candidates any recruitment fees or unlawfully collect any other payment from candidates as consideration for their employment with Micron.
  • AI alert: Candidates are encouraged to use AI tools to enhance their resume and/or application materials.
  • However, all information provided must be accurate and reflect the candidate's true skills and experiences.
  • Misuse of AI to fabricate or misrepresent qualifications will result in immediate disqualification.
  • Fraud alert: Micron advises job seekers to be cautious of unsolicited job offers and to verify the authenticity of any communication claiming to be from Micron by checking the official Micron careers website.

Sourced directly from Micron Technology’s career page

Micron Technology

Boise, ID - Main Site

Job ID
/job/Boise-ID---Main-Site/Principal-Engineer--MLE--SMAI_JR98099-1