Senior AI Engineer, NeMo Retriever - Model Optimization and MLOps

NVIDIA

Santa Clara (CA)

On-site

USD 184,000 - 356,500

Full time

2 days ago

Job summary

An innovative company at the forefront of the AI revolution is seeking an AI Engineer to join their dynamic team. This role involves developing and optimizing models for various AI applications, utilizing cutting-edge technologies and frameworks. You will collaborate with cross-functional teams to create integrated AI solutions, focusing on performance, scalability, and correctness. The ideal candidate will have extensive experience in machine learning, a passion for problem-solving, and a strong desire for continuous learning. Join a forward-thinking organization that values diversity and fosters an inclusive environment, offering competitive salaries and comprehensive perks.

Benefits

Equity

Comprehensive Benefits

Flexible Work Hours

Professional Development Opportunities

Qualifications

8+ years of experience in AI engineering or related roles.
Proficiency in Python and deep learning frameworks like PyTorch.

Responsibilities

Develop and maintain NIMs that containerize optimized models using OpenAPI standards.
Contribute to building and maintaining our Continuous Delivery pipeline.

Skills

Python

Deep Learning (PyTorch)

MLOps Tools (Docker, Kubernetes, Helm)

NLP

Generative AI

Cloud-based Software Delivery

Education

Bachelor's Degree in Computer Science

Master's Degree in Engineering

Tools

Docker

Kubernetes

Helm

TensorRT

TensorRT-LLM

Job Description: AI Engineer at NVIDIA

NVIDIA is at the forefront of the AI revolution, powering innovations from self-driving cars and robotics to intelligent assistants and information retrieval. We develop NVIDIA NIM, a platform that provides containers for GPU-accelerated inference microservices, supporting a range of AI models across multiple environments. These microservices expose industry-standard APIs for integration into AI applications and workflows, and are optimized with inference engines such as NVIDIA's TensorRT and the open-source TensorRT-LLM.

Our NeMo Retriever team builds multimodal extraction, re-ranking, and embedding pipelines that deliver high accuracy and data privacy for AI applications such as retrieval-augmented generation (RAG) and agentic AI workflows. We seek an AI Engineer who is passionate about machine learning development, system optimization, and MLOps, and eager to tackle complex problems in generative AI, large language models (LLMs), multimodal LLMs (MLLMs), and RAG using our hardware and software platforms.

What You'll Be Doing:

Develop and maintain NIMs that containerize optimized models and expose them through OpenAPI-standard interfaces, using Python or a similarly performant language.
Collaborate with partner teams to gather requirements, build and evaluate proofs of concept (POCs), and develop roadmaps for production tools.
Enable the development of integrated AI Blueprints that provide a unified, turnkey experience.
Contribute to building and maintaining our Continuous Delivery pipeline to ensure faster and safer deployment of changes.
Conduct peer reviews focusing on performance, scalability, and correctness.

What We Need To See:

Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
8+ years of experience in a similar or related role.
Proficiency in Python and Deep Learning frameworks like PyTorch.
Experience with cloud-based software delivery and infrastructure management.
Knowledge of MLOps tools such as Docker, Kubernetes, Helm, and data center deployments.
Familiarity with ML libraries, especially PyTorch, TensorRT, or TensorRT-LLM.
Deep understanding of NLP, LLM, MLLM, Generative AI, and RAG workflows.
Self-motivated with a passion for growth, continuous learning, and knowledge sharing.
Highly curious, enthusiastic about new technologies, and motivated to solve complex problems.

We offer competitive salaries and benefits, with a base salary range of $184,000 - $356,500, determined by location, experience, and internal pay scales. Additional benefits include equity and comprehensive perks. NVIDIA values diversity and is an equal opportunity employer, committed to fostering an inclusive environment free from discrimination.
