Senior AI Engineer, NeMo Retriever - Model Optimization and MLOps

NVIDIA

United States

Remote

USD 184,000 - 356,500

Full time

Yesterday

Job summary

Join a forward-thinking company at the forefront of AI technology as an AI Engineer. In this dynamic role, you will leverage your expertise in Python and Deep Learning to develop innovative solutions that power applications from self-driving cars to intelligent assistants. Collaborate with cross-functional teams to create optimized models and maintain a robust Continuous Delivery pipeline. This position offers a competitive salary and a generous benefits package in a vibrant work environment that encourages creativity and innovation. If you're passionate about solving complex problems in Generative AI and MLOps, this opportunity is perfect for you.

Benefits

Generous benefits package

Dynamic work environment

Competitive salaries

Qualifications

8+ years of experience in AI and machine learning development.
Expertise in Python and Deep Learning frameworks like PyTorch.

Responsibilities

Develop and maintain NIMs that containerize optimized models using OpenAPI standards.
Collaborate with partner teams to build POCs and develop roadmaps for production tools.

Skills

Python

Deep Learning (PyTorch)

MLOps

NLP

Generative AI

Education

Bachelor's Degree in Computer Science

Master's Degree in Engineering

Tools

Docker

Kubernetes

Helm

TensorRT

Job Description: AI Engineer at NVIDIA

NVIDIA's technology is at the heart of the AI revolution, powering applications from self-driving cars to intelligent assistants. We develop NVIDIA NIM, which provides containers for GPU-accelerated inferencing microservices for AI models across various platforms. Our NIM microservices expose industry-standard APIs for seamless integration into AI applications, and use optimized inference engines like NVIDIA TensorRT and TensorRT-LLM to reduce response latency and increase throughput.

About NeMo Retriever: NVIDIA NeMo Retriever is a collection of NIMs for building multimodal extraction, re-ranking, and embedding pipelines with high accuracy and data privacy. It supports AI applications such as retrieval-augmented generation (RAG) and agentic AI workflows.

Role Overview: We are seeking an AI Engineer to join our team, focusing on machine learning development, performance optimization, and MLOps. The ideal candidate will have a passion for solving complex problems in the Generative AI, LLM, MLLM, and RAG spaces, working with our hardware and software platforms to build flexible, multimodal retrievers and agents.

Responsibilities:

Develop and maintain NIMs that containerize optimized models using OpenAPI standards with Python or similar languages.
Collaborate with partner teams to understand requirements, build POCs, and develop roadmaps for production tools.
Enable the development of integrated AI Blueprints for a unified experience.
Maintain our Continuous Delivery pipeline to ensure faster, safer deployments.
Conduct peer reviews focusing on performance, scalability, and correctness.

Qualifications:

Bachelor’s or Master’s Degree in Computer Science, Engineering, or related field (or equivalent experience).
8+ years of relevant experience.
Expertise in Python and Deep Learning frameworks like PyTorch.
Experience with cloud-based software delivery and infrastructure patterns.
Knowledge of MLOps tools such as Docker, Kubernetes, and Helm, and of data center deployments.
Familiarity with ML libraries, especially PyTorch, TensorRT, or TensorRT-LLM.
Deep understanding of NLP, LLM, MLLM, Generative AI, and RAG workflows.
Self-motivated, eager to learn, and collaborative.
Passionate about emerging technologies and innovation.

We offer competitive salaries, a generous benefits package, and a dynamic work environment. The salary range is $184,000 - $356,500, determined by experience and location. NVIDIA is committed to diversity and equal opportunity in employment.
