Senior AI Infrastructure Engineer - DGX Cloud

NVIDIA Corporation in Santa Clara (CA)

On-site

USD 144,000 - 270,250

Full time

11 days ago

Job summary

Join a forward-thinking company as a Senior AI Infrastructure Engineer, where you'll design and deploy cloud-based tooling to enhance operational excellence. This role offers the chance to work on innovative projects that streamline incident management and improve efficiency across teams. With a focus on building data pipelines and integrating tools, you'll play a key role in supporting executive leadership. This established industry player values creativity and autonomy, making it an exciting opportunity for those passionate about AI and cloud technologies.

Benefits

Equity options

Health insurance

Flexible working hours

Professional development

Diversity and inclusion programs

Qualifications

5+ years of experience in systems and software engineering.
Experience with infrastructure automation and distributed systems.

Responsibilities

Design and operate internal tooling based on cloud infrastructure.
Develop and maintain data pipelines for executive decision-making.

Skills

Python

TypeScript

C/C++

Java

Linux

Networking

Storage

Containers

Infrastructure Automation

Education

BSc in Computer Science

Tools

Kubernetes

Terraform

Docker

Helm

Hive

Apache Beam

Spark

Looker

Tableau

FireHydrant

Senior AI Infrastructure Engineer - DGX Cloud (Finance)

The DGX Cloud (DGXC) SRE team at NVIDIA ensures that our internal- and external-facing GPU cloud services run with the reliability and uptime promised to users. We enable developers to make changes to the existing system through careful preparation and planning, while monitoring capacity, latency, and performance.

We are seeking systems and software engineers interested in building tooling, reporting, automation, and ML solutions to enable operational excellence across a dynamic organization, solving technical problems that improve operational efficiency across multiple teams.

What you'll be doing:

Design, build, deploy, and operate internal tooling based on cloud infrastructure to support operational excellence.
Develop, implement, and maintain data pipelines used by executive leadership for decision-making.
Integrate tooling with internal and customer workflows, as well as cloud service providers, to streamline incident management processes.
Reduce operational toil related to incident handling, postmortems, and on-call tasks.
Promote sustainable, blameless incident prevention and response practices.
Provide operational best practices consultation to peer teams.

What we need to see:

BSc in Computer Science or a related technical field involving coding, or equivalent experience.
5+ years of relevant experience.
A proven track record of initiating projects, collaborating effectively, and contributing to team projects.
Experience with infrastructure automation and designing distributed systems for large-scale cloud environments.
Proficiency in one or more of the following: Python, Go, TypeScript, C/C++, Java.
Deep knowledge of Linux, Networking, Storage, or Containers.

Ways to stand out:

Experience with incident tooling such as FireHydrant, Rootly, incident.io, or Blameless, including plugin and schema development in Backstage.
Background in infrastructure technologies like Kubernetes, Terraform, Docker, and Helm, and in basic ML/data science tools like Hive, Apache Beam, and Spark.
Experience with business analytics tools such as Looker or Tableau, and a systematic approach to problem-solving, communication, ownership, and initiative.

NVIDIA is recognized as one of the most desirable employers in the tech industry, known for innovative work in AI, HPC, and Visualization. Our inventions, including the GPU, are central to modern computing. We invite creative, autonomous, and motivated individuals to join us.

The salary range is $144,000 - $270,250, determined by location, experience, and comparable roles. Additional benefits and equity are offered. Applications are accepted on an ongoing basis.

NVIDIA is committed to diversity and equal opportunity, welcoming applicants regardless of race, religion, gender, age, or other protected characteristics.
