Software Engineer, Training Infrastructure Engineering Mountain View, California, US

DeepMind Technologies Limited

Canada

Hybrid

USD 189,000 - 350,000

Full time

30 days ago

Job summary

An established industry player seeks a skilled Software Engineer to join a dynamic team focused on advancing artificial intelligence. In this individual contributor role, you will tackle critical research challenges, improving mid- and post-training workloads while mentoring junior team members. Your expertise in building large-scale infrastructure for research, coupled with a strong foundation in software development and distributed systems, will be pivotal in driving innovation. This role offers the opportunity to work collaboratively across a diverse team, contributing to groundbreaking research and making a significant impact in the field of AI.

Qualifications

8+ years of software development experience with data structures and algorithms.
Experience building large scale infrastructure for research in Deep Learning.

Responsibilities

Translate research requirements into technical roadmaps with team collaboration.
Lead implementation and documentation of research infrastructure.

Skills

Software Development

Data Structures

Algorithms

Technical Writing

Reinforcement Learning

Performance Profiling

Distributed Systems

Education

Bachelor's degree or equivalent practical experience

Tools

JAX

XLA stack

Snapshot

Artificial Intelligence could be one of humanity’s most useful inventions. At Google DeepMind, we’re a team of scientists, engineers, machine learning experts and more, working together to advance the state of the art in artificial intelligence. We use our technologies for widespread public benefit and scientific discovery, and collaborate with others on critical challenges, ensuring safety and ethics are the highest priority.

You will join a research engineering team of about 10 people working on Gemini. The team embeds in high-priority, strategic research efforts, accelerating experimental iteration by improving the quality and capability of the tools and technology available to build large-scale training systems. The team's core expertise is in reinforcement learning infrastructure and methods, distributed systems and accelerators.

The team is distributed across the US, Canada, France and the UK, and operates very collaboratively, supporting efforts across Google DeepMind.

The Role

This is an individual contributor position in which you will collaborate with the other engineers in the team on unblocking progress in critical research challenges, from the scoping of technical roadmaps, and the design and implementation of new infrastructure, to the design, execution and analysis of experiments.

As an experienced Software Engineer you will naturally gravitate toward infra-heavy tasks, take responsibility for improving the performance and efficiency of mid- and post-training workloads, and be a role model for more junior team members.

You will align with Gemini priorities, flexibly ramping up on new research problem spaces and working effectively with a broad range of collaborators across the organization. You will work with the team TL to steer the team's direction and select new efforts to engage with.

Key Responsibilities

Translate research requirements into technical roadmaps in collaboration with the other team members
Execute and lead on the implementation and documentation of research infra
Learn about the research problem space the team works in, upskill, and be able to contribute to each effort's research agenda
Support growth of more junior team members
Add to the team culture, and be a role model of sustainability and excellence

About You

Bachelor's degree or equivalent practical experience.
8 years of experience in software development, and with data structures/algorithms.
The ideal candidate will have 5 years of experience building, testing, and supporting software in research.
Proven track record of building large scale infra for research in Deep Learning, with profound understanding of:
Accelerators (e.g. JAX & XLA stack), performance profiling and optimization
Analysis and debugging of training behavior
Distributed systems, resilience and performance

Experience with Reinforcement Learning is a plus.

You communicate clearly both verbally and in writing, and are comfortable with working in a team distributed across time-zones.
You are a good technical writer, and produce clear and succinct design docs.
You contribute constructively to an asynchronous design process.
You can produce impactful work quickly: you are as comfortable producing library-quality code as you are whipping up prototypes to unblock quick iteration on research ideas.
You are comfortable moving around projects, supporting team members as required, quickly ramping up on new problems, and working with a broad and diverse set of collaborators across engineering and research.

The US base salary range for this full-time position is between $189,000 and $350,000 + bonus + equity + benefits. Your recruiter can share more about the specific salary range for your targeted location during the hiring process.

At Google DeepMind, we value diversity of experience, knowledge, backgrounds and perspectives and harness these qualities to create extraordinary impact. We are committed to equal employment opportunities regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, pregnancy, or related condition (including breastfeeding) or any other basis as protected by applicable law. If you have a disability or additional need that requires accommodation, please do not hesitate to let us know.
