AI Training Infrastructure Engineer - Post Training

ZipRecruiter

San Francisco (CA)

On-site

USD 220,000 - 290,000

Full time

Yesterday

Job summary

A leading company in AI technology is seeking experienced AI Research Engineers and Scientists to enhance their in-house Online LLMs. The role focuses on building robust training frameworks and collaborating with engineering teams to integrate advanced models into products. Ideal candidates will have significant experience in LLM frameworks and a strong programming background, particularly in Python and PyTorch. Competitive compensation and comprehensive benefits are offered.

Benefits

Comprehensive health insurance
Dental insurance
Vision insurance
401(k) plan
Equity options

Qualifications

  • Minimum of 6 years of relevant project experience.

Responsibilities

  • Build a post-training framework for large-scale model training jobs.
  • Implement infrastructure for the latest models and algorithms.
  • Manage data, training, and evaluation pipelines for LLMs.

Skills

Python
PyTorch
C++
CUDA

Education

PhD in AI/ML/Systems

Job description

Perplexity is seeking experienced AI Research Engineers and Scientists to continue to improve our in-house Online LLMs, the Sonar models. Your role involves working with the team to create a robust and effective training framework (based on Megatron/PyTorch), especially for post-training LLMs.

Responsibilities

  1. Build a post-training framework capable of running large-scale model training jobs.
  2. Implement infrastructure and components to support the latest models and algorithms like SFT, RL (DPO/GRPO), and more.
  3. Manage the full-stack data, training, and evaluation pipelines needed for post-training LLMs.
  4. Collaborate closely with engineering teams to integrate Sonar models into our products.

Qualifications

  • Proven experience in building large-scale LLM frameworks.
  • Strong proficiency in Python and PyTorch; C++/CUDA skills are a plus.
  • Self-motivated with a willingness to take ownership of tasks.
  • Passion for solving challenging problems.
  • Minimum of 6 years of relevant project experience.

Bonus

  • PhD in AI/ML/Systems or related fields.
  • Experience in building LLM training frameworks, especially post-training.

The cash compensation range for this role is $220,000 - $290,000.

At Perplexity, we've experienced tremendous growth and adoption since launching the world's first fully functional conversational answer engine in 2022. The number of queries we answer each day grew from 2.5 million to around 20 million by December 2024. We also offer Perplexity Enterprise Pro, serving clients like Nvidia, the Cleveland Cavaliers, Bridgewater, and Zoom.

To support our rapid expansion, we've secured significant funding from top investors, including IVP, NEA, Jeff Bezos, NVIDIA, Databricks, Bessemer Venture Partners, Elad Gil, Nat Friedman, Daniel Gross, Naval Ravikant, and Tobi Lutke, among others. In 2024, our employee count grew by nearly 300%, and we're just getting started.

Final offer amounts depend on experience and expertise and may differ from the listed range.

Equity: In addition to base salary, equity may be part of the total compensation package.

Benefits: Comprehensive health, dental, and vision insurance for you and your dependents, plus a 401(k) plan.
