
Research Engineer, Post Training RL

TensorStax

San Francisco (CA)

On-site

USD 100,000 - 300,000

Full time

9 days ago


Job summary

A leading AI software company is seeking a Research Engineer specializing in Reinforcement Learning to optimize data engineering tasks with cutting-edge AI techniques. The role involves developing reward functions, fine-tuning models, and conducting research to advance the capabilities of autonomous systems. Ideal candidates will possess a strong understanding of reinforcement learning and the ability to work independently. With competitive compensation and comprehensive benefits like full health coverage and 401(k) match, this position offers a dynamic opportunity in a growing field.

Benefits

100% employer-covered health, dental, and vision insurance

401(k) with company match

Access to Bay Club or Equinox

Qualifications

Deep understanding of reinforcement learning and optimization strategies.
Experience with LLM fine-tuning techniques (PPO, DPO, KTO).
Strong problem-solving skills and experience with complex ML projects.

Responsibilities

Develop and refine reward functions for optimizing agent behavior.
Fine-tune language models using reinforcement learning techniques.
Curate and build datasets for supervised fine-tuning and RLHF.

Skills

Reinforcement Learning

Problem Solving

Data Engineering

Tools

PyTorch

Data Lakes

Data Warehouses


Research Engineer – Post Training Reinforcement Learning

About TensorStax

TensorStax is building fully autonomous AI systems to manage and maintain mission-critical data infrastructure and pipelines. We leverage reinforcement learning to enhance language models' ability to reason over large-scale data lakes and warehouses, detect pipeline failures, construct new pipelines with high precision, and enable agentic behavior—allowing systems to proactively identify and resolve issues autonomously.

What You’ll Do

As a Research Engineer specializing in Reinforcement Learning, you will:

Develop and refine reward functions to optimize agent behavior for complex data engineering tasks.
Fine-tune language models using reinforcement learning and preference-optimization techniques such as PPO, DPO, and KTO.
Stay at the forefront of research on RL for language models, incorporating advancements like GRPO, SWE-Gym, and SWE-RL into practical applications.
Curate and build high-quality datasets for supervised fine-tuning (SFT) and RLHF.
Design experiments to evaluate and improve the agentic capabilities of language models in data environments.

What We’re Looking For

Deep understanding of reinforcement learning, reward shaping, and optimization strategies.
Strong familiarity with LLM fine-tuning techniques (PPO, DPO, KTO) and their application to post-training language models.
Knowledge of recent advancements in RL for language models (GRPO, SWE-Gym, SWE-RL).
Experience curating and constructing high-quality datasets for fine-tuning.
Strong problem-solving skills and a history of working on complex ML projects.
High agency—ability to work independently, experiment proactively, and drive research initiatives forward.

Bonus Points

Experience with distributed training in PyTorch (DDP, FSDP).
Hands-on experience designing RL environments for traditional RL problems.
Contributions to open-source projects in RL, LLMs, or ML infrastructure.
Familiarity with data lakes and warehouses (Snowflake, BigQuery, Redshift).

Benefits

100% employer-covered health, dental, and vision insurance.
401(k) with company match.
Access to Bay Club or Equinox in San Francisco.

Seniority level

Entry level

Employment type

Full-time

Job function

Engineering and Information Technology

Industries

Software Development
