Research Engineer, Post Training RL

TensorStax

San Francisco (CA)

On-site

USD 100,000 - 300,000

Full-time

9 days ago

Job summary

A leading AI software company is seeking a Research Engineer specializing in Reinforcement Learning to optimize data engineering tasks with cutting-edge AI techniques. The role involves developing reward functions, fine-tuning models, and conducting research to advance the capabilities of autonomous systems. Ideal candidates will possess a strong understanding of reinforcement learning and the ability to work independently. With competitive compensation and comprehensive benefits like full health coverage and 401(k) match, this position offers a dynamic opportunity in a growing field.

Benefits

100% employer-covered health, dental, and vision insurance
401(k) with company match
Access to Bay Club or Equinox

Qualifications

  • Deep understanding of reinforcement learning and optimization strategies.
  • Experience with LLM fine-tuning techniques (PPO, DPO, KTO).
  • Strong problem-solving skills and experience with complex ML projects.

Responsibilities

  • Develop and refine reward functions for optimizing agent behavior.
  • Fine-tune language models using reinforcement learning techniques.
  • Curate and build datasets for supervised fine-tuning and RLHF.

Skills

Reinforcement Learning
Problem Solving
Data Engineering

Tools

PyTorch
Data Lakes
Data Warehouses

Job description

Research Engineer – Post Training Reinforcement Learning

About TensorStax

TensorStax is building fully autonomous AI systems to manage and maintain mission-critical data infrastructure and pipelines. We leverage reinforcement learning to enhance language models' ability to reason over large-scale data lakes and warehouses, detect pipeline failures, construct new pipelines with high precision, and enable agentic behavior—allowing systems to proactively identify and resolve issues autonomously.

What You’ll Do

As a Research Engineer specializing in Reinforcement Learning, you will:

  • Develop and refine reward functions to optimize agent behavior for complex data engineering tasks.
  • Fine-tune language models using reinforcement learning techniques such as PPO, DPO, and KTO.
  • Stay at the forefront of research on RL for language models, incorporating advancements like GRPO, SWE-Gym, and SWE-RL into practical applications.
  • Curate and build high-quality datasets for supervised fine-tuning (SFT) and RLHF.
  • Design experiments to evaluate and improve the agentic capabilities of language models in data environments.

What We’re Looking For

  • Deep understanding of reinforcement learning, reward shaping, and optimization strategies.
  • Strong familiarity with LLM fine-tuning techniques (PPO, DPO, KTO) and their applications in reinforcement learning.
  • Knowledge of recent advancements in RL for language models (GRPO, SWE-Gym, SWE-RL).
  • Experience curating and constructing high-quality datasets for fine-tuning.
  • Strong problem-solving skills and a history of working on complex ML projects.
  • High agency—ability to work independently, experiment proactively, and drive research initiatives forward.

Bonus Points

  • Experience with distributed training in PyTorch (DDP, FSDP).
  • Hands-on experience designing RL environments for traditional RL problems.
  • Contributions to open-source projects in RL, LLMs, or ML infrastructure.
  • Familiarity with data lakes and warehouses (Snowflake, BigQuery, Redshift).

Benefits

  • 100% employer-covered health, dental, and vision insurance.
  • 401(k) with company match.
  • Access to Bay Club or Equinox in San Francisco.

Seniority level
  • Entry level
Employment type
  • Full-time
Job function
  • Engineering and Information Technology
Industries
  • Software Development
