Machine Learning Applied Scientist (Reinforcement Learning)

OfferFit

Remote (OR)

Remote

USD 120,000 - 180,000

Full time

Job summary

A leading company in AI decisioning is seeking a Reinforcement Learning researcher to tackle real-world challenges in customer communication. This role involves improving algorithms, conducting research, and collaborating with engineering teams to enhance OfferFit's platform. The ideal candidate will have a Ph.D. in a relevant field and a passion for applying cutting-edge research to practical applications.

Benefits

Generous PTO (starting at 25 days)
100% remote work environment
Quarterly team gatherings
Weekly team events
Support for continued research
Competitive benefits including medical and 401K matching

Qualifications

  • Experience in developing and applying RL algorithms.
  • Strong coding skills with clean, well-documented code.
  • Ability to tackle complex challenges in real-world applications.

Responsibilities

  • Improve RL algorithms for performance and robustness.
  • Conduct research on state-of-the-art RL techniques.
  • Collaborate with engineering teams to enhance the platform.

Skills

Reinforcement Learning
Python ML ecosystem
Problem solving
Collaboration
Clear communication

Education

Ph.D. in Computer Science, Machine Learning, or related field
MS with professional experience in RL

Tools

Spark
BigQuery
FastAPI

Job description

United States (Remote), Canada (Remote), LATAM (Remote), and Europe (Remote)

OfferFit was founded by ex-McKinsey and BCG math PhDs, and we’re funded by leading Silicon Valley VCs. OfferFit’s AI decisioning engine supports 1:1 personalization for lifecycle marketing campaigns, powered by reinforcement learning AI. This allows marketers to test & improve the performance of their campaigns much faster than before. Customers include leading brands like Brinks Home, Yelp, Chime, Engie, and MetLife, among many others.

Our team is growing! We’re looking for an RL researcher or practitioner who will apply reinforcement learning to solve real-world customer communication challenges. In this role, you'll work closely with the CTO and Engineering team to improve sample efficiency, test credit assignment and attribution algorithms, and investigate and improve our approach to action space featurization, reward shaping, combining RL with constrained optimization, and other interesting challenges. Currently, we use ensembles of contextual bandits to achieve high sample efficiency and coordination of decisions, and we’re constantly testing and implementing improvements.

Your responsibilities will include:

  • Improve RL algorithms to increase performance, sample efficiency, and robustness at scale
  • Develop and apply advanced diagnostic tools, including off-policy evaluation methods
  • Conduct research on state-of-the-art RL techniques and their applicability to marketing optimization
  • Implement better monitoring and observability tooling
  • Work closely with engineering teams to improve OfferFit’s platform and develop APIs for OfferFit ML components
  • Participate in customer implementations to gain insights into real-world use cases
  • Contribute to OfferFit’s product strategy and roadmap
  • Data Science/Back End: Python ML ecosystem, Spark, BigQuery, FastAPI
  • We write well-tested, type-hinted, documented, modular code and use pre-commit hooks, CI/CD, and issue tracking for development

Why is it great:

No toy datasets in notebooks — we’re implementing AI pipelines in production at scale generating real value.

  • Opportunity to bridge cutting-edge RL research with real-world applications.
  • Access to large-scale datasets and computational resources.
  • Support for continued research, including conference attendance and publication.
  • Collaboration with a diverse team of experts in ML, engineering, and marketing.
  • Join OfferFit’s fast-paced, supportive, and professional team. We make sure all of our team members are empowered and receive great mentorship and coaching.

Who’s a Fit:

  • Exceptional coder: you have experience writing clean, well-designed, versioned code; you care about good coding practices and terse, testable APIs.
  • Problem solver: you thrive on tackling complex, real-world challenges with novel ML approaches.
  • Impact-driven: you're motivated by seeing your research translate into tangible business outcomes.
  • Collaborative: you enjoy working closely with a team of driven individuals across multiple teams to get things done. You’re willing to both help and ask for help.
  • Structured and organized: you can structure a plan, align stakeholders, and see it through to execution.
  • Clear communicator: you are able to express yourself clearly and persuasively, both in writing and speech.
  • Ph.D. in Computer Science, Machine Learning, or a related field with a focus on Reinforcement Learning. MS with professional experience with RL is fine too.

Additional Requirements:

  • Candidates must be able to partially overlap and support North America time zones
  • Must be fluent in English, both written and verbal
  • Up to 10-15% travel for company-wide quarterly gatherings, team offsite workshops, customer meetings, and industry-related events

The base salary range for this position in the United States is $120,000-$180,000 per year, plus eligibility for an additional bonus ranging from $15,000 to $22,000. Eligibility for an additional end-of-year performance bonus, commissions (when applicable), and/or equity options may be provided as part of the compensation package, in addition to a full range of medical, financial, and/or other benefits, depending on the position offered. Please note that we adjust compensation for non-US countries using a relative cost of labor adjustment between the US and your country of residence. Applicants should apply via OfferFit’s internal or external careers site.

OfferFit Benefits and Perks:

  • Generous PTO (starting at 25 days PTO per year) and Parental Leave policy (12 weeks paid)
  • 100% remote work environment with flexible hours
  • Quarterly gatherings where we meet in person in a different city to work together, bond as a team and celebrate our progress
  • Weekly team events (lunch and learns, trivia, virtual escape rooms, town hall and team health “barometer” meetings)
  • Ability to learn and develop from an experienced leadership team (ex-Amazon, McKinsey, BCG, and IBM, among others) who are focused on building a talented, diverse, and inclusive team
  • Dedication to building a strong culture (e.g., team resource groups, weekly recognitions, major life event celebrations, mental health/sustainability days off, etc.)
  • [US Only] Competitive benefits (major medical, vision, dental and LTD) and 401K matching program

OfferFit is committed to a diverse and inclusive workplace. OfferFit is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.

Voluntary Self-Identification

For government reporting purposes, we ask candidates to respond to the below self-identification survey. Completion of the form is entirely voluntary. Whatever your decision, it will not be considered in the hiring process or thereafter. Any information that you do provide will be recorded and maintained in a confidential file.

As set forth in OfferFit’s Equal Employment Opportunity policy, we do not discriminate on the basis of any protected group status under any applicable law.

If you believe you belong to any of the categories of protected veterans listed below, please indicate by making the appropriate selection. As a government contractor subject to the Vietnam Era Veterans Readjustment Assistance Act (VEVRAA), we request this information in order to measure the effectiveness of the outreach and positive recruitment efforts we undertake pursuant to VEVRAA. Classification of protected categories is as follows:

A "disabled veteran" is one of the following: a veteran of the U.S. military, ground, naval or air service who is entitled to compensation (or who but for the receipt of military retired pay would be entitled to compensation) under laws administered by the Secretary of Veterans Affairs; or a person who was discharged or released from active duty because of a service-connected disability.

A "recently separated veteran" means any veteran during the three-year period beginning on the date of such veteran's discharge or release from active duty in the U.S. military, ground, naval, or air service.

An "active duty wartime or campaign badge veteran" means a veteran who served on active duty in the U.S. military, ground, naval or air service during a war, or in a campaign or expedition for which a campaign badge has been authorized under the laws administered by the Department of Defense.

An "Armed forces service medal veteran" means a veteran who, while serving on active duty in the U.S. military, ground, naval or air service, participated in a United States military operation for which an Armed Forces service medal was awarded pursuant to Executive Order 12985.

Voluntary Self-Identification of Disability

Form CC-305

Page 1 of 1

OMB Control Number 1250-0005

Expires 04/30/2026

Why are you being asked to complete this form?

We are a federal contractor or subcontractor. The law requires us to provide equal employment opportunity to qualified people with disabilities. We have a goal of having at least 7% of our workers as people with disabilities. The law says we must measure our progress towards this goal. To do this, we must ask applicants and employees if they have a disability or have ever had one. People can become disabled, so we need to ask this question at least every five years.

Completing this form is voluntary, and we hope that you will choose to do so. Your answer is confidential. No one who makes hiring decisions will see it. Your decision to complete the form and your answer will not harm you in any way. If you want to learn more about the law or this form, visit the U.S. Department of Labor’s Office of Federal Contract Compliance Programs (OFCCP) website at www.dol.gov/ofccp.

How do you know if you have a disability?

A disability is a condition that substantially limits one or more of your “major life activities.” If you have or have ever had such a condition, you are a person with a disability. Disabilities include, but are not limited to:

  • Alcohol or other substance use disorder (not currently using drugs illegally)
  • Autoimmune disorder, for example, lupus, fibromyalgia, rheumatoid arthritis, HIV/AIDS
  • Blind or low vision
  • Cancer (past or present)
  • Cardiovascular or heart disease
  • Celiac disease
  • Cerebral palsy
  • Deaf or serious difficulty hearing
  • Diabetes
  • Disfigurement, for example, disfigurement caused by burns, wounds, accidents, or congenital disorders
  • Epilepsy or other seizure disorder
  • Gastrointestinal disorders, for example, Crohn's Disease, irritable bowel syndrome
  • Intellectual or developmental disability
  • Mental health conditions, for example, depression, bipolar disorder, anxiety disorder, schizophrenia, PTSD
  • Missing limbs or partially missing limbs
  • Mobility impairment, benefiting from the use of a wheelchair, scooter, walker, leg brace(s) and/or other supports
  • Nervous system condition, for example, migraine headaches, Parkinson’s disease, multiple sclerosis (MS)
  • Neurodivergence, for example, attention-deficit/hyperactivity disorder (ADHD), autism spectrum disorder, dyslexia, dyspraxia, other learning disabilities
  • Partial or complete paralysis (any cause)
  • Pulmonary or respiratory conditions, for example, tuberculosis, asthma, emphysema
  • Short stature (dwarfism)
  • Traumatic brain injury

PUBLIC BURDEN STATEMENT: According to the Paperwork Reduction Act of 1995 no persons are required to respond to a collection of information unless such collection displays a valid OMB control number. This survey should take about 5 minutes to complete.
