A leading technology company is seeking an experienced Data Science Lead to spearhead product experimentation and evaluation frameworks. In this role, you will manage and mentor teams while driving a culture of rapid iteration and learning. Ideal candidates will have extensive experience in data science, particularly in startups and large-scale environments, and demonstrate strong technical and communication skills. The position is remote and offers a competitive salary range.
You will be a strategic partner to product, engineering, and trust and safety teams, responsible for defining evaluation frameworks, leading experiments (A/B, quasi-experiments, etc.), and turning offline and live model performance into product improvements. This role requires a strong track record in startup-style experimentation (moving quickly with scrappy but rigorous methods) and product experimentation at scale. The ideal candidate will also bring proven experience in leading and managing teams to deliver high-impact data science work.
100% remote
Salary Range: $120,000 - $260,000
Essential Job Functions
● Lead end-to-end experimentation: hypothesis generation, metric design, experiment design (A/B, multivariate, sequential, etc.), analysis, and interpretation.
● Build and maintain evaluation frameworks for LLMs: correctness, consistency, safety, hallucination detection, bias/fairness, etc.
● Develop predictive models, classification/ranking systems, and heuristics to improve product features related to AI/language generation.
● Collaborate with prompt engineers and model builders to test prompt strategies, fine-tuning, and model selection; analyze failure modes and errors.
● Automate experiment pipelines: dashboards, monitoring, alerting, and instrumentation. Ensure data quality and measurement integrity.
● Use causal inference methods and observational studies when randomized experiments are not feasible.
● Present findings and recommendations to both technical and non-technical leadership; influence roadmap decisions.
● Drive experimentation in startup-like environments: rapid iteration, learning from limited data, and balancing speed with rigor.
● Shape large-scale product experimentation: define frameworks for experimentation at scale and integrate results into product strategy.
● Lead and mentor teams of data scientists, analysts, and engineers; set best practices for experiment design and AI product evaluation.
Preferred Qualifications