Data Scientist | Remote Work

Mercor

Remote

EUR 80,000 - 100,000

Part-time

3 days ago

Summary

A leading AI consulting firm is looking for an AI Task Evaluation & Statistical Analysis Specialist to conduct thorough statistical analysis of AI agent failures and improve evaluation methods. Candidates should have a strong background in statistical analysis, proficiency in Python, and experience with data visualization tools. This remote position pays $100 to $120 per hour and involves close collaboration with technical teams to improve AI model performance.

Qualifications

Strong foundation in statistical analysis and hypothesis testing.
Proficiency in Python for data analysis.
Experience with exploratory data analysis and generating insights.
Familiarity with LLM evaluation methods and metrics.

Responsibilities

Conduct statistical failure analysis for AI agent tasks.
Perform root cause analysis to determine whether failures stem from task design or rubric clarity.
Analyze performance variations across finance sub-domains.
Create dashboards to highlight failure patterns and improvements.
Recommend task design improvements based on findings.
Present insights to teams to drive improvements.

Skills

Statistical Analysis

Python

Exploratory Data Analysis

Data Visualization

SQL

Tools

Excel

Tableau

Looker

Mercor connects elite creative and technical talent with leading AI research labs. Headquartered in San Francisco, our investors include Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey.

Position

AI Task Evaluation & Statistical Analysis Specialist

Type

Contract

Compensation

$100–$120/hour

Location

Remote

Role Responsibilities

Conduct comprehensive statistical failure analysis to identify patterns in AI agent failures across task components such as prompts, rubrics, and templates.
Perform root cause analysis to determine whether failures are due to task design, rubric clarity, file complexity, or agent limitations.
Analyze performance variations across finance sub-domains, file types, and task categories to enhance understanding of AI model performance.
Create dashboards and reports to highlight failure clusters, edge cases, and improvement opportunities.
Recommend improvements to task design, rubric structure, and evaluation criteria based on statistical findings.
Present insights to data labeling experts and technical teams to foster collaboration and drive improvements.

Qualifications

Must-Have

Statistical Expertise: Strong foundation in statistical analysis, hypothesis testing, and pattern recognition.
Programming: Proficiency in Python (pandas, scipy, matplotlib/seaborn) or R for data analysis.
Data Analysis: Experience with exploratory data analysis and creating actionable insights from complex datasets.
AI/ML Familiarity: Understanding of LLM evaluation methods and quality metrics.
Tools: Comfortable working with Excel, data visualization tools (Tableau/Looker), and SQL.

Preferred

Experience with AI/ML model evaluation or quality assurance.
Background in finance or willingness to learn finance domain concepts.
Experience with multi-dimensional failure analysis.
Familiarity with benchmark datasets and evaluation frameworks.
2-4 years of relevant experience.

Application Process (Takes 20–30 mins to complete)

Upload resume
AI interview based on your resume
Submit form

Resources & Support

For details about the interview process and platform information, please check: https://talent.docs.mercor.com/welcome/welcome
For any help or support, reach out to: support@mercor.com

PS: Our team reviews applications daily. Please complete your AI interview and application steps to be considered for this opportunity.
