AIML - Machine Learning Engineer, Model Evaluations

Apple Inc.

Cupertino (CA)

On-site

USD 175,800 - 312,200

Full time

11 days ago

Job summary

A leading technology company is seeking a Machine Learning Engineer focused on model evaluations in Cupertino. This role involves developing metrics for generative AI safety, collaborating across teams, and ensuring data quality through rigorous scientific methods. Candidates should have advanced degrees in relevant fields and extensive experience in data science and machine learning.

Benefits

Comprehensive medical and dental coverage

Retirement benefits

Employee stock purchase program

Educational reimbursement

Discretionary bonuses or commission payments

Relocation assistance

Qualifications

MS or PhD with 4+ years of relevant experience, or BA/BS with 8+ years.
Experience in collecting and analyzing language, image, or multi-modal data.
Designing human annotation projects and writing guidelines.

Responsibilities

Develop metrics for evaluating safety and fairness risks in generative models.
Collaborate with engineering, product, and research teams for aligned evaluations.
Build expertise in machine translation and data synthesis techniques.

Skills

Data analysis

Machine learning

Interpersonal skills

Education

MS or PhD in Computer Science, Linguistics, Cognitive Science, HCI, Psychology, Mathematics, Physics, or a similar science or technology field

BA/BS with 8+ years of relevant work experience

Tools

Python

Pandas

Visualization libraries

Cupertino, California, United States

Machine Learning and AI

Description

Apple Intelligence is driven by intentional data design, spanning careful sampling, creation, and curation of high-quality datasets, enriched with precise annotations. Our data powers our ability to evaluate and mitigate safety risks in new generative AI features. This role sits at the intersection of applied data science, empirical analysis, cultural and linguistic expertise, and stakeholder communication. It requires strong scientific judgment, cross-functional collaboration, and the ability to translate evaluation findings into actionable insights.

- Develop metrics for evaluation of safety and fairness risks inherent to generative models and Gen-AI features
- Design datasets, identify data needs, and work on creative solutions, scaling and expanding data coverage through human and synthetic generation methods
- Collaborate with cross-functional partners, including engineering, product, and research teams, to ensure evaluations align with feature goals and deployment plans
- Partner with policy teams to translate regional safety and inclusivity requirements into measurable evaluation criteria
- Build expertise in machine translation and data synthesis techniques to generate localized and culturally aligned evaluation datasets at scale
- Develop ML-based enhancements to red teaming, model evaluation, and other processes to improve the quality of Apple Intelligence’s user-facing products
- Work with highly sensitive content with exposure to offensive and controversial content

Minimum Qualifications

MS or PhD in Computer Science, Linguistics, Cognitive Science, HCI, Psychology, Mathematics, Physics, or a similar science or technology field with a strong basis in scientific data collection and analysis, plus at least 4 years of relevant work experience; or BA/BS with 8+ years of relevant work experience
Experience collecting and analyzing language data, image data, and/or multi-modal data
Strong experience designing human annotation projects, writing guidelines, and dealing with highly multi-labeled, nuanced, and often conflicting data
Proficiency in data science, machine learning, analytics, and programming with Python & Pandas; strong experience with one or more plotting & visualization libraries
Excellent interpersonal skills, with a proven ability to synthesize complex findings and present evaluation outcomes to senior leadership and executives
Strong skills in developing rigorous model quality metrics, interpreting experiments and evaluations, and presenting results to executives

Preferred Qualifications

Deep cultural awareness and understanding of regional norms, values, and sensitivities, with the ability to translate this knowledge into actionable evaluation strategies
Experience in localization, internationalization, or building/evaluating machine learning systems for global markets, with a focus on linguistic and cultural adaptation
Curiosity about fairness and bias in generative AI systems, and a strong desire to help make the technology more equitable

At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $175,800 and $312,200, and your base pay will depend on your skills, qualifications, experience, and location.

Apple employees also have the opportunity to become an Apple shareholder through participation in Apple’s discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple’s Employee Stock Purchase Plan. You’ll also receive benefits including: Comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and for formal education related to advancing your career at Apple, reimbursement for certain educational expenses — including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation. Learn more about Apple Benefits.

Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.
