New York Data Engineer, Personalization

Spotify AB

New York (NY)

Hybrid

USD 160,000 - 229,000

Full time

Yesterday

Job summary

A leading company in the music and podcast industry is seeking a Data Engineer to enhance podcast and audiobook experiences. This role involves building scalable data pipelines, collaborating with cross-functional teams, and developing innovative solutions to improve data processing and integration. The position offers extensive learning opportunities and competitive benefits, including a substantial salary range and equity options.

Benefits

Flexible share incentives
Parental leave
Employee assistance programs
Flexible holidays

Qualifications

  • Experience building high-scale production data pipelines.
  • Proficient in data modeling and storage optimization.
  • Curiosity about ML data processing and LLM fine-tuning.

Responsibilities

  • Develop and maintain large-scale data pipelines.
  • Collaborate with ML engineers on training datasets.
  • Monitor AI product quality and performance metrics.

Skills

Data modeling
Data storage optimization
Data processing
Collaboration
Agile processes

Tools

Scio
Ray
Apache Beam
Google Cloud Platform

Job description

Personalization’s Hulk squad produces Human Understandable Language Knowledge to enrich content understanding. We utilize Large Language Models to understand podcasts and audiobooks, building reliable, scalable systems to distribute that knowledge to Spotify internal teams, users, and creators. We are looking for a Data Engineer with interest in ML techniques to join our team and help build the future of podcast and audiobook listening experiences for millions of listeners at Spotify. This is a unique opportunity to help develop and shape Spotify recommendations. You’ll grow your skills in engineering at scale, work with a cross-functional team of Machine Learning Engineers, Backend Engineers, Data Engineers, and researchers, and join a super motivated and supportive team.

Location
  • New York
Job type
  • Full time
What You'll Do
  1. Work with large-scale data pipelines using data processing frameworks like Scio (built on Apache Beam), Hendrix ML, Ray, PyTorch, Dataflow, and other Google Cloud Platform offerings.
  2. Support streaming and batch inference of core LLMs.
  3. Develop new data pipelines to meet emerging needs and maintain existing data systems for reliability, scalability, and testing.
  4. Collaborate with ML engineers to develop training and evaluation datasets for generative LLMs.
  5. Own the development of data solutions from architecture to delivery.
  6. Work with other engineers, data scientists, and stakeholders to ensure seamless integration and performance.
  7. Develop monitoring tools to automatically track AI product quality, including the use of LLMs for evaluation and performance metrics dashboards.
  8. Participate in an agile team to experiment, iterate, and deliver on product objectives.
  9. Share knowledge and promote best practices through mentorship and accountability.
Who You Are
  • You are passionate about data and proficient in data modeling, data access, and storage optimization.
  • You have experience building high-scale production data pipelines with frameworks like Scio and Ray.
  • You might have worked with orchestration frameworks such as Flyte.
  • You care about agile processes, data-driven development, reliability, and responsible experimentation.
  • You value collaboration within teams.
  • You have experience with, or curiosity about, ML data processing, inference pipelines, LLM fine-tuning, and data analysis using LLMs.
  • Bonus: experience with near-real-time (NRT) systems and prompt engineering tools like DSPy.
Where You'll Be
  • Flexible work location within North America, with working hours aligned to the Eastern time zone for collaboration.

We offer extensive learning opportunities, flexible share incentives, parental leave, employee assistance programs, flexible holidays, and more benefits. The US base salary range is $160,091 - $228,702 plus equity, with benefits including health insurance, paid parental leave, 401(k), meal allowance, paid time off, and holidays.

Spotify is an equal opportunity employer committed to inclusivity. We support accommodations during the recruitment process. Join us in revolutionizing the way the world listens, driven by our passion for music and podcasting.
