Member of Technical Staff, Data Engineering

Jobgether

United Kingdom

Hybrid

GBP 60,000 - 80,000

Full time

Yesterday

Job summary

A dynamic tech company is seeking a Member of Technical Staff, Data Engineering to design and develop scalable data pipelines for AI systems. This role involves managing diverse datasets and collaborating with global teams to enhance model performance. Ideal candidates should have strong Python skills and experience with large-scale data processing frameworks. The position offers flexible remote options and a range of benefits including health coverage and vacation time.

Benefits

Open and inclusive work culture
Weekly lunch stipends
Comprehensive health benefits
Parental leave top-up
Personal enrichment budget
Remote-flexible work options
6 weeks of vacation

Qualifications

  • Strong software engineering skills, particularly in Python.
  • Experience building and maintaining large-scale data pipelines.
  • Familiarity with data processing frameworks such as Apache Spark, Apache Beam, or equivalent.
  • Experience with large-scale web datasets.
  • Excellent collaboration and communication skills.

Responsibilities

  • Design, develop, and maintain scalable data pipelines.
  • Conduct data ablations and experiments to assess quality.
  • Implement robust data modeling techniques for datasets.
  • Research and apply innovative data curation strategies.
  • Collaborate with global teams to meet evolving needs.

Skills

Strong software engineering skills, particularly in Python
Experience building and maintaining large-scale data pipelines
Familiarity with data processing frameworks such as Apache Spark
Experience with large-scale web datasets
Excellent collaboration and communication skills

Tools

Apache Spark
Pandas

Job description

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Member of Technical Staff, Data Engineering in the United States, the United Kingdom, France, or Canada.

This role offers the opportunity to shape the foundation of cutting‑edge AI systems by managing and optimizing the data pipelines that power advanced language models. You will design and build scalable pipelines, curate high‑quality datasets, and ensure data is structured for optimal training efficiency. Working with diverse sources like web data, code repositories, and multilingual corpora, you will bridge research and engineering, enabling faster, more reliable model training. This position operates in a collaborative, fast‑paced environment where your contributions directly influence AI model performance and innovation. Flexible remote options are available, and you will interact closely with researchers, engineers, and cross‑functional teams globally.

Accountabilities
  • Design, develop, and maintain scalable data pipelines for ingestion, parsing, filtering, and optimization of diverse datasets.
  • Conduct data ablations and experiments to assess quality and improve model performance.
  • Implement robust data modeling techniques to structure and format datasets for efficient training.
  • Research and apply innovative data curation strategies to support advancements in natural language processing.
  • Collaborate with researchers, engineers, and cross‑functional teams to meet the evolving needs of AI models.
  • Ensure datasets are diverse, reliable, and optimized for throughput and accelerator utilization.

Requirements
  • Strong software engineering skills, particularly in Python.
  • Experience building and maintaining large‑scale data pipelines.
  • Familiarity with data processing frameworks such as Apache Spark, Apache Beam, Pandas, or equivalent.
  • Experience working with large‑scale web datasets (e.g., CommonCrawl).
  • Passion for combining research and engineering to solve complex data challenges in AI.
  • Excellent collaboration and communication skills to work effectively across global teams.

Nice to Have
  • Publications at top‑tier AI and ML venues (NeurIPS, ICML, ICLR, AISTATS, MLSys, JMLR, AAAI, COLING, ACL, EMNLP).
  • Experience with multilingual corpora and diverse data sources.
  • Background in NLP or generative AI research.

Benefits
  • Open and inclusive work culture with global collaboration opportunities.
  • Weekly lunch stipends, in‑office meals, and snacks.
  • Comprehensive health, dental, and mental health benefits.
  • 100% parental leave top‑up for up to six months.
  • Personal enrichment budget for arts, culture, fitness, well‑being, and workspace improvements.
  • Remote‑flexible work options with offices in Toronto, New York, San Francisco, London, and Paris, including co‑working stipends.
  • 6 weeks (30 working days) of vacation.