Applied Research Engineer - Synthetic Data

techire ai

London

Hybrid

USD 200,000 - 350,000

Full time

15 days ago

Job summary

An ambitious AI startup is seeking an Applied Engineer to drive innovative data strategies for advanced AI systems. This role offers the chance to work with cutting-edge technology and significant compute resources, shaping the future of agentic AI. You'll develop synthetic multimodal datasets, optimize models for deployment, and collaborate with cross-functional teams to enhance AI capabilities. If you're passionate about AI and eager to make a lasting impact, this is a unique opportunity to join a rapidly growing team focused on responsible development and groundbreaking approaches in the field.

Benefits

significant equity
flexible working hours
access to massive GPU cluster
opportunity for remote work

Qualifications

  • Strong Python skills for large-scale deployments and system design.
  • Experience with multimodal data pipelines and LLMs/VLMs.

Responsibilities

  • Develop synthetic multimodal datasets for VQA and agent behaviors.
  • Optimize large-scale models for edge deployment using distillation techniques.

Skills

Python programming
parallel computing
system design
large-scale deployments
multimodal data pipelines
training LLMs
training VLMs
PyTorch
evaluation paradigms for multimodal models
fast-changing environments

Education

MSc in machine learning
PhD in computer vision
PhD in NLP

Tools

PyTorch

Job description

Shape the future of agentic AI through cutting-edge data strategy

Want to pioneer next-generation data techniques for advanced AI systems? This role combines frontier model research with practical implementation at one of Europe's most ambitious AI startups.

You'll join a rapidly growing AI Data team developing cutting-edge data-centric approaches that enhance LLMs, VLMs, and Action Models. This isn't just about collecting data - it's about transforming how AI systems learn and operate through synthetic generation, model distillation, and preference alignment.

Founded with a clear mission to push the boundaries of superintelligent agentic AI, this well-funded startup ($200M raised) is assembling world-class talent focused on both advancing capabilities and ensuring responsible development. Their approach is comprehensive - building proprietary technology from data to models, focusing on language, multimodal, and vision systems with superior performance and cost-effectiveness.

As an Applied Engineer focusing on Data Research, you'll develop sophisticated data strategies that directly impact frontier AI systems:

  1. Generate and augment synthetic multimodal datasets for VQA, agent behaviours, and virtual navigation
  2. Apply model distillation techniques to optimise large-scale models for edge deployment
  3. Design evaluation frameworks to measure improvements across multiple domains
  4. Lead research into aligning data with human and AI preferences
  5. Collaborate with cross-functional teams to integrate data-driven solutions

This role offers rare access to significant compute resources, with a massive GPU cluster that enables cutting-edge work. You'll be joining at a pivotal stage where your contributions will shape core technology and direction.

Requirements:

  1. Strong Python programming skills covering parallel computing, system design, and large-scale deployments
  2. Experience developing multimodal data pipelines
  3. Background in training and deploying LLMs, VLMs, or PyTorch models
  4. MSc or PhD in machine learning, computer vision, NLP, or a related field
  5. Deep understanding of training and evaluation paradigms for multimodal models
  6. Effectiveness in fast-changing environments

Nice to have:

  1. Experience with agent-specific data pipelines
  2. Background in multimodal human annotation platforms
  3. Document understanding/OCR expertise
  4. Synthetic data generation experience (particularly multimodal)

You'll have flexibility to work from New York, London, or remotely within European or US East Coast time zones. For those based in cities with offices, hybrid arrangements are available.

Your package includes a highly competitive salary ($200,000-$350,000 depending on experience) plus significant equity with strong upside potential.

If you're passionate about advancing AI through innovative data approaches and want to make a lasting impact on agentic systems, we'd love to hear from you. All applicants will receive a response.
