Data Engineer (AI-Driven Pipelines & Research)

Arrow

Manchester

Hybrid

GBP 40,000 - 65,000

Full time

Yesterday

Job summary

A leading company in data engineering is looking for a Data Engineer to build and scale data infrastructure for AI products. This role involves designing data pipelines, collaborating with AI teams, and utilizing advanced technologies to enhance data quality and automation. The ideal candidate will have strong Python skills and experience with various data formats, contributing to innovative AI-driven solutions.

Qualifications

  • Proficiency in Python and experience with data systems.
  • Familiarity with JSON, XML, CSV, and transforming unstructured data.
  • Interest or experience with LLMs for data tasks.

Responsibilities

  • Build and maintain reliable, scalable data pipelines in Python.
  • Utilize LLMs for data cleansing and enrichment.
  • Collaborate with product and AI teams for trustworthy data.

Skills

Python
Data Quality
Automation
Data Cleansing
Data Enrichment
Data Classification
Data Tagging
AI

Tools

AWS
pandas
Polars

Job description

Join to apply for the Data Engineer (AI-Driven Pipelines & Research) role at Arrow Global Group.

Description

Our Data Engineer will help build and scale data infrastructure powering our AI products. This hands-on, technically deep role is ideal for someone who values data quality, robustness, and automation. The engineer will collaborate with AI teams to design pipelines that not only move data but also clean, enrich, and understand it, increasingly leveraging large language models and automation agents.

About the Team

The team has a flat structure and a 'best-idea-wins' culture, and engineers influence product direction. We foster a supportive environment that emphasizes ownership and responsibility and encourages asking for help when needed. Although the team is based in Manchester and London, we offer flexible remote work within the UK, with opportunities for team gatherings and hackathons.

About the Role

  1. Build and maintain reliable, transparent, and scalable data pipelines in Python.
  2. Utilize LLMs for data cleansing, enrichment, classification, and tagging.
  3. Experiment with AI agents for automating research tasks and extracting structured data.
  4. Collaborate with product and AI teams to provide trustworthy data for prototypes.
  5. Design workflows transforming semi-structured data into actionable insights.
  6. Support rapid experimentation, shipping, and learning.

What We're Looking For

  • Proficiency in Python and pandas (or Polars) with a proven track record of delivering data systems.
  • Experience with JSON, XML, CSV, and transforming unstructured data.
  • Familiarity with AWS cloud-native tools like Lambda and Step Functions.
  • Interest or experience with LLMs for data tasks.
  • A view of pipelines as products: robust, debuggable, and continuously improving.
  • Curiosity about AI's role in automating research and discovery.

Experience with LangChain, Haystack, Pandas AI, vector databases, or projects involving agents for data understanding or automation is beneficial but not essential.

Additional Info

To be considered, you must already have the right to work in the UK, as we cannot provide sponsorship.
