Data Engineer

ATG (Auction Technology Group)

City Of London

On-site

GBP 60,000 - 80,000

Full time

Today

Job summary

A leading UK technology company is looking for a Data Engineer to build and maintain scalable data pipelines that support analytics and ML applications. The ideal candidate has 5+ years of data engineering experience, strong Python programming skills, and expert-level SQL. The role involves collaborating with teams across the business to ensure data quality and accessibility, so strong problem-solving skills are essential.

Qualifications

  • 5+ years of experience building and maintaining data pipelines in production environments.
  • Expert-level SQL skills with performance tuning experience.
  • Hands-on experience with AWS services, including S3, Redshift, and Glue.

Responsibilities

  • Design, build, and maintain ETL/ELT pipelines for analytics and ML models.
  • Implement workflow orchestration for complex data dependencies.
  • Monitor pipeline health and data freshness.

Skills

Python programming
SQL
Data processing libraries (Pandas, PySpark)
Workflow orchestration tools (Airflow, Dagster)
AWS cloud services
Data warehousing solutions (Redshift, Snowflake)

Education

BSc or MSc in Computer Science, Data Engineering, or a related field

Tools

Git
Apache Spark
Docker

Job description

You have a passion for building scalable, reliable data systems that enable data scientists, ML engineers, and analysts to do their best work. You understand that great data products require more than moving data: they need robust pipelines, data quality assurance, and thoughtful architecture. You put reliability and scalability at the heart of everything you do, and you enable data-driven decisions through sound data modeling and pipeline design. You will be comfortable working cross-functionally with Product, Engineering, Data Science, Analytics, and MLOps teams to develop our products and improve the end-user experience. You have a strong track record of prioritizing effectively, meeting critical deadlines, and tackling challenges with a problem-solving mindset.

Key Responsibilities

Data Pipeline Development & Management
  • Design, build, and maintain robust ETL/ELT pipelines that support analytics, ML models, and business intelligence
  • Develop scalable batch and streaming data pipelines to process millions of auction events, user interactions, and transactions daily
  • Implement workflow orchestration using Airflow, Dagster, or similar tools to manage complex data dependencies
  • Build data validation and quality monitoring frameworks to ensure data accuracy and reliability; a minimal sketch of such an orchestrated, validated pipeline follows this list

ML & Analytics Infrastructure
  • Build feature engineering pipelines to support ML models for search, recommendations, and personalization
  • Integrate with feature stores to enable consistent feature computation across training and inference
  • Create datasets for model training, validation, and testing with proper versioning

Data Quality & Monitoring
  • Implement comprehensive data quality checks, anomaly detection, and alerting systems
  • Monitor pipeline health, data freshness, and SLA compliance
  • Create dashboards and reporting tools for data pipeline observability
  • Debug and resolve data quality issues and pipeline failures

Collaboration & Best Practices
  • Work closely with Data Scientists and ML Engineers to understand data requirements and deliver reliable datasets
  • Partner with Software Engineers to integrate data pipelines with application systems
  • Establish and document data engineering best practices, coding standards, and design patterns
  • Mentor junior engineers on data engineering principles and best practices
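
By way of illustration, the orchestration and validation responsibilities above might be expressed as a minimal Airflow DAG along the following lines. This is a sketch only, not ATG's actual pipeline: the DAG name, columns, file paths, and checks are all assumptions made for the example.

```python
# Hypothetical sketch of a daily ETL run with a data-quality gate.
# DAG name, columns, paths, and thresholds are illustrative assumptions.
from datetime import datetime, timedelta

import pandas as pd
from airflow.decorators import dag, task


@dag(
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
)
def auction_events_daily():
    @task
    def extract() -> str:
        # Stand-in for pulling yesterday's auction events from the raw store
        # (in practice this might read from S3 via boto3 or an Airflow hook).
        df = pd.DataFrame(
            {"event_id": [1, 2, 3], "amount_gbp": [120.0, 80.5, 310.0]}
        )
        path = "/tmp/auction_events.parquet"
        df.to_parquet(path)
        return path

    @task
    def validate(path: str) -> str:
        # Quality gate: fail the run on empty or null-ridden extracts so bad
        # records never reach downstream consumers.
        df = pd.read_parquet(path)
        if df.empty:
            raise ValueError("no rows extracted")
        if df["event_id"].isna().any():
            raise ValueError("null event_id values found")
        return path

    @task
    def load(path: str) -> None:
        # Placeholder for the warehouse load, e.g. a COPY into Redshift.
        print(f"loading {path} into the warehouse")

    load(validate(extract()))


auction_events_daily()
```

Dagster or Prefect would express the same dependency graph with their own primitives; the point is explicit task dependencies plus a validation step that blocks downstream loads.
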
Key Requirements

Required Qualifications
  • BSc or MSc in Computer Science, Data Engineering, Software Engineering, or a related field, or equivalent practical experience
  • 5+ years of experience building and maintaining data pipelines and infrastructure in production environments
  • Strong programming skills in Python, with experience in data processing libraries (Pandas, PySpark)
  • Expert-level SQL skills with experience in query optimization and performance tuning
  • Proven experience with workflow orchestration tools (Airflow, Dagster, Prefect, or similar)
  • Hands-on experience with cloud platforms (AWS preferred), including S3, Redshift, EMR, Glue, and Lambda
  • Experience with data warehousing solutions (Redshift, Snowflake, BigQuery, or similar)
  • Experience with version control systems (Git) and CI/CD practices for data pipelines
Technical Skills
  • Experience with distributed computing frameworks (Apache Spark, Dask, or similar)
  • Knowledge of both batch and streaming data processing (Kafka, Kinesis, or similar)
  • Familiarity with data formats (Parquet, ORC, Avro, JSON) and their trade-offs; a short Parquet example follows this list
  • Understanding of data quality frameworks and testing strategies
  • Previous work with vector databases (Pinecone, Milvus, etc.)
  • Experience with monitoring and observability tools (Prometheus, Grafana, CloudWatch)
  • Knowledge of infrastructure-as-code tools (Terraform, CloudFormation)
  • Understanding of containerization (Docker) and orchestration (Kubernetes) is a plus
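
To make the format trade-off above concrete, here is a small, hypothetical Python sketch (file names and columns are invented) showing why columnar formats like Parquet suit analytical workloads: a query touching two columns reads only those columns, while row-oriented JSON pays for every field on every read.

```python
# Hypothetical illustration of row-oriented vs columnar storage.
# File names and columns are invented for this example.
import pandas as pd

df = pd.DataFrame(
    {
        "lot_id": range(1_000),
        "hammer_price": [float(i) for i in range(1_000)],
        "bidder_count": [i % 7 for i in range(1_000)],
        "description": ["lorem ipsum"] * 1_000,
    }
)

# JSON Lines stores whole records as text: every reader pays for every field.
df.to_json("lots.json", orient="records", lines=True)

# Parquet stores columns separately, with compression and column statistics,
# so an analytical query can read only the columns it needs.
df.to_parquet("lots.parquet")
subset = pd.read_parquet("lots.parquet", columns=["lot_id", "hammer_price"])
print(subset.head())
```
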
Nice-to-Have
  • Familiarity with dbt (data build tool) for data transformation workflows
  • Knowledge of Elasticsearch or similar search technologies
  • Experience in eCommerce, marketplace, or auction platforms
  • Understanding of GDPR, data privacy, and compliance requirements
  • Experience with real-time analytics and event-driven architectures (Flink, Materialize); a minimal Kafka consumer sketch follows this list
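
The streaming and event-driven items above might translate into something like the following minimal Kafka consumer, sketched here with confluent-kafka; the broker address, topic name, and consumer group are assumptions made for illustration only.

```python
# Hypothetical Kafka consumer for real-time bid events.
# Broker address, topic, and group ID are assumptions, not real config.
from confluent_kafka import Consumer

consumer = Consumer(
    {
        "bootstrap.servers": "localhost:9092",
        "group.id": "bid-analytics",
        "auto.offset.reset": "earliest",
    }
)
consumer.subscribe(["auction.bids"])

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue  # no message arrived within the poll window
        if msg.error():
            print(f"consumer error: {msg.error()}")
            continue
        # A real pipeline would update aggregates or write to a feature store.
        print(msg.value().decode("utf-8"))
finally:
    consumer.close()
```

A production consumer would add batching, offset management, and richer error handling, and might hand events to a stream processor such as Flink instead of printing them.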