Data Engineer

ATG (Auction Technology Group)

City Of London

On-site

GBP 60,000 - 80,000

Full time

Today

Job summary

A leading UK technology company is looking for a Data Engineer to build and maintain scalable data pipelines that support analytics and ML applications. The ideal candidate has 5+ years of data engineering experience, strong Python programming skills, and expert-level SQL. The role involves collaborating with teams across the business to ensure data quality and accessibility, so strong problem-solving skills are essential.

Qualifications

  • 5+ years of experience building and maintaining data pipelines in production environments.
  • Expert-level SQL skills with performance tuning experience.
  • Hands-on experience with AWS services, including S3, Redshift, and Glue.

Responsibilities

  • Design, build, and maintain ETL/ELT pipelines for analytics and ML models.
  • Implement workflow orchestration for complex data dependencies.
  • Monitor pipeline health and data freshness.

Skills

Python programming
SQL
Data processing libraries (Pandas, PySpark)
Workflow orchestration tools (Airflow, Dagster)
AWS cloud services
Data warehousing solutions (Redshift, Snowflake)

Education

BSc or MSc in Computer Science, Data Engineering, or a related field

Tools

Git
Apache Spark
Docker

Job description

You have a passion for building scalable, reliable data systems that enable data scientists, ML engineers, and analysts to do their best work. You understand that great data products require more than moving data: they need robust pipelines, data quality assurance, and thoughtful architecture. You put reliability and scalability at the heart of everything you do, and you enable data-driven decisions through sound data modeling and pipeline design. You will be comfortable working cross-functionally with Product, Engineering, Data Science, Analytics, and MLOps teams to develop our products and improve the end-user experience. You have a strong track record of prioritizing effectively, meeting critical deadlines, and tackling challenges with a problem-solving mindset.

Key Responsibilities

Data Pipeline Development & Management
  • Design, build, and maintain robust ETL/ELT pipelines that support analytics, ML models, and business intelligence
  • Develop scalable batch and streaming data pipelines to process millions of auction events, user interactions, and transactions daily
  • Implement workflow orchestration using Airflow, Dagster, or similar tools to manage complex data dependencies
  • Build data validation and quality monitoring frameworks to ensure data accuracy and reliability; a minimal sketch of such an orchestrated, validated pipeline follows this list

ML & Analytics Infrastructure
  • Build feature engineering pipelines to support ML models for search, recommendations, and personalization
  • Integrate with feature stores to enable consistent feature computation across training and inference
  • Create datasets for model training, validation, and testing with proper versioning

Data Quality & Monitoring
  • Implement comprehensive data quality checks, anomaly detection, and alerting systems
  • Monitor pipeline health, data freshness, and SLA compliance
  • Create dashboards and reporting tools for data pipeline observability
  • Debug and resolve data quality issues and pipeline failures

Collaboration & Best Practices
  • Work closely with Data Scientists and ML Engineers to understand data requirements and deliver reliable datasets
  • Partner with Software Engineers to integrate data pipelines with application systems
  • Establish and document data engineering best practices, coding standards, and design patterns
  • Mentor junior engineers on data engineering principles and best practices
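
By way of illustration, the orchestration and validation responsibilities above might be expressed as a minimal Airflow DAG along the following lines. This is a sketch only, not ATG's actual pipeline: the DAG name, columns, file paths, and checks are all assumptions made for the example.

```python
# Hypothetical sketch of a daily ETL run with a data-quality gate.
# DAG name, columns, paths, and thresholds are illustrative assumptions.
from datetime import datetime, timedelta

import pandas as pd
from airflow.decorators import dag, task


@dag(
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
)
def auction_events_daily():
    @task
    def extract() -> str:
        # Stand-in for pulling yesterday's auction events from the raw store
        # (in practice this might read from S3 via boto3 or an Airflow hook).
        df = pd.DataFrame(
            {"event_id": [1, 2, 3], "amount_gbp": [120.0, 80.5, 310.0]}
        )
        path = "/tmp/auction_events.parquet"
        df.to_parquet(path)
        return path

    @task
    def validate(path: str) -> str:
        # Quality gate: fail the run on empty or null-ridden extracts so bad
        # records never reach downstream consumers.
        df = pd.read_parquet(path)
        if df.empty:
            raise ValueError("no rows extracted")
        if df["event_id"].isna().any():
            raise ValueError("null event_id values found")
        return path

    @task
    def load(path: str) -> None:
        # Placeholder for the warehouse load, e.g. a COPY into Redshift.
        print(f"loading {path} into the warehouse")

    load(validate(extract()))


auction_events_daily()
```

Dagster or Prefect would express the same dependency graph with their own primitives; the point is explicit task dependencies plus a validation step that blocks downstream loads.
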
Key Requirements

Required Qualifications
  • BSc or MSc in Computer Science, Data Engineering, Software Engineering, or a related field, or equivalent practical experience
  • 5+ years of experience building and maintaining data pipelines and infrastructure in production environments
  • Strong programming skills in Python, with experience in data processing libraries (Pandas, PySpark)
  • Expert-level SQL skills with experience in query optimization and performance tuning
  • Proven experience with workflow orchestration tools (Airflow, Dagster, Prefect, or similar)
  • Hands-on experience with cloud platforms (AWS preferred), including S3, Redshift, EMR, Glue, and Lambda
  • Experience with data warehousing solutions (Redshift, Snowflake, BigQuery, or similar)
  • Experience with version control systems (Git) and CI/CD practices for data pipelines
Technical Skills
  • Experience with distributed computing frameworks (Apache Spark, Dask, or similar)
  • Knowledge of both batch and streaming data processing (Kafka, Kinesis, or similar)
  • Familiarity with data formats (Parquet, ORC, Avro, JSON) and their trade-offs; a short Parquet example follows this list
  • Understanding of data quality frameworks and testing strategies
  • Previous work with vector databases (Pinecone, Milvus, etc.)
  • Experience with monitoring and observability tools (Prometheus, Grafana, CloudWatch)
  • Knowledge of infrastructure-as-code tools (Terraform, CloudFormation)
  • Understanding of containerization (Docker) and orchestration (Kubernetes) is a plus
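
To make the format trade-off above concrete, here is a small, hypothetical Python sketch (file names and columns are invented) showing why columnar formats like Parquet suit analytical workloads: a query touching two columns reads only those columns, while row-oriented JSON pays for every field on every read.

```python
# Hypothetical illustration of row-oriented vs columnar storage.
# File names and columns are invented for this example.
import pandas as pd

df = pd.DataFrame(
    {
        "lot_id": range(1_000),
        "hammer_price": [float(i) for i in range(1_000)],
        "bidder_count": [i % 7 for i in range(1_000)],
        "description": ["lorem ipsum"] * 1_000,
    }
)

# JSON Lines stores whole records as text: every reader pays for every field.
df.to_json("lots.json", orient="records", lines=True)

# Parquet stores columns separately, with compression and column statistics,
# so an analytical query can read only the columns it needs.
df.to_parquet("lots.parquet")
subset = pd.read_parquet("lots.parquet", columns=["lot_id", "hammer_price"])
print(subset.head())
```
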
Nice-to-Have
  • Familiarity with dbt (data build tool) for data transformation workflows
  • Knowledge of Elasticsearch or similar search technologies
  • Experience in eCommerce, marketplace, or auction platforms
  • Understanding of GDPR, data privacy, and compliance requirements
  • Experience with real-time analytics and event-driven architectures (Flink, Materialize); a minimal Kafka consumer sketch follows this list
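
The streaming and event-driven items above might translate into something like the following minimal Kafka consumer, sketched here with confluent-kafka; the broker address, topic name, and consumer group are assumptions made for illustration only.

```python
# Hypothetical Kafka consumer for real-time bid events.
# Broker address, topic, and group ID are assumptions, not real config.
from confluent_kafka import Consumer

consumer = Consumer(
    {
        "bootstrap.servers": "localhost:9092",
        "group.id": "bid-analytics",
        "auto.offset.reset": "earliest",
    }
)
consumer.subscribe(["auction.bids"])

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue  # no message arrived within the poll window
        if msg.error():
            print(f"consumer error: {msg.error()}")
            continue
        # A real pipeline would update aggregates or write to a feature store.
        print(msg.value().decode("utf-8"))
finally:
    consumer.close()
```

A production consumer would add batching, offset management, and richer error handling, and might hand events to a stream processor such as Flink instead of printing them.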