Data Engineer - £350PD - Remote

Tenth Revolution Group

Remote

GBP 60,000 - 80,000

Part time

Job summary

A leading technology solutions provider is seeking a Data Engineer to design and maintain ETL pipelines. You will need strong AWS skills, especially with Glue and S3, to ensure data quality and governance across various data processes. The ideal candidate will have experience in Python and SQL, with an emphasis on data validation and integration. This is a remote position offering competitive rates.

Qualifications

  • Experience in designing and maintaining ETL/ELT pipelines.
  • Hands-on experience with AWS Data Services.
  • Strong error handling and monitoring strategies.

Responsibilities

  • Design, build, and maintain robust ETL/ELT pipelines.
  • Implement data validation and data quality frameworks.
  • Optimize data processing and integration tasks.

Skills

Data Pipeline & ETL
AWS Glue
Amazon S3
Python
SQL

Tools

AWS Lambda
CockroachDB

Job description

Required Technical Skills
Data Pipeline & ETL
  • Design, build, and maintain robust ETL/ELT pipelines for structured and unstructured data
  • Hands‑on experience with AWS Glue and AWS Step Functions
  • Implementation of data validation, data quality frameworks, and reconciliation checks
  • Strong error handling, monitoring, and retry strategies in production pipelines
  • Experience with incremental data processing patterns (CDC, watermarking, upserts)
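
As a minimal sketch of the watermarking pattern above (all names here are illustrative, assuming a hypothetical DynamoDB table pipeline_state as the watermark store and a source feed exposing an updated_at column):

```python
import boto3

state = boto3.resource("dynamodb").Table("pipeline_state")  # assumed watermark store

def load_watermark(job: str) -> str:
    """Return the last processed updated_at value, or the epoch on first run."""
    item = state.get_item(Key={"job": job}).get("Item")
    return item["watermark"] if item else "1970-01-01T00:00:00Z"

def incremental_load(job: str, fetch_since, upsert) -> None:
    """Fetch only rows newer than the watermark, upsert them, then advance it."""
    since = load_watermark(job)
    rows = fetch_since(since)  # e.g. SELECT ... WHERE updated_at > :since
    if rows:
        upsert(rows)  # idempotent writes make reruns and retries safe
        state.put_item(Item={"job": job,
                             "watermark": max(r["updated_at"] for r in rows)})
```
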
AWS Data Services
  • Amazon S3: data lake architectures, partitioning strategies, lifecycle policies
  • DynamoDB: data modeling, secondary indexes, streams, and performance optimization
  • Amazon Redshift: foundational querying, integrations, and performance considerations
  • AWS Lambda for scalable data processing and orchestration
  • Amazon EventBridge for event‑driven and decoupled data pipelines
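
By way of illustration, one common shape for the S3, Lambda, and EventBridge items together is an EventBridge-triggered function writing Hive-style dt= partitions; the bucket and prefix below are assumptions, not details of the role:

```python
import json
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")
BUCKET = "example-data-lake"  # hypothetical bucket

def handler(event, context):
    """Lambda entry point for an EventBridge rule: one event in, one object out."""
    now = datetime.now(timezone.utc)
    # dt= partitions let Glue crawlers and Athena prune scans by ingest date
    key = f"raw/events/dt={now:%Y-%m-%d}/{now:%H%M%S%f}.json"
    s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(event.get("detail", {})))
    return {"key": key}
```
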
Vector Databases & Embeddings
  • Strong understanding of vector database concepts, indexing strategies, and performance trade‑offs
  • Design and implementation of embedding generation pipelines
  • Optimization techniques for semantic search and retrieval accuracy
  • Effective chunking strategies for document ingestion and processing (see the sketch after this list)
  • Experience with CockroachDB deployment and management is beneficial
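
As a sketch of the chunking and embedding-pipeline items, a fixed-size overlapping chunker with a pluggable embedding function; the posting names no model, so the embed callable is deliberately left abstract:

```python
from typing import Callable, Iterable

def chunk_text(text: str, size: int = 800, overlap: int = 100) -> Iterable[str]:
    """Fixed-size character windows with overlap: a common baseline strategy."""
    step = size - overlap
    for start in range(0, max(len(text) - overlap, 1), step):
        yield text[start:start + size]

def embed_document(doc_id: str, text: str,
                   embed: Callable[[str], list[float]]) -> list[dict]:
    """Produce (id, chunk, vector) rows ready for a vector store upsert."""
    return [{"id": f"{doc_id}:{i}", "text": chunk, "vector": embed(chunk)}
            for i, chunk in enumerate(chunk_text(text))]
```

Overlap trades index size for retrieval recall; a structure-aware chunker could replace chunk_text without touching the rest of the pipeline.
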
Document Processing
  • Experience with PDF parsing libraries such as PyPDF2, pdfplumber, and AWS Textract (sketched after this list)
  • Integration of OCR solutions (AWS Textract, Tesseract) for scanned documents
  • Extraction of document structure (headings, tables, sections)
  • Metadata extraction, normalization, and enrichment
  • Handling of multiple document formats including PDF, HTML, and DOCX
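
For the parsing items, a small pdfplumber sketch for born-digital PDFs; scanned documents have no text layer and would instead go through Textract or Tesseract, per the OCR item above:

```python
import pdfplumber

def parse_pdf(path: str) -> dict:
    """Extract per-page text and any detected tables from a text-layer PDF."""
    pages, tables = [], []
    with pdfplumber.open(path) as pdf:
        for number, page in enumerate(pdf.pages, start=1):
            pages.append({"page": number, "text": page.extract_text() or ""})
            tables.extend(page.extract_tables())
    return {"pages": pages, "tables": tables}
```
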
Data Integration
  • Familiarity with SAP data structures is beneficial
  • Integration with PIM (Product Information Management) systems
  • Design and consumption of REST APIs
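
A hedged sketch of the REST-consumption item, assuming a hypothetical bearer-token product endpoint with page/size query parameters (real PIM APIs vary):

```python
import requests

def fetch_products(base_url: str, token: str, page_size: int = 100):
    """Yield products from a paginated endpoint until an empty page comes back."""
    session = requests.Session()
    session.headers["Authorization"] = f"Bearer {token}"
    page = 1
    while True:
        resp = session.get(f"{base_url}/products",
                           params={"page": page, "size": page_size}, timeout=30)
        resp.raise_for_status()
        items = resp.json().get("items", [])
        if not items:
            return
        yield from items
        page += 1
```
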
Programming & Querying
  • Python (advanced): pandas, numpy, boto3, and data processing best practices
  • SQL (advanced): complex queries, performance tuning, and query optimization
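
To make the Python and SQL expectations concrete, one illustrative pairing: a pandas latest-row-per-key dedup (reading Parquet from S3 assumes the s3fs extra is installed) and the same rule as a window-function query; all table and column names are invented:

```python
import pandas as pd

# hypothetical extract: normalize types early so dedup and joins stay cheap
df = pd.read_parquet("s3://example-data-lake/raw/orders/dt=2024-01-01/")
df["order_ts"] = pd.to_datetime(df["order_ts"], utc=True)
latest = (df.sort_values("order_ts")
            .drop_duplicates(subset=["order_id"], keep="last"))

# the same "latest row per key" rule in SQL, via a window function
QUERY = """
SELECT order_id, sku, order_ts
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY order_id
                              ORDER BY order_ts DESC) AS rn
    FROM raw.orders
) AS latest_per_key
WHERE rn = 1;
"""
```
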
Data Quality & Governance
  • Data profiling and ongoing quality assessment
  • Schema validation and evolution strategies (see the sketch after this list)
  • Data lineage tracking and observability
  • Understanding of Master Data Management (MDM) concepts
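
A plain-Python sketch of the profiling and schema-validation items, checking a DataFrame against a hypothetical column contract:

```python
import pandas as pd

EXPECTED = {  # assumed contract for one feed; names are illustrative
    "order_id": "int64",
    "sku": "object",
    "order_ts": "datetime64[ns, UTC]",
}

def validate_schema(df: pd.DataFrame) -> list[str]:
    """Return human-readable issues; an empty list means the frame conforms."""
    issues = [f"missing column: {c}" for c in EXPECTED if c not in df.columns]
    issues += [f"{c}: expected {t}, got {df[c].dtype}"
               for c, t in EXPECTED.items()
               if c in df.columns and str(df[c].dtype) != t]
    if "order_id" in df.columns and df["order_id"].duplicated().any():
        issues.append("order_id: duplicate keys")
    return issues
```
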
Domain Knowledge
  • Product catalog data models and hierarchies
  • E‑commerce data patterns and integrations
  • B2B data exchange and system integration

To apply for this role, please submit your CV or contact Dillon Blackburn on (phone number removed) or at (url removed).
