Job Search and Career Advice Platform

Senior Software Data Engineer

Madfish

United Kingdom

Remote

GBP 60,000 - 80,000

Full time

Yesterday

Job summary

A leading SaaS company in the United Kingdom seeks a Senior Software Data Engineer to design and optimize scalable data pipelines using PySpark and AWS. Candidates should have over 5 years of experience and strong expertise in Python, SQL, and automated testing. The role involves delivering accurate and reliable data outputs while collaborating across cross-functional teams. This remote position offers a full-time 6-month contract with potential for extension.

Skills

PySpark
Python
SQL
AWS
Automated testing
Debugging
Performance optimization
MLflow

Tools

Databricks (DBX)

Job description

Location: Remote
Job Type: Full-Time (6-month contract with possibility of extension)

About Us

We are a SaaS company that collects large-scale web data, analyzes it, and transforms it into actionable consumer insights for global brands.
Our offerings include:

  • Data-driven dashboards for eCommerce, product development, and social platforms
  • Classified catalogs of products, reviews, and social content (posts, videos, comments, etc.)
  • Data drops and analytical outputs used by enterprise clients

We work with massive datasets and cutting-edge technologies, and we value collaboration, problem-solving, and continuous learning.

Role Overview

We are looking for a highly skilled Senior Software Data Engineer to design, build, and optimize scalable data pipelines using AWS and the Databricks (DBX) ecosystem.

You will play a key role in ensuring the accuracy, reliability, and timeliness of our data outputs while contributing to our ML, MLflow, and LLM-driven capabilities.

You will collaborate closely with cross-functional teams including R&D, Product, and Delivery to validate features, troubleshoot issues, and deliver high-quality insights to clients.

Key Responsibilities

  • Design, build, and optimize scalable data pipelines using PySpark and AWS services
  • Deliver production-grade data outputs with high accuracy and reliability
  • Develop automated testing frameworks to support end-to-end data quality
  • Integrate ML, MLflow, and LLM-based workflows into data pipelines
  • Troubleshoot and resolve complex data and pipeline-related issues
  • Collaborate with Product Managers and Delivery Analysts to ensure release readiness
  • Maintain clear documentation and promote best practices across data engineering
  • Contribute to continuous improvement of our data infrastructure and workflows
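The automated-testing responsibility above can be illustrated with a minimal, engine-agnostic sketch of a data-quality gate. This is an assumption-laden example, not part of the posting: the column names, threshold, and function names are all hypothetical, and a real pipeline would run equivalent checks in PySpark rather than over plain Python dicts.

```python
# Hypothetical data-quality gate of the kind a pipeline release might run.
# Column names and the 1% null-rate threshold are illustrative assumptions.

def null_rate(rows, column):
    """Fraction of rows whose value for `column` is missing (None)."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)

def check_release_ready(rows, max_null_rate=0.01, required_columns=("sku", "price")):
    """Return a list of failed checks; an empty list means the data drop passes."""
    failures = []
    for col in required_columns:
        rate = null_rate(rows, col)
        if rate > max_null_rate:
            failures.append(f"{col}: null rate {rate:.2%} exceeds {max_null_rate:.2%}")
    return failures
```

In practice the same idea scales to Spark by replacing the row loop with aggregations over a DataFrame, with the failure list deciding whether a release proceeds.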

Requirements

  • 5+ years of professional experience as a Data Engineer
  • Strong expertise in PySpark, Python, and SQL
  • Experience with AWS data ecosystem
  • Practical background in automated testing and QA for data pipelines
  • Strong debugging and performance optimization skills
  • Experience working with Databricks (DBX)
  • Excellent communication skills in English and ability to collaborate across teams

Nice to Have

  • Experience working with big data and data lake architectures
  • Familiarity with CI/CD and DevOps practices
  • Experience with MLflow or LLM-driven pipelines
  • Knowledge of data governance and monitoring frameworks