
Senior Data Engineer (with Python, PySpark and AWS)

Luxoft

Camden Town

On-site

GBP 60,000 - 80,000

Full time

Today

Job summary

A leading technology company in the UK is seeking a Senior Data Engineer to build and maintain scalable ETL pipelines using Python and PySpark. You will develop and optimize data workflows on AWS, design Snowflake data warehouses, and ensure data integrity and accuracy. Ideal candidates have over 5 years of experience and strong skills in AWS services. This role requires excellent communication and collaboration across teams.

Qualifications

  • 5+ years of experience in data engineering or software development.
  • Excellent English communication skills (C1 Advanced).
  • Experience with data governance frameworks and security best practices.

Responsibilities

  • Build and maintain scalable ETL pipelines using Python and PySpark.
  • Develop and optimize distributed data workflows on AWS EMR.
  • Design, implement, and tune Snowflake data warehouses.
  • Automate data workflows to improve efficiency.
  • Monitor pipeline performance and resolve issues proactively.
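The last responsibility above, monitoring pipeline performance, can be sketched in a few lines. This is a minimal illustration in plain Python (not the employer's actual tooling); the stage name and the `SLOW_THRESHOLD_S` value are hypothetical, chosen only for the example.

```python
import time
from contextlib import contextmanager

# Hypothetical threshold; a real pipeline would read this from config or an alerting rule.
SLOW_THRESHOLD_S = 0.5

@contextmanager
def timed_stage(name, log):
    """Time one pipeline stage and flag it if it exceeds the threshold."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        log.append((name, elapsed, elapsed > SLOW_THRESHOLD_S))

log = []
with timed_stage("extract", log):
    sum(range(1000))  # stand-in for real pipeline work

name, elapsed, slow = log[0]
print(name, slow)  # → extract False
```

In practice the flagged entries would feed a dashboard or alert (e.g. CloudWatch), so bottlenecks surface before users notice them.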

Skills

Strong proficiency in Python
Strong proficiency in PySpark
Hands-on experience with AWS EMR
Deep understanding of Snowflake
Solid grasp of data modeling
Excellent communication skills

Tools

AWS S3
AWS Lambda
AWS Glue
Git
Terraform
CloudFormation
Jenkins

Job description

  • Build and maintain scalable ETL pipelines using Python and PySpark to support data ingestion, transformation, and integration.
  • Develop and optimize distributed data workflows on AWS EMR for high-performance processing of large datasets.
  • Design, implement, and tune Snowflake data warehouses to support analytical workloads and reporting needs.
  • Partner with data scientists, analysts, and product teams to deliver reliable, well-documented datasets.
  • Ensure data integrity, consistency, and accuracy across multiple sources and systems.
  • Automate data workflows and processes to improve efficiency and reduce manual intervention.
  • Monitor pipeline performance, identify bottlenecks, and resolve issues proactively.
  • Apply best practices in CI/CD, version control (e.g., Git), and infrastructure-as-code (e.g., Terraform, CloudFormation).
  • Enforce data security, compliance, and governance standards in line with industry regulations.
  • Mentor junior engineers, conduct code reviews, and foster a culture of continuous learning and knowledge-sharing.
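The ingest-transform-validate work in the first bullets might look like the following minimal sketch. It is written in plain Python rather than PySpark so it runs standalone; the `Order` record, its field names, and the integrity rule are all hypothetical, not taken from the posting.

```python
from dataclasses import dataclass

# Hypothetical record type; field names are illustrative only.
@dataclass
class Order:
    order_id: str
    amount_pence: int
    currency: str

def extract(rows):
    """Parse raw rows into typed records, skipping malformed input."""
    records = []
    for row in rows:
        try:
            records.append(Order(row["order_id"], int(row["amount_pence"]), row["currency"]))
        except (KeyError, ValueError):
            continue  # a real pipeline would route these to a dead-letter store
    return records

def transform(records):
    """Normalise currency codes and drop non-positive amounts (an integrity check)."""
    return [Order(r.order_id, r.amount_pence, r.currency.upper())
            for r in records if r.amount_pence > 0]

raw = [
    {"order_id": "a1", "amount_pence": "250", "currency": "gbp"},
    {"order_id": "a2", "amount_pence": "-10", "currency": "gbp"},  # fails integrity rule
    {"order_id": "a3", "currency": "usd"},                         # malformed: missing amount
]
clean = transform(extract(raw))
print(len(clean), clean[0].currency)  # → 1 GBP
```

In the role as described, the same shape of logic would run as PySpark DataFrame transformations on EMR, with the cleaned output landing in Snowflake.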

We're seeking a highly skilled and motivated Senior Data Engineer to join our growing data team. In this role, you'll architect and maintain robust, scalable data pipelines and infrastructure that power our analytics, machine learning, and business intelligence initiatives. You'll work with cutting-edge technologies like Python, PySpark, AWS EMR, and Snowflake, and collaborate across teams to ensure data is clean, reliable, and actionable.

Qualifications
  • 5+ years of experience in data engineering or software development.
  • Strong proficiency in Python and PySpark.
  • Hands-on experience with AWS services, especially EMR, S3, Lambda, and Glue.
  • Deep understanding of Snowflake architecture and performance tuning.
  • Solid grasp of data modeling, warehousing concepts, and SQL optimization.
  • Familiarity with CI/CD tools (e.g., Jenkins, GitHub Actions) and infrastructure-as-code.
  • Experience with data governance frameworks and security best practices.
  • Excellent communication and collaboration skills; English at C1 (Advanced) level.