
Data and AI - Lead PySpark Engineer (Cloud Migration)

Randstad Digital

Remote

GBP 70,000 - 90,000

Part time



Job summary

A leading IT recruitment company is looking for a Lead PySpark Engineer to oversee a large-scale data modernisation project. This role entails migrating legacy data workflows to a high-performance AWS environment, with responsibilities including code conversion and performance tuning. The ideal candidate will have 5+ years of experience with PySpark, strong AWS knowledge, and a track record of applying clean coding principles. This position is fully remote within the UK, offering competitive benefits.

Benefits

33 days holiday entitlement (pro-rata)

Qualifications

  • 5+ years of hands-on experience with PySpark and Spark SQL.
  • Strong proficiency in AWS services including EMR, Glue, S3, and Athena.
  • Solid foundation in SAS for understanding legacy code conversion.

Responsibilities

  • Lead the migration of SAS code to PySpark using automated tools.
  • Design and build complex ETL/ELT workflows on AWS.
  • Optimise Spark workloads for performance and cost-effectiveness.

Skills

PySpark
AWS Data Stack
Data Modeling

Tools

Git
CI/CD

Job description

Lead PySpark Engineer (Cloud Migration)

Role Type: 5-Month Contract

Location: Remote (UK-Based)

Experience Level: Lead / Senior (5+ years PySpark)

Role Overview

We are seeking a Lead PySpark Engineer to drive a large-scale data modernisation project, transitioning legacy data workflows into a high-performance AWS cloud environment. This is a hands‑on technical role focused on converting legacy SAS code into production-ready PySpark pipelines within a complex financial services landscape.

Key Responsibilities
  • Code Conversion: Lead the end-to-end migration of SAS code (Base SAS, Macros, DI Studio) to PySpark using automated tools (SAS2PY) and manual refactoring; a minimal illustrative conversion is sketched after this list.
  • Pipeline Engineering: Design, build, and troubleshoot complex ETL/ELT workflows and data marts on AWS.
  • Performance Tuning: Optimise Spark workloads for execution efficiency, partitioning, and cost-effectiveness.
  • Quality Assurance: Implement clean coding principles, modular design, and robust unit/comparative testing to ensure data accuracy throughout the migration.
  • Engineering Excellence: Maintain Git‑based workflows, CI/CD integration, and comprehensive technical documentation.
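
For context, the sketch below gives a minimal, hypothetical example of the kind of work involved: a simple SAS DATA step filter-and-derive rewritten as a PySpark transformation, with the output partitioned for cheaper downstream scans. All table, column, and bucket names are illustrative assumptions, not details of the actual project.

    from pyspark.sql import SparkSession, functions as F

    # Hypothetical legacy SAS step being converted (names are illustrative only):
    #   data work.high_value;
    #       set src.transactions;
    #       where amount > 1000;
    #       fee = amount * 0.01;
    #   run;

    spark = SparkSession.builder.appName("sas_to_pyspark_sketch").getOrCreate()

    transactions = spark.read.parquet("s3://example-bucket/raw/transactions/")

    high_value = (
        transactions
        .where(F.col("amount") > 1000)                      # SAS WHERE clause
        .withColumn("fee", F.col("amount") * F.lit(0.01))   # derived column from the DATA step
        .withColumn("txn_month", F.date_format("txn_date", "yyyy-MM"))
    )

    # Partitioning the output by month keeps Athena/Glue scans narrow, which is the
    # sort of execution-efficiency and cost tuning the role describes.
    (high_value
        .repartition("txn_month")
        .write.mode("overwrite")
        .partitionBy("txn_month")
        .parquet("s3://example-bucket/curated/high_value/"))

In practice, each converted step would also be checked against the original SAS output (row counts, key aggregates) as part of the comparative testing mentioned above.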

Technical Requirements
  • PySpark (P3): 5+ years of hands‑on experience writing scalable, production‑grade PySpark/Spark SQL.
  • AWS Data Stack (P3): Strong proficiency in EMR, Glue, S3, Athena, and Glue Workflows.
  • SAS Knowledge (P1): Solid foundation in SAS to enable the understanding and debugging of legacy logic for conversion.
  • Data Modeling: Expertise in ETL/ELT, dimensions, facts, SCDs, and data mart architecture.
  • Engineering Quality: Experience with parameterisation, exception handling, and modular Python design (a short illustrative sketch follows this list).
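
To make the parameterisation, exception-handling, and modular-design expectations concrete, here is a hypothetical job skeleton in that spirit; the job name, column names, and paths are invented for illustration and are not taken from the engagement.

    import argparse
    import logging

    from pyspark.sql import DataFrame, SparkSession, functions as F

    logger = logging.getLogger(__name__)


    def build_customer_dim(orders: DataFrame) -> DataFrame:
        """Aggregate raw orders into a simple customer dimension (illustrative logic only)."""
        return (
            orders.groupBy("customer_id")
            .agg(
                F.sum("amount").alias("lifetime_value"),
                F.max("order_date").alias("last_order_date"),
            )
        )


    def main(source_path: str, target_path: str) -> None:
        logging.basicConfig(level=logging.INFO)
        spark = SparkSession.builder.appName("customer_dim_build").getOrCreate()
        try:
            orders = spark.read.parquet(source_path)
            build_customer_dim(orders).write.mode("overwrite").parquet(target_path)
        except Exception:
            logger.exception("customer_dim_build failed")  # surface the failure to the scheduler
            raise
        finally:
            spark.stop()


    if __name__ == "__main__":
        # Paths are passed as parameters rather than hard-coded, so the same job can
        # run across environments and be exercised in isolation during testing.
        parser = argparse.ArgumentParser()
        parser.add_argument("--source-path", required=True)
        parser.add_argument("--target-path", required=True)
        args = parser.parse_args()
        main(args.source_path, args.target_path)

Keeping the transformation in a pure function such as build_customer_dim also makes unit and comparative testing straightforward, since it can be run against small in-memory DataFrames.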

Additional Details
  • Industry: Financial Services experience is highly desirable.
  • Working Pattern: Fully remote with internal team collaboration days.
  • Benefits: 33 days holiday entitlement (pro‑rata).