Lead Data Engineer (Remote)

Circana

Cape Town

Remote

ZAR 1,563,000 - 2,086,000

Full time

Job summary

A global data solutions company is seeking a skilled Data Engineer who will be responsible for building and maintaining data pipelines on the Azure platform. This involves leveraging technologies like PySpark, Apache Spark, and Airflow to ensure data quality and efficiency. The role offers a flexible environment and the opportunity to lead a team of data engineers. Excellent communication skills are essential as this is a client-facing position.

Qualifications

  • Client-facing role requiring strong communication.
  • Experience with data engineering and data pipelines.
  • Strong programming skills and ability to write efficient code.

Responsibilities

  • Design and develop scalable data workflows using Python.
  • Manage cloud infrastructure for cost optimization.
  • Lead a team of data engineers and provide mentorship.

Skills

Communication and collaboration skills
Data engineering expertise in Azure
Proficient in PySpark
Proficient in Apache Spark
Expertise in Airflow
Strong in Python programming
SQL proficiency
Problem-solving skills
Experience with Agile/Scrum

Tools

Git
Docker
Kubernetes
Terraform

Job description

Overview

At Circana, we are fueled by our passion for continuous learning and growth, we seek and share feedback freely, and we celebrate victories both big and small in an environment that is flexible and accommodating to our work and personal lives. We have a global commitment to diversity, equity, and inclusion as we believe in the undeniable strength that diversity brings to our business, employees, clients, and communities. With us, you can always bring your full self to work. Join our inclusive, committed team to be a challenger, own outcomes, and stay curious together. Circana is proud to be Certified by Great Place To Work. This prestigious award is based entirely on what current employees say about their experience working at Circana.

What will you be doing?

We are seeking a skilled and motivated Data Engineer to join a growing global team based in the UK. In this role, you will be responsible for designing, building, and maintaining robust data pipelines and infrastructure on the Azure cloud platform. You will leverage your expertise in PySpark, Apache Spark, and Apache Airflow to process and orchestrate large-scale data workloads, ensuring data quality, efficiency, and scalability. If you have a passion for data engineering and a desire to make a significant impact, we encourage you to apply!

Responsibilities
Data Engineering & Data Pipeline Development
  • Design, develop, and optimize scalable data workflows using Python, PySpark, and Airflow
  • Implement real-time and batch data processing using Spark
  • Enforce best practices for data quality, governance, and security throughout the data lifecycle
  • Ensure data availability, reliability and performance through monitoring and automation
Cloud Data Engineering
  • Manage cloud infrastructure and cost optimization for data processing workloads
  • Implement CI/CD pipelines for data workflows to ensure smooth and reliable deployments
Big Data & Analytics
  • Build and optimize large-scale data processing pipelines using Apache Spark and PySpark
  • Implement data partitioning, caching, and performance tuning for Spark-based workloads
  • Work with diverse data formats (structured and unstructured) to support advanced analytics and machine learning initiatives
Workflow Orchestration (Airflow)
  • Design and maintain DAGs (Directed Acyclic Graphs) in Airflow to automate complex data workflows
  • Monitor, troubleshoot, and optimize job execution and dependencies
Team Leadership & Collaboration
  • Lead a team of data engineers, providing technical guidance and mentorship
  • Foster a collaborative environment and promote best practices for coding standards, version control, and documentation
Requirements
  • This is a client-facing role; strong communication and collaboration skills are vital
  • Experience in data engineering with expertise in Azure, PySpark, Spark, and Airflow
  • Strong programming skills in Python and SQL, with the ability to write efficient and maintainable code
  • Deep understanding of Spark internals (RDDs, DataFrames, DAG execution, partitioning, etc.)
  • Experience with Airflow DAGs, scheduling, and dependency management
  • Knowledge of Git, Docker, Kubernetes, and Terraform, and the ability to apply DevOps best practices to CI/CD workflows
  • Excellent problem-solving skills and ability to optimize large-scale data processing
  • Experience in leading teams and working in Agile/Scrum environments
  • A proven track record of working effectively with global remote teams
Desirable
  • Experience with data modelling and data warehousing concepts
  • Familiarity with data visualization tools and techniques
  • Knowledge of machine learning algorithms and frameworks
Circana Behaviours

Alongside the technical skills, experience, and attributes required for the role, our shared behaviours sit at the core of our organization. We therefore always look for people who can champion these behaviours throughout the business in their day-to-day role:

  • Stay Curious: Being hungry to learn and grow, always asking the big questions
  • Seek Clarity: Embracing complexity to create clarity and inspire action
  • Own the Outcome: Being accountable for decisions and taking ownership of our choices
  • Centre on the Client: Relentlessly adding value for our customers
  • Be a Challenger: Never complacent, always striving for continuous improvement
  • Champion Inclusivity: Fostering trust in relationships by engaging with empathy, respect, and integrity
  • Commit to Each Other: Contributing to making Circana a great place to work for everyone
Location

This position can be located in the following area(s): Remote or Bracknell, UK
