
Intermediate Data Engineer

k0deHut

Sandton

On-site

ZAR 600 000 - 800 000

Full time

Today

Job summary

A technology firm in Sandton, Gauteng is seeking an experienced Intermediate Data Engineer to design and maintain ETL pipelines. The ideal candidate will have strong Python and SQL skills and experience with big data platforms. You will collaborate closely with data scientists and software engineers to build scalable solutions for business applications. If you have a passion for data engineering and a commitment to excellence, we want to hear from you.

Qualifications

  • At least 3 years of proven experience as a Data Engineer.
  • Bachelor's or Master's degree in Computer Science or Engineering.
  • Experience with big data platforms and ETL tools.

Responsibilities

  • Design, build, and maintain ETL pipelines.
  • Optimize existing ETL pipelines for scalability.
  • Collaborate with team members to address data needs.

Skills

Python
SQL
Linux shell scripting
Data visualization
Collaboration with cross-functional teams

Education

Bachelor's or Master's degree in Computer Science or Engineering

Tools

Azure Data Factory
Databricks
Airflow
Kafka
Terraform

Job description

About the job: Intermediate Data Engineer

Job Purpose

We are seeking a talented and experienced Data Engineer to join our MLOps team, which drives critical business applications. As a key member of our team, you will play a crucial role in designing, building, testing, deploying, and monitoring end-to-end data pipelines for both batch and streaming use cases. You will work closely with data scientists, actuaries, software engineers, and other data engineers to contribute to architecting our Client's modern Machine Learning ecosystem.

Areas of responsibility may include, but are not limited to:

  • Design, build, and maintain ETL pipelines for both batch and streaming use cases.
  • Optimize and refactor existing ETL pipelines to improve efficiency, scalability, and cost-effectiveness.
  • Build data visualizations and reports.
  • Re-architect data pipelines on a modern data stack, leveraging current data tools to support actuarial, machine learning, and AI use cases.
  • Utilize expertise in Python and SQL for data pipeline development (see the illustrative sketch after this list).
  • Use Linux and shell scripting for system automation.
  • Hands-on experience working with Docker and container orchestration tools is advantageous.
  • Knowledge of Spark is advantageous.
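
For illustration only (not a requirement of the role), the sketch below shows the shape of a minimal batch ETL step in Python using only the standard library; the file, table, and column names are hypothetical assumptions rather than details from this posting.

    import csv
    import sqlite3

    # Hypothetical example: load a CSV of claims into a reporting table,
    # applying a simple cleanup transformation along the way.
    SOURCE_CSV = "claims.csv"    # assumed input file
    TARGET_DB = "warehouse.db"   # assumed SQLite target for the sketch

    def extract(path):
        """Read raw rows from the source CSV."""
        with open(path, newline="") as f:
            yield from csv.DictReader(f)

    def transform(rows):
        """Cast amounts to floats and drop rows without a policy id."""
        for row in rows:
            if not row.get("policy_id"):
                continue
            row["amount"] = float(row["amount"] or 0)
            yield row

    def load(rows, db_path):
        """Write transformed rows into the target table."""
        with sqlite3.connect(db_path) as conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS claims (policy_id TEXT, amount REAL)"
            )
            conn.executemany(
                "INSERT INTO claims (policy_id, amount) VALUES (:policy_id, :amount)",
                rows,
            )

    if __name__ == "__main__":
        load(transform(extract(SOURCE_CSV)), TARGET_DB)

In practice the same extract/transform/load split maps onto whichever orchestration and storage tools the team uses; only the transport layers change.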

Platforms and Tools:

  • Experience working with ETL tools such as Azure Data Factory, dbt, Airflow, Step Functions, etc.
  • Use Databricks, Kafka, and Spark Streaming for big data processing across multiple data sources (see the streaming sketch after this list).
  • Work with both relational and NoSQL databases; knowledge of and experience with high-performance in-memory databases is advantageous.
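
For illustration only, the sketch below shows a minimal streaming pipeline of the kind described above, using PySpark Structured Streaming to read from Kafka; the broker address, topic, schema, and output paths are hypothetical, and it assumes the Spark Kafka connector package is available on the cluster.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import DoubleType, StringType, StructType

    spark = SparkSession.builder.appName("claims-stream").getOrCreate()

    # Hypothetical event schema for the Kafka message payload.
    schema = (
        StructType()
        .add("policy_id", StringType())
        .add("amount", DoubleType())
    )

    # Read raw events from a Kafka topic as a streaming DataFrame.
    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")  # assumed broker
        .option("subscribe", "claims")                      # assumed topic
        .load()
    )

    # Parse the JSON value column into typed fields.
    parsed = (
        events.select(from_json(col("value").cast("string"), schema).alias("e"))
        .select("e.*")
    )

    # Continuously append parsed records to Parquet, with checkpointing.
    query = (
        parsed.writeStream.format("parquet")
        .option("path", "/tmp/claims_parquet")
        .option("checkpointLocation", "/tmp/claims_checkpoint")
        .outputMode("append")
        .start()
    )

    query.awaitTermination()

On Databricks the same pattern typically writes to Delta tables rather than raw Parquet, but the read-parse-write structure is unchanged.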

DevOps and Automation:

  • Work with Azure DevOps to automate workflows and collaborate with cross-functional teams.
  • Familiarity with Terraform for managing infrastructure as code (IaC) is advantageous.
  • Experience working on other big data platforms could be advantageous.
  • Create and maintain documentation of processes, technologies, and code bases.
  • Collaborate closely with data scientists, actuaries, software engineers, and other data engineers to understand and address their data needs.
  • Contribute actively to the architecture of our Client's modern Machine Learning data ecosystem.

Personal Attributes and Skills

  • Strong proficiency in Python, SQL, and Linux shell scripting.
  • Experience with Spark is advantageous.
  • Previous exposure to ETL tools, relational and NoSQL databases, and big data platforms, with experience in Databricks and Azure Data Factory being highly beneficial.
  • Knowledge of DevOps practices and tools, with experience in Azure DevOps being highly beneficial.
  • Familiarity with Terraform for infrastructure automation.
  • Ability to collaborate with cross-functional tech teams as well as business/product teams.
  • Ability to architect data pipelines for advanced analytics use cases.
  • A willingness to embrace a strong DevOps culture.
  • Commitment to excellence and high-quality delivery.
  • Passion for personal development and growth, with a high learning potential.

Education and Experience

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field. Other qualifications will be considered if accompanied by sufficient experience in data engineering.
  • At least 3 years of proven experience as a Data Engineer.