We are looking for a Databricks-certified Data Engineer to join our team; certification is required.
In this role, you will design, develop, and optimize scalable data pipelines and workflows on Databricks.
You will work closely with stakeholders to ensure data reliability, performance, and alignment with business requirements.
Scope of Work
- Data Pipeline Development: Building efficient ETL/ELT pipelines using Databricks and Delta Lake for structured, semi-structured, and unstructured data, and transforming raw data into consumable datasets for analytics and machine learning (first sketch after this list).
- Data Optimization: Improving performance by applying best practices such as partitioning, caching, and Delta Lake optimizations, and resolving bottlenecks to ensure scalability (second sketch below).
- Data Integration: Integrating data from sources such as APIs, databases, and cloud storage systems (e.g., AWS S3, Azure Data Lake Storage).
- Real-Time Streaming: Designing and deploying real-time data streaming solutions using Databricks Structured Streaming (third sketch below).
- Data Quality and Governance: Implementing data validation, schema enforcement, and monitoring to ensure high-quality data delivery, and using Unity Catalog to manage metadata, access permissions, and data lineage (fourth sketch below).
- Collaboration and Documentation: Collaborating with data analysts, data scientists, and other stakeholders to meet business needs, and documenting pipelines, workflows, and technical solutions.
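To give a concrete flavor of the pipeline work in scope, here is a minimal sketch of a batch ETL step. It assumes a Databricks notebook (where a `spark` session is predefined) and uses hypothetical names: a raw landing path `/mnt/raw/orders`, an assumed key column `order_id`, and a target Delta table `silver.orders`.

```python
from pyspark.sql import functions as F

# Read raw JSON from a hypothetical cloud-storage landing path.
raw = spark.read.json("/mnt/raw/orders")

cleaned = (
    raw
    .dropDuplicates(["order_id"])                     # assumed business key
    .withColumn("order_date", F.to_date("order_ts"))  # derive a partition column
    .filter(F.col("amount") > 0)                      # basic validation rule
)

# Write a consumable Delta table, partitioned for downstream query pruning.
(
    cleaned.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("order_date")
    .saveAsTable("silver.orders")  # hypothetical target table
)
```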
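For the optimization scope, a sketch of routine Delta Lake maintenance on the same hypothetical `silver.orders` table. `OPTIMIZE`, `ZORDER BY`, and `VACUUM` are Databricks/Delta SQL commands; `customer_id` is an assumed frequently-filtered column.

```python
# Compact small files and co-locate rows by a frequently filtered column.
spark.sql("OPTIMIZE silver.orders ZORDER BY (customer_id)")

# Remove data files no longer referenced by the table (default 7-day retention).
spark.sql("VACUUM silver.orders")

# Cache a hot slice for repeated interactive queries on the same cluster.
recent = spark.table("silver.orders").where("order_date >= date_sub(current_date(), 30)")
recent.cache()
recent.count()  # an action, which materializes the cache
```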
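For the streaming scope, a minimal Structured Streaming sketch, again assuming a Databricks runtime. The source table `bronze.events`, target table `silver.events`, and checkpoint path are all hypothetical; the `availableNow` trigger drains the backlog and stops, which suits scheduled jobs.

```python
# Incrementally read a Delta table as a stream (names are hypothetical).
events = spark.readStream.table("bronze.events")

query = (
    events.writeStream
    .option("checkpointLocation", "/mnt/checkpoints/events")  # exactly-once progress tracking
    .outputMode("append")
    .trigger(availableNow=True)  # process all available data, then stop
    .toTable("silver.events")
)
query.awaitTermination()  # block until the triggered run completes
```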
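And for data quality and governance, a sketch of the declarative guards involved: Delta Lake enforces a table's declared schema on write by default, a `CHECK` constraint rejects rows that violate a rule, and Unity Catalog permissions are plain SQL grants. The table name and the `data-analysts` group are assumptions.

```python
# Reject writes that violate a business rule at the table level.
spark.sql("ALTER TABLE silver.orders ADD CONSTRAINT positive_amount CHECK (amount > 0)")

# Unity Catalog: grant read access on the table to a hypothetical analyst group.
spark.sql("GRANT SELECT ON TABLE silver.orders TO `data-analysts`")
```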
Responsibilities
- Developing fully functional and documented data pipelines.
- Creating optimized and scalable data workflows on Databricks.
- Implementing real-time streaming solutions integrated with downstream systems.
- Providing detailed documentation for implemented solutions and best practices.
Skills and Qualifications
- Databricks certification (required), with proficiency in Databricks, Spark, and Delta Lake.
- Strong experience with Python, SQL, and ETL/ELT development.
- Familiarity with real-time data processing and streaming.
- Knowledge of cloud platforms (e.g., AWS, Azure, GCP).
- Experience with data governance and tools like Unity Catalog.
Assumptions
- Access to necessary datasets and cloud infrastructure will be provided.
- Stakeholders will provide timely input and feedback.
Success Metrics
- Data pipelines deliver accurate and consistent data.
- Workflows meet agreed performance benchmarks.
- Real-time streaming solutions operate within agreed latency targets.
- Stakeholders are satisfied with the quality and usability of the solutions.