The Databricks Data Engineer will be responsible for designing, developing, and maintaining scalable, high-performance data pipelines on the Databricks Lakehouse Platform. This role involves using Apache Spark, Delta Lake, and various Databricks services to process large volumes of batch and streaming data, ensuring data quality, reliability, and accessibility for data consumers.
Key Responsibilities
- Data Pipeline Development: Design, build, and maintain robust and scalable ETL/ELT pipelines using Databricks, PySpark/Scala, and SQL to ingest, transform, and load data from diverse sources (e.g., databases, APIs, streaming services) into Delta Lake (see the illustrative sketch after this list).
- Databricks Ecosystem Utilization: Leverage core Databricks features such as Delta Lake, Databricks Workflows (Jobs), Databricks SQL, and Unity Catalog for pipeline orchestration, data management, and governance.
- Performance Optimization: Tune and optimize Spark jobs and Databricks clusters for maximum efficiency, performance, and cost-effectiveness.
- Data Quality and Governance: Implement data quality checks, validation rules, and observability frameworks. Adhere to data governance policies and leverage Unity Catalog for fine-grained access control.
- Collaboration: Work closely with Data Scientists, Data Analysts, and business stakeholders to translate data requirements into technical solutions and ensure data is structured to support analytics and machine learning use cases.
- Automation & DevOps: Implement CI/CD and DataOps principles for automated deployment, testing, and monitoring of data solutions.
- Documentation: Create and maintain technical documentation for data pipelines, data models, and processes.
- Troubleshooting: Monitor production pipelines, troubleshoot complex issues, and perform root cause analysis to ensure system reliability and stability.
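For illustration, the kind of batch ETL step described in the Data Pipeline Development bullet above might look like the following minimal PySpark sketch. This is a hedged example only: the storage path, table name, and column names are hypothetical placeholders, not part of any actual pipeline for this role.

```python
# Minimal illustrative sketch of a batch ETL step into Delta Lake.
# All paths, table names, and column names below are hypothetical.
from pyspark.sql import SparkSession, functions as F

# On Databricks the SparkSession is provided; getOrCreate() reuses it.
spark = SparkSession.builder.getOrCreate()

# Ingest: read raw JSON files landed in cloud storage (hypothetical path).
raw_orders = spark.read.format("json").load("/mnt/raw/orders/")

# Transform: basic type casting plus a simple data quality filter.
clean_orders = (
    raw_orders
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .filter(F.col("order_id").isNotNull())
)

# Load: append into a managed Delta table for downstream consumers.
(
    clean_orders.write
    .format("delta")
    .mode("append")
    .saveAsTable("analytics.clean_orders")
)
```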
Qualifications
Required Skills & Experience
- 5+ years of hands‑on experience in Data Engineering.
- 3+ years of dedicated experience building solutions on the Databricks Lakehouse Platform.
- Expert proficiency in Python (PySpark) and SQL for data manipulation and transformation.
- In-depth knowledge of Apache Spark and distributed computing principles.
- Experience with Delta Lake and Lakehouse architecture.
- Strong understanding of ETL/ELT processes, data warehousing, and data modeling concepts.
- Familiarity with at least one major cloud platform (AWS, Azure, or GCP) and its relevant data services.
Preferred Skills & Certifications
- Experience with Databricks features like Delta Live Tables (DLT), Databricks Workflows, and Unity Catalog (see the sketch after this list).
- Experience with streaming technologies (e.g., Kafka, Spark Streaming).
- Familiarity with CI/CD tools and Infrastructure-as-Code (e.g., Terraform, Databricks Asset Bundles).
- Databricks Certified Data Engineer Associate or Professional certification.
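As a flavor of the preferred Delta Live Tables experience, a declarative pipeline might be sketched as below. This is an assumption-laden sketch: the dataset names, storage path, and expectation rule are hypothetical, and the `dlt` module and `spark` session are supplied by the Databricks pipeline runtime rather than defined here.

```python
# Illustrative Delta Live Tables sketch; dataset names, the storage path,
# and the expectation rule are hypothetical. `dlt` and `spark` are provided
# by the Databricks pipeline runtime.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw events ingested from cloud storage (hypothetical path)")
def raw_events():
    return spark.read.format("json").load("/mnt/raw/events/")

@dlt.table(comment="Validated events for downstream analytics")
@dlt.expect_or_drop("valid_event_id", "event_id IS NOT NULL")
def clean_events():
    return (
        dlt.read("raw_events")
        .withColumn("event_ts", F.to_timestamp("event_ts"))
    )
```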