Senior PySpark Data Engineer | CDP ETL & Data Lakes

VirtusaPolaris - Virtusa Corporation

Dubai

On-site

AED 120,000 - 200,000

Full time

Yesterday

Job summary

A leading technology company in Dubai seeks an experienced Data Engineer to design and develop scalable ETL pipelines using PySpark on the Cloudera Data Platform. The ideal candidate has 8+ years of experience, advanced proficiency in data ingestion and transformation, and strong skills in performance optimization and data quality. Knowledge of orchestration tools such as Apache Oozie and Airflow, along with Linux scripting skills, is also important. The role offers a collaborative environment with opportunities to work on diverse data-driven initiatives.

Qualifications

  • 8+ years of experience as a Data Engineer, focusing on PySpark and the Cloudera Data Platform.
  • Advanced proficiency in working with RDDs and DataFrames in PySpark.
  • Strong experience with Cloudera components including Cloudera Manager, Hive, and Impala.

Responsibilities

  • Design and develop ETL pipelines using PySpark on the Cloudera Data Platform.
  • Implement and manage data ingestion from various sources to the data lake.
  • Conduct performance tuning of PySpark code and optimize Cloudera components.

Skills

PySpark
Data Engineering
Cloudera Data Platform
SQL
Data Warehousing

Education

Bachelor’s or Master’s degree in Computer Science, Data Engineering, Information Systems, or related field

Tools

Apache Oozie
Airflow