Job Search and Career Advice Platform

Data Engineer (PySpark)

Black Pearl Consult

Dubai

On-site

AED 120,000 - 150,000

Full time

Job summary

A dynamic consulting firm in Dubai is seeking a Data Engineer to design and maintain scalable data pipelines using Python and Apache Spark. The ideal candidate will have strong proficiency in Python, hands-on experience with PySpark, and a thorough understanding of data engineering practices. Responsibilities include conducting exploratory data analysis, optimizing Spark jobs, and deploying production-grade data solutions. Join an exciting company that offers a competitive salary package and the opportunity to work with modern data engineering tools.

Benefits

Competitive salary package
High exposure to data platforms
Collaborative work environment

Qualifications

  • Strong proficiency in Python.
  • Extensive hands-on experience with Apache Spark (PySpark).
  • Proven experience with Git for version control.
  • Experience with Apache Airflow and/or Jenkins.

Responsibilities

  • Design, develop and maintain robust, scalable data pipelines.
  • Conduct Exploratory Data Analysis to identify data patterns.
  • Optimize Spark jobs for performance and cost efficiency.
  • Deploy and tune production-grade data pipelines.

Skills

Python
Apache Spark (PySpark)
SQL
NoSQL databases
Git
Apache Airflow
Jupyter Notebooks

Job description

Key Responsibilities
  • Design, develop and maintain robust, scalable data pipelines using Python and PySpark
  • Perform data ingestion, transformation, cleansing and validation across structured and unstructured datasets
  • Conduct Exploratory Data Analysis (EDA) to identify data patterns, anomalies and quality issues
  • Apply data imputation techniques, data linking and cleansing to ensure high data quality
  • Implement feature engineering pipelines to support analytics and downstream use cases
  • Optimize Spark jobs for performance, scalability and cost efficiency
  • Deploy and tune production-grade data pipelines ensuring reliability and performance
  • Automate workflows using Apache Airflow and/or Jenkins
  • Collaborate with cross-functional teams to integrate data solutions into production systems
  • Write and maintain unit tests to ensure code quality and reliability
  • Manage source code, CI/CD and deployments using Git, GitHub and GitHub Actions
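As a rough illustration of the cleansing and imputation work described above, here is a minimal sketch in plain Python. The role calls for PySpark, where this logic would typically live in a DataFrame transformation; the function name and the sample schema below are hypothetical, chosen only for illustration.

```python
# Hypothetical sketch of a mean-imputation step, as one example of the
# "data imputation techniques" a pipeline like this might apply.
# Plain Python is used for brevity; in production this would usually be
# expressed as a PySpark transformation instead.

def impute_missing(records, field, default):
    """Fill missing values in `field` with the mean of the present values,
    falling back to `default` when no values are present at all."""
    present = [r[field] for r in records if r.get(field) is not None]
    fill = sum(present) / len(present) if present else default
    return [
        {**r, field: r[field] if r.get(field) is not None else fill}
        for r in records
    ]

rows = [{"id": 1, "age": 30}, {"id": 2, "age": None}, {"id": 3, "age": 40}]
cleaned = impute_missing(rows, "age", 0)
```

In a real pipeline the fill strategy (mean, median, constant) would be chosen per column, and the same function shape extends naturally to validation and cleansing passes.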
Requirements

To be considered for this role you need to meet the following criteria:

Required Technical Skills
  • Strong proficiency in Python
  • Extensive hands-on experience with Apache Spark (PySpark)
  • Experience working with Jupyter Notebooks
  • Strong knowledge of SQL and NoSQL databases
  • Proven experience with Git for version control and CI/CD
  • Hands-on experience with Apache Airflow and/or Jenkins for scheduling and automation
  • Solid understanding of data engineering best practices in production environments
  • Demonstrated experience in Spark performance tuning and optimization
  • Ability to write clean, testable and maintainable Python code
Mandatory Requirement
  • Previous production experience is a MUST, specifically in deploying, tuning and maintaining data pipelines in production environments
Preferred Qualifications
  • Experience working in high-volume or big data environments
  • Strong problem-solving and analytical skills
  • Ability to work independently in a fast-paced environment
Why Join
  • Competitive salary package
  • Opportunity to work on production-scale data platforms
  • Exposure to modern data engineering tools and practices
  • Dubai-based role with a dynamic and collaborative work environment