Senior Data Engineer

Washmen

Dubai

On-site

AED 120,000 - 200,000

Full time

Yesterday

Job summary

A leading logistics company in Dubai is seeking a Senior Data Engineer to develop and optimize data infrastructure. This role involves architecting scalable data pipelines on AWS and ensuring data quality and performance. Candidates should have expertise in Apache Spark, Databricks, and AWS data services, along with 6-10 years of experience in data engineering.

Benefits

Competitive salary based on experience
Comprehensive health benefits

Qualifications

  • 6-10 years in data engineering with focus on building scalable data platforms.
  • Proven track record architecting and implementing data infrastructure from scratch.
  • Expert in Apache Spark with performance tuning experience.

Responsibilities

  • Design, build, and maintain scalable data pipelines using Spark and Databricks.
  • Architect and manage data infrastructure on AWS.
  • Build and maintain workflow orchestration using Airflow.

Skills

Apache Spark
Databricks
AWS
Data Modeling
Orchestration
SQL
Python

Tools

Airflow
Terraform
Git

Job description

Position Overview

We're seeking a self-sufficient Senior Data Engineer to build and scale the data infrastructure supporting our product, engineering, and analytics teams. You'll architect data pipelines, optimize our data platform, and ensure these teams have reliable, high-quality data to drive business decisions.

This is a hands-on role for someone who can own the entire data engineering stack, from ingestion to transformation to orchestration. You'll work independently to solve complex data challenges and build scalable solutions.

Core Responsibilities

Data Pipeline Development & Optimization

  • Design, build, and maintain scalable data pipelines using Spark and Databricks
  • Develop ETL/ELT workflows to process large volumes of customer behavior data
  • Optimize Spark jobs for performance, cost efficiency, and reliability
  • Build real-time and batch data processing solutions
  • Implement data quality checks and monitoring throughout pipelines (see the sketch after this list)
  • Ensure data freshness and SLA compliance for analytics workloads
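
To give candidates a concrete sense of this work, here is a minimal PySpark sketch of a batch pipeline with an inline data quality gate. Every path, column name, and threshold below is a hypothetical example, not our actual setup.

```python
# Minimal PySpark batch-pipeline sketch with an inline data quality gate.
# All paths, column names, and thresholds are hypothetical examples.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("events_daily").getOrCreate()

# Ingest: raw customer-behavior events from a hypothetical S3 prefix.
raw = spark.read.json("s3://example-bucket/raw/events/dt=2024-01-01/")

# Transform: drop malformed rows and derive typed columns.
clean = (
    raw.filter(F.col("user_id").isNotNull())
       .withColumn("event_ts", F.to_timestamp("event_time"))
       .withColumn("event_date", F.to_date("event_ts"))
)

# Quality gate: fail the run early if too many rows were dropped.
raw_count, clean_count = raw.count(), clean.count()
if raw_count > 0 and clean_count / raw_count < 0.95:
    raise ValueError(f"Quality gate failed: kept {clean_count}/{raw_count} rows")

# Load: partitioned Parquet, ready for downstream analytics.
clean.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/curated/events/"
)
```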

AWS Data Infrastructure

  • Architect and manage data infrastructure on AWS (S3, Glue, EMR, Redshift)
  • Design and implement data lake architecture with proper partitioning and optimization
  • Configure and optimize AWS Glue for ETL jobs and data cataloging (see the sketch after this list)
  • Migrate Glue jobs to Zero-ETL integrations where applicable
  • Implement security best practices for data access and governance
  • Monitor and optimize cloud costs related to data infrastructure
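
As one illustration of the Glue cataloging work above, the boto3 sketch below makes a freshly written data lake partition queryable through the Glue Data Catalog. The crawler, database, table, bucket, and region names are all assumptions for illustration only.

```python
# Sketch: make a new data lake partition queryable via the Glue Data Catalog
# (and hence Athena/Redshift Spectrum). All resource names are hypothetical.
import boto3

glue = boto3.client("glue", region_name="me-central-1")  # region is an assumption

# Option A: re-run the crawler that catalogs the curated events table.
glue.start_crawler(Name="curated-events-crawler")

# Option B: register the single new partition directly, skipping a full crawl.
glue.batch_create_partition(
    DatabaseName="analytics",
    TableName="events",
    PartitionInputList=[{
        "Values": ["2024-01-01"],
        "StorageDescriptor": {
            "Location": "s3://example-bucket/curated/events/event_date=2024-01-01/",
            "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
            },
        },
    }],
)
```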

Data Modeling & Architecture

  • Design and implement dimensional data models for analytics
  • Build star/snowflake schemas optimized for analytical queries
  • Create data marts for specific business domains (retention, campaigns, product)
  • Ensure data model scalability and maintainability
  • Document data lineage, dependencies, and business logic
  • Implement slowly changing dimensions and historical tracking (a sketch follows this list)
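
For the slowly changing dimensions item above, one common pattern is a Type 2 upsert with a Delta Lake MERGE. The sketch below assumes hypothetical table and column names (dim_customer, customer_id, address, valid_from/valid_to); it is illustrative, not our actual schema.

```python
# Sketch of a Type 2 slowly changing dimension using a Delta Lake MERGE.
# Table and column names are hypothetical; a production version would first
# filter the updates down to rows whose tracked attributes actually changed.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
dim = DeltaTable.forName(spark, "analytics.dim_customer")
updates = spark.table("staging.customer_updates")

# Step 1: close out the current rows whose attributes changed.
(dim.alias("d")
    .merge(updates.alias("u"),
           "d.customer_id = u.customer_id AND d.is_current = true")
    .whenMatchedUpdate(
        condition="d.address <> u.address",
        set={"is_current": "false", "valid_to": "current_timestamp()"},
    )
    .execute())

# Step 2: append the new attribute versions as the current rows.
(updates
    .withColumn("is_current", F.lit(True))
    .withColumn("valid_from", F.current_timestamp())
    .withColumn("valid_to", F.lit(None).cast("timestamp"))
    .write.format("delta").mode("append").saveAsTable("analytics.dim_customer"))
```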

Orchestration & Automation

  • Build and maintain workflow orchestration using Airflow or similar tools
  • Implement scheduling, monitoring, and alerting for data pipelines
  • Create automated data quality validation frameworks
  • Design retry logic and error handling for production pipelines (illustrated in the sketch after this list)
  • Build CI/CD pipelines for data workflows
  • Automate infrastructure provisioning using Infrastructure as Code
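
A minimal Airflow sketch of the scheduling, retry, and alerting patterns above, assuming Airflow 2.4+; the DAG id, tasks, and alert address are hypothetical:

```python
# Minimal Airflow DAG sketch: daily schedule, retries, and failure alerting.
# Assumes Airflow 2.4+; DAG id, tasks, and alert address are hypothetical.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

default_args = {
    "retries": 3,                          # retry logic for transient failures
    "retry_delay": timedelta(minutes=5),
    "email": ["data-alerts@example.com"],  # hypothetical alert address
    "email_on_failure": True,
}

def run_transform(**_):
    ...  # placeholder: Spark/Databricks transformation step

def run_quality_checks(**_):
    ...  # placeholder: automated data quality validation

with DAG(
    dag_id="events_daily",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args=default_args,
):
    transform = PythonOperator(task_id="transform", python_callable=run_transform)
    validate = PythonOperator(task_id="quality_checks", python_callable=run_quality_checks)
    transform >> validate  # quality checks gate the pipeline's downstream consumers
```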

Cross-Functional Collaboration

  • Partner with the Senior Data Analyst to understand analytics requirements
  • Work with the Growth Director and team to enable data-driven decision making
  • Support the CRM Lead with data needs for campaign execution
  • Collaborate with Product and Engineering on event tracking and instrumentation
  • Document technical specifications and best practices for the team
  • Work closely with all squads and establish data contracts with engineers to land data in the most optimal way

Required Qualifications

  • Must-Have Technical Skills
    • Apache Spark: Expert-level proficiency in PySpark/Spark SQL for large-scale data processing
    • Databricks: Strong hands-on experience building and optimizing pipelines on Databricks platform
    • AWS: Deep knowledge of AWS data services (S3, Glue, EMR, Redshift, Athena)
    • Data Modeling: Proven experience designing dimensional models and data warehouses
    • Orchestration: Strong experience with workflow orchestration tools (Airflow, Prefect, or similar)
    • SQL: Advanced SQL skills for complex queries and optimization
    • Python: Strong programming skills for data engineering tasks
  • Experience
    • 6-10 years in data engineering with focus on building scalable data platforms
    • Proven track record architecting and implementing data infrastructure from scratch
    • Experience processing large volumes of event data (billions of records)
    • Background in high-growth tech companies or consumer-facing products
    • Experience with mobile/web analytics data preferred
  • Technical Requirements
    • Expert in Apache Spark (PySpark and Spark SQL) with performance tuning experience
    • Deep hands-on experience with Databricks (clusters, jobs, notebooks, Delta Lake)
    • Strong AWS expertise: S3, Glue, EMR, Redshift, Athena, Lambda, CloudWatch
    • Proficiency with orchestration tools: Airflow, Prefect, Step Functions, or similar
    • Advanced data modeling skills: dimensional modeling, normalization, denormalization
    • Experience with data formats: Parquet, Avro, ORC, Delta Lake
    • Version control with Git and CI/CD practices
    • Infrastructure as Code: Terraform, CloudFormation, or similar
    • Understanding of data streaming technologies (Kafka, Kinesis) is a plus
  • Core Competencies
    • Self-sufficient: You figure things out independently without constant guidance
    • Problem solver: You diagnose and fix complex data pipeline issues autonomously
    • Performance-focused: You optimize for speed, cost, and reliability
    • Quality-driven: You build robust, maintainable, and well-documented solutions
    • Ownership mindset: You take end-to-end responsibility for your work
    • Collaborative: You work well with analysts and business stakeholders despite being independent
  • Nice-to-Have
    • Databricks certifications (Data Engineer Associate/Professional)
    • Experience with dbt for data transformation
    • Knowledge of customer data platforms (Segment, mParticle, Rudderstack)
    • Experience with event tracking platforms (Mixpanel, Amplitude)
    • Familiarity with machine learning infrastructure and MLOps
    • Experience in MENA region or emerging markets
    • Background in on-demand services, marketplaces, or subscription businesses
    • Knowledge of real-time streaming architectures

What We Offer

  • Competitive salary based on experience
  • Ownership of critical data infrastructure and architecture decisions
  • Work with a modern data stack and cutting-edge AWS technologies
  • Direct impact on business decisions through data platform improvements
  • Comprehensive health benefits