

Data Engineer

Talent Bridge HR Consultancy Dubai

Dubai

On-site

AED 120,000 - 200,000

Full time

Yesterday


Job summary

A leading consultancy firm in Dubai is seeking a candidate to build and manage robust, observable data pipelines essential for AI and ML workflows. The ideal candidate has strong programming skills in C or Java, extensive knowledge of SQL/NoSQL databases, and experience with tools such as AWS SageMaker and Docker. The role demands high-quality data processes and cloud competency, and contributes directly to efficient use of data for research and production.

Qualifications

  • Programming skills in C or Java, along with SQL and NoSQL databases.
  • Experience with data manipulation using Pandas/NumPy and building pipelines with PySpark.
  • Familiarity with Airflow for workflow management and Docker for containerization.

Responsibilities

  • Architect and operate batch/stream data pipelines for ETL/ELT.
  • Model and manage data schemas, ensuring quality and governance.
  • Support ML workflows using DVC and MLflow or Weights & Biases.

Skills

Programming in C or Java
SQL & NoSQL
Pandas/NumPy
PySpark
Airflow
API development
Docker

Education

Proven experience building reliable pipelines in production

Tools

AWS SageMaker
Azure ML
GCP AI
DVC
MLflow
Weights & Biases

Job description

Role Summary

Build robust, observable data pipelines that power research and production AI. Success means high pipeline reliability (on-time SLAs), strong data quality (validation & lineage), and enabling fast experimentation. You will partner with AI/ML, analytics, and product teams to make data trustworthy and usable.

Responsibilities

  • Architect and operate batch/stream pipelines (Airflow; Spark optional) for ETL/ELT; a brief illustrative sketch follows this list.
  • Model/manage schemas; enforce data quality and lineage/governance.
  • Support ML workflows with DVC (data versioning) and MLflow or Weights & Biases.
  • Build feature stores/data services; expose datasets via secure REST endpoints.
  • Optimize performance/cost across storage/compute; implement monitoring/alerting.
  • Maintain documentation and internal catalogs; enable self-service analytics.
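
For context on the day-to-day work, here is a minimal, purely illustrative sketch of the kind of daily batch ETL DAG described above. It assumes Airflow 2.4+, and every task name, record, and value is a hypothetical placeholder, not part of this role's actual stack:

    # Minimal daily batch ETL DAG sketch (Airflow 2.4+); all names/data are hypothetical.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract(**_):
        # Placeholder: pull raw records from a source system.
        return [{"id": 1, "value": 42}, {"id": 2, "value": None}]

    def transform(ti, **_):
        # Placeholder: basic validation of the rows the extract task returned.
        rows = ti.xcom_pull(task_ids="extract")
        return [r for r in rows if r["value"] is not None]

    def load(ti, **_):
        # Placeholder: write the cleaned rows to the warehouse.
        print(ti.xcom_pull(task_ids="transform"))

    with DAG(
        dag_id="example_daily_etl",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",  # the "schedule" argument requires Airflow 2.4+
        catchup=False,
    ) as dag:
        t_extract = PythonOperator(task_id="extract", python_callable=extract)
        t_transform = PythonOperator(task_id="transform", python_callable=transform)
        t_load = PythonOperator(task_id="load", python_callable=load)

        t_extract >> t_transform >> t_load

Airflow schedules, retries, and monitors this extract >> transform >> load chain, which is how the on-time SLA and data-quality goals above are enforced in practice.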

Qualifications

  • Skills: Programming in C or Java; SQL & NoSQL; Pandas/NumPy; PySpark; Airflow; API development; Docker.
  • MLOps: DVC; MLflow or W&B; model packaging/deployment fundamentals (see the sketch after this list).
  • Cloud: AWS SageMaker, Azure ML, or GCP AI experience.
  • Nice to have: Unreal Engine exposure.
  • Environment: Solid Linux background for development and deployment.
  • Education/Experience: Proven experience building reliable pipelines in production.
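
For the MLflow tracking named in the MLOps bullet above, a minimal, purely illustrative sketch (every parameter and metric name below is hypothetical):

    # Minimal MLflow experiment-tracking sketch; names and values are hypothetical.
    import mlflow

    with mlflow.start_run(run_name="example-etl-validation"):
        mlflow.log_param("source_table", "events_raw")   # which dataset this run read
        mlflow.log_metric("rows_processed", 120000)      # pipeline throughput
        mlflow.log_metric("null_rate", 0.002)            # data-quality check result

DVC plays the complementary role here, versioning the datasets such runs consume.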