

Data Engineer

Talent Bridge HR Consultancy Dubai

Dubai

On-site

AED 120,000 - 200,000

Full time

Yesterday


Job summary

A leading consultancy firm in Dubai is seeking a candidate to build and manage robust, observable data pipelines essential for AI and ML workflows. The ideal candidate has strong programming skills in C or Java, extensive knowledge of SQL/NoSQL databases, and experience with tools such as AWS SageMaker and Docker. The role demands high-quality data processes and cloud competency, and contributes directly to efficient use of data for research and production.

Qualifications

  • Programming skills in C or Java, along with SQL and NoSQL databases.
  • Experience with data manipulation using Pandas/NumPy and building pipelines with PySpark.
  • Familiarity with Airflow for workflow management and Docker for containerization.

Responsibilities

  • Architect and operate batch/stream data pipelines for ETL/ELT.
  • Model and manage data schemas, ensuring quality and governance.
  • Support ML workflows using DVC and MLflow or Weights & Biases.

Skills

Programming in C or Java
SQL & NoSQL
Pandas/NumPy
PySpark
Airflow
API development
Docker

Education

Proven experience building reliable pipelines in production

Tools

AWS SageMaker
Azure ML
GCP AI
DVC
MLflow
Weights & Biases

Job description

Role Summary

Build robust, observable data pipelines that power research and production AI. Success means high pipeline reliability (on-time SLAs), strong data quality (validation & lineage), and enabling fast experimentation. You will partner with AI/ML, analytics, and product teams to make data trustworthy and usable.

Responsibilities

  • Architect and operate batch/stream pipelines (Airflow; Spark optional) for ETL/ELT; a brief illustrative sketch follows this list.
  • Model/manage schemas; enforce data quality and lineage/governance.
  • Support ML workflows with DVC (data versioning) and MLflow or Weights & Biases.
  • Build feature stores/data services; expose datasets via secure REST endpoints.
  • Optimize performance/cost across storage/compute; implement monitoring/alerting.
  • Maintain documentation and internal catalogs; enable self-service analytics.
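
For context on the day-to-day work, here is a minimal, purely illustrative sketch of the kind of daily batch ETL DAG described above. It assumes Airflow 2.4+, and every task name, record, and value is a hypothetical placeholder, not part of this role's actual stack:

    # Minimal daily batch ETL DAG sketch (Airflow 2.4+); all names/data are hypothetical.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract(**_):
        # Placeholder: pull raw records from a source system.
        return [{"id": 1, "value": 42}, {"id": 2, "value": None}]

    def transform(ti, **_):
        # Placeholder: basic validation of the rows the extract task returned.
        rows = ti.xcom_pull(task_ids="extract")
        return [r for r in rows if r["value"] is not None]

    def load(ti, **_):
        # Placeholder: write the cleaned rows to the warehouse.
        print(ti.xcom_pull(task_ids="transform"))

    with DAG(
        dag_id="example_daily_etl",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",  # the "schedule" argument requires Airflow 2.4+
        catchup=False,
    ) as dag:
        t_extract = PythonOperator(task_id="extract", python_callable=extract)
        t_transform = PythonOperator(task_id="transform", python_callable=transform)
        t_load = PythonOperator(task_id="load", python_callable=load)

        t_extract >> t_transform >> t_load

Airflow schedules, retries, and monitors this extract >> transform >> load chain, which is how the on-time SLA and data-quality goals above are enforced in practice.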

Qualifications

  • Skills: Programming in C or Java; SQL & NoSQL; Pandas/NumPy; PySpark; Airflow; API development; Docker.
  • MLOps: DVC; MLflow or W&B; model packaging/deployment fundamentals (see the sketch after this list).
  • Cloud: AWS SageMaker, Azure ML, or GCP AI experience.
  • Nice to have: Unreal Engine exposure.
  • Environment: Solid Linux background for development and deployment.
  • Education/Experience: Proven experience building reliable pipelines in production.
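
For the MLflow tracking named in the MLOps bullet above, a minimal, purely illustrative sketch (every parameter and metric name below is hypothetical):

    # Minimal MLflow experiment-tracking sketch; names and values are hypothetical.
    import mlflow

    with mlflow.start_run(run_name="example-etl-validation"):
        mlflow.log_param("source_table", "events_raw")   # which dataset this run read
        mlflow.log_metric("rows_processed", 120000)      # pipeline throughput
        mlflow.log_metric("null_rate", 0.002)            # data-quality check result

DVC plays the complementary role here, versioning the datasets such runs consume.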