Data Engineer

DeepLight AI

Abu Dhabi

On-site

AED 120,000 - 200,000

Full time

Job summary

A specialist AI consultancy is seeking a skilled Data Engineer to design and optimize scalable data systems. The successful candidate will work with Data Lakehouse architectures and play a vital role in automating the machine learning lifecycle while ensuring high data quality. This position offers a competitive salary, comprehensive health insurance, and numerous opportunities for professional growth on AI projects.

Qualifications

  • Proven experience with Data Lakehouse architectures.
  • Strong experience with big data technologies like Spark and Kafka.
  • Experience in implementing MLOps pipelines for production systems.

Responsibilities

  • Design and optimize scalable data solutions using Lakehouse architecture.
  • Maintain data pipelines for diverse datasets and ensure high data quality.
  • Manage monitoring systems to track model performance metrics.

Skills

Data Lakehouse architectures
Data ingestion
ETL/ELT processes
MLOps principles
Cloud platforms (AWS, Azure, GCP)
SQL
Python
Analytical skills

Tools

Databricks
Delta Lake
Spark
Kafka
Docker
Kubernetes
MLflow

Job description
Overview

DeepLight AI is a specialist AI and data consultancy with extensive experience implementing intelligent enterprise systems across multiple industries, with particular depth in financial services and banking. Our team combines deep expertise in data science, statistical modeling, AI/ML technologies, workflow automation, and systems integration with a practical understanding of complex business operations.

Responsibilities
  • Design, build, and optimize scalable data solutions, primarily utilizing the Lakehouse architecture to unify data warehousing and data lake capabilities. Advise stakeholders on the strategic choice between Data Warehouse, Data Lake, and Lakehouse architectures based on specific business needs, cost, and latency requirements.
  • Design, develop, and maintain scalable and reliable data pipelines to ingest, transform, and load diverse datasets from various sources, including structured and unstructured data, streaming data, and real-time feeds.
  • Implement standards and tooling to ensure ACID properties, schema evolution, and high data quality within the Lakehouse environment (see the sketch after this list). Implement robust data governance frameworks (security, privacy, integrity, compliance, auditing).
  • Continuously optimize data storage, compute resources, and query performance across the data platform to reduce costs and improve latency for both BI and ML workloads, leveraging techniques such as indexing, partitioning, and parallel processing.
  • Develop and maintain CI/CD pipelines to automate the entire machine learning lifecycle, from data validation and model training to deployment and infrastructure provisioning.
  • Deploy, manage, and scale machine learning models into production environments, utilizing MLOps principles for reliable and repeatable operations.
  • Establish and manage monitoring systems to track model performance metrics, detect data drift (changes in input data), and detect model decay (degradation in prediction accuracy).
  • Ensure rigorous version control and tracking for all components: code, datasets, and trained model artifacts (using tools like MLflow or similar).
  • Create comprehensive documentation, including technical specifications, data flow diagrams, and operational procedures, to facilitate understanding, collaboration, and knowledge sharing.
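
To ground the Lakehouse duties above, here is a minimal PySpark sketch of an ACID append into a Delta Lake table with additive schema evolution enabled. The bucket paths, partition column, and application name are hypothetical placeholders, not details from this posting:

    # Minimal sketch (hypothetical paths/names): batch ingestion into a
    # Delta Lake table with additive schema evolution enabled.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName("lakehouse-ingest")
        # Enable Delta Lake (assumes the delta-spark package is installed)
        .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
        .config("spark.sql.catalog.spark_catalog",
                "org.apache.spark.sql.delta.catalog.DeltaCatalog")
        .getOrCreate()
    )

    incoming = spark.read.json("s3://example-bucket/raw/events/")  # hypothetical source

    (
        incoming.write.format("delta")
        .mode("append")                  # ACID append; readers never see partial writes
        .option("mergeSchema", "true")   # allow new columns to evolve the table schema
        .partitionBy("event_date")       # hypothetical partition column for scan pruning
        .save("s3://example-bucket/lakehouse/events")  # hypothetical table path
    )

On Databricks the Delta configuration shown here is typically preset; the mergeSchema option is what permits additive schema evolution while Delta's transaction log preserves ACID semantics.
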
Requirements
  • Proven practical experience in designing, building, and optimizing solutions using Data Lakehouse architectures (e.g., Databricks, Delta Lake).
  • Strong hands-on experience managing data ingestion, schema enforcement, and ACID properties, and utilizing big data technologies/frameworks such as Spark and Kafka.
  • Expertise in data modeling, ETL/ELT processes, and data warehousing concepts. Proficiency in SQL and scripting languages (e.g., Python, Scala).
  • Demonstrated practical experience implementing MLOps pipelines for production systems. This includes a solid understanding and implementation experience with MLOps principles: automation, governance, and monitoring of ML models throughout the entire lifecycle.
  • Experience with CI/CD tools, containerization/orchestration technologies (e.g., Docker, Kubernetes), model serving frameworks (e.g., TensorFlow Serving, SageMaker), and experiment tracking (e.g., MLflow); a minimal MLflow example follows this list.
  • Experience with production monitoring tools to detect data drift or model decay.
  • Strong hands-on experience with major cloud platforms (e.g., AWS, Azure, GCP) and familiarity with DevOps practices.
  • Excellent analytical, problem-solving, and communication skills, with the ability to translate complex technical concepts into clear and actionable insights.
  • Proven ability to work effectively in a fast-paced, collaborative environment, with a passion for innovation and continuous learning.
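
As one illustration of the experiment-tracking requirement above, the following minimal MLflow sketch logs parameters, a metric, and a model artifact for a single run. The experiment name, model choice, and synthetic data are illustrative stand-ins rather than anything specified by this role:

    # Minimal sketch: logging parameters, a metric, and a model artifact
    # to MLflow so the run is versioned and reproducible.
    import mlflow
    import mlflow.sklearn
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    mlflow.set_experiment("demo-classifier")  # hypothetical experiment name
    with mlflow.start_run():
        model = RandomForestClassifier(n_estimators=100, random_state=42)
        model.fit(X_train, y_train)
        accuracy = accuracy_score(y_test, model.predict(X_test))

        mlflow.log_param("n_estimators", 100)
        mlflow.log_metric("accuracy", accuracy)
        # Store the trained model as a versioned artifact of this run
        mlflow.sklearn.log_model(model, "model")
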
Benefits
  • Competitive salary and performance bonuses
  • Comprehensive health insurance
  • Professional development and certification support
  • Opportunity to work on cutting-edge AI projects
  • Flexible working arrangements
  • Career advancement opportunities in a rapidly growing AI company

This position offers a unique opportunity to shape the future of AI implementation while working with a talented team of professionals at the forefront of technological innovation. The successful candidate will play a crucial role in driving our company's success in delivering transformative AI solutions to our clients.
