Data Engineer

Capgemini Engineering

Al Khobar

Hybrid

SAR 150,000 - 200,000

Full time

15 days ago

Job summary

A global engineering services leader is seeking an experienced Data Engineer in Al Khobar, Saudi Arabia. In this role, you will design and optimize scalable data infrastructure, develop ETL workflows, and integrate diverse data sources. The ideal candidate has over 5 years of experience, expertise in Apache Spark, and knowledge of cloud services. Join a dynamic team with flexible work arrangements and opportunities for growth.

Benefits

Flexible work arrangements
Career growth programs
Access to certifications

Qualifications

  • 5+ years of experience in data engineering and distributed systems.
  • Experience developing data APIs and working with MLOps tools.
  • Familiarity with data governance frameworks.

Responsibilities

  • Design and maintain data pipelines for structured and unstructured data.
  • Integrate diverse data sources (APIs, databases, streams, flat files).
  • Ensure compliance with data privacy standards (PII, GDPR, HIPAA).

Skills

Expertise in Apache Spark
Strong skills in SQL
Hands-on experience with cloud services
Proficiency in data formats like Parquet
Experience with Docker

Education

Bachelor’s or Master’s in Computer Science or related field

Tools

Apache Kafka
Airflow
Docker

Job description

Get the future you want!

At Capgemini Engineering, the world leader in engineering services, we bring together a global team of engineers, scientists, and architects to help the world’s most innovative companies unleash their potential. From autonomous cars to life-saving robots, our digital and software technology experts think outside the box as they provide unique R&D and engineering services across all industries. Join us for a career full of opportunities, where you can make a difference and where no two days are the same.

Your Role

We are looking for a passionate and experienced Data Engineer to join our growing team. In this role, you will design, build, and optimize scalable data infrastructure that powers intelligent decision-making across industries. You’ll work with cutting-edge technologies to integrate diverse data sources, build real-time and batch pipelines, and ensure data quality, governance, and performance. You’ll collaborate with cross-functional teams to deliver robust, secure, and high-performance data solutions that drive innovation and business value.

Key Responsibilities
  • Design and maintain data pipelines for structured, semi-structured, and unstructured data
  • Optimize Apache Spark for distributed processing and scalability
  • Manage data lakes and implement Delta Lake for ACID compliance and lineage
  • Integrate diverse data sources (APIs, databases, streams, flat files)
  • Build real-time streaming pipelines using Apache Kafka
  • Automate workflows using Airflow and containerize solutions with Docker
  • Leverage cloud platforms (AWS, Azure, GCP) for scalable infrastructure
  • Develop ETL workflows to transform raw data into actionable insights
  • Ensure compliance with data privacy standards (PII, GDPR, HIPAA)
  • Build APIs to serve processed data to downstream systems
  • Implement CI/CD pipelines and observability tools (Prometheus, Grafana, Datadog)
Your Profile
  • Bachelor’s or Master’s in Computer Science, Data Engineering, or related field
  • 5+ years of experience in data engineering and distributed systems
  • Expertise in Apache Spark and Delta Lake
  • Hands‑on experience with cloud services (AWS, Azure, GCP)
  • Strong skills in SQL and NoSQL databases (PostgreSQL, MongoDB, Cassandra)
  • Proficiency in data formats like Parquet, Avro, JSON, XML
  • Experience with Airflow, Docker, and CI/CD pipelines
  • Familiarity with data governance and compliance frameworks
  • Strong understanding of data quality, lineage, and error handling
  • Experience developing data APIs and working with MLOps tools
Preferred Skills
  • Experience with Kubernetes for container orchestration
  • Knowledge of data warehouses (Snowflake, Redshift, Synapse)
  • Familiarity with real‑time analytics platforms (Flink, Druid, ClickHouse)
  • Exposure to machine learning pipelines and IoT data integration
  • Understanding of graph databases (Neo4j) and data cataloging tools (Apache Atlas, Alation)
  • Experience with data versioning tools like DVC
What You’ll Love About Working Here
  • Flexible work arrangements including remote options and flexible hours
  • Career growth programs and diverse opportunities to help you thrive
  • Access to certifications in the latest technologies and platforms
About Capgemini

Capgemini is a global leader in partnering with companies to transform and manage their business by harnessing the power of technology. The Group is guided every day by its purpose of unleashing human energy through technology for an inclusive and sustainable future. It is a responsible and diverse organization of over 360,000 team members in more than 50 countries. With its strong 55-year heritage and deep industry expertise, Capgemini is trusted by its clients to address the entire breadth of their business needs, from strategy and design to operations, fueled by the fast-evolving and innovative world of cloud, data, AI, connectivity, software, digital engineering, and platforms. In 2022, the Group reported global revenues of €22 billion.

Apply now!
