Sr. Data Engineer – Industry 4.0

Cognizant

Greater London

On-site

GBP 70,000 - 90,000

Full time


Job summary

A leading technology consultancy is seeking a Senior Data Engineer to develop scalable data platforms for Industry 4.0 initiatives. This role involves integrating data across OT and IT systems, managing data lakes, and ensuring robust governance frameworks. The ideal candidate will have strong experience with AWS or Azure, proficiency in Python and SQL, and a solid understanding of data governance and quality. The role also supports innovative AI/ML projects in manufacturing environments.

Qualifications

  • Cloud experience with AWS and/or Azure, particularly with data services.
  • Strong programming skills in Python and SQL for data manipulation.
  • Expertise in ETL/ELT processes and data streaming technologies.

Responsibilities

  • Architect and implement cloud-native data pipelines on AWS or Azure.
  • Integrate data from OT and IT systems using various communication protocols.
  • Design and manage data lakes, warehouses, and analytics platforms.
  • Implement governance policies and ensure data quality compliance.

Skills

  • Cloud Platforms: Deep experience with AWS and/or Azure
  • Proficiency in Python, SQL, and PySpark
  • Expertise in ETL/ELT and streaming technologies
  • Familiarity with OT data schemas
  • Experience in defining semantic layers
  • Hands-on experience with data governance frameworks
  • Ability to implement data quality frameworks
  • Knowledge of data privacy and access control
  • Experience defining and creating MLOps pipelines

Tools

  • Apache Airflow
  • Kafka
  • Informatica
  • Power BI
  • Terraform

Job description

Sr. Data Engineer – Industry 4.0

We are hiring a Senior Data Engineer to lead the development of intelligent, scalable data platforms for Industry 4.0 initiatives. This role will drive integration across OT/IT systems, enable real-time analytics, and ensure robust data governance and quality frameworks. The engineer will collaborate with cross-functional teams to support AI/ML, GenAI, and IIoT use cases in manufacturing and industrial environments.

Key Responsibilities
  • Architect and implement cloud-native data pipelines on AWS or Azure for ingesting, transforming, and storing industrial data.
  • Integrate data from OT systems (SCADA, PLC, MES, Historian) and IT systems (ERP, CRM, LIMS) using protocols such as OPC UA, MQTT, and REST (see the ingestion sketch after this list).
  • Design and manage data lakes, warehouses, and streaming platforms for predictive analytics, digital twins, and operational intelligence.
  • Define and maintain asset hierarchies, semantic models, and metadata frameworks for contextualized industrial data (see the hierarchy sketch after this list).
  • Implement CI/CD pipelines for data workflows and ensure lineage, observability, and compliance across environments.
  • Collaborate with AI/ML teams to support model training, deployment, and monitoring using MLOps frameworks.
  • Establish and enforce data governance policies, stewardship models, and metadata management practices.
  • Monitor and improve data quality using rule‑based profiling, anomaly detection, and GenAI‑powered automation.
  • Support GenAI initiatives through data readiness, synthetic data generation, and prompt engineering.
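
To make the OT/IT ingestion responsibility concrete, here is a minimal Python sketch of an MQTT subscriber that could sit at the front of such a pipeline. The broker host and topic (broker.plant.local, plant/line1/#) are hypothetical placeholders, the client uses the classic paho-mqtt 1.x callback style, and a production version would forward records to a cloud landing zone (e.g. S3 or Kinesis) rather than print them.

    import json
    import paho.mqtt.client as mqtt

    BROKER = "broker.plant.local"   # hypothetical broker host
    TOPIC = "plant/line1/#"         # hypothetical UNS-style topic

    def on_message(client, userdata, msg):
        # Decode the sensor payload; a real pipeline would validate the
        # record and forward it to a cloud landing zone instead.
        record = json.loads(msg.payload.decode("utf-8"))
        print(msg.topic, record)

    client = mqtt.Client()          # classic paho-mqtt 1.x callback API
    client.on_message = on_message
    client.connect(BROKER, 1883)
    client.subscribe(TOPIC)
    client.loop_forever()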
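
As a sketch of what asset hierarchies and contextual models can look like in code, the snippet below builds an ISA-95-style asset tree and walks it to print Unified Namespace-like paths. The Asset class and all plant names are illustrative assumptions, not a prescribed model.

    from dataclasses import dataclass, field

    @dataclass
    class Asset:
        # One node in an ISA-95-style hierarchy: site -> area -> line -> equipment.
        name: str
        level: str
        children: list["Asset"] = field(default_factory=list)

        def add(self, child: "Asset") -> "Asset":
            self.children.append(child)
            return child

    def walk(asset: Asset, prefix: str = "") -> None:
        # Print UNS-like paths, e.g. /plant-01/packaging/line-03/press-7.
        path = f"{prefix}/{asset.name}"
        print(f"{path}  ({asset.level})")
        for child in asset.children:
            walk(child, path)

    site = Asset("plant-01", "site")              # names are illustrative
    area = site.add(Asset("packaging", "area"))
    line = area.add(Asset("line-03", "line"))
    line.add(Asset("press-7", "equipment"))
    walk(site)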

Mandatory Skills
  • Cloud Platforms: Deep experience with AWS (S3, Lambda, Glue, Redshift) and/or Azure (Data Lake, Synapse).
  • Programming & Scripting: Proficiency in Python, SQL, and PySpark.
  • ETL/ELT & Streaming: Expertise in technologies such as Apache Airflow, Glue, Kafka, Informatica, and EventBridge (a minimal DAG sketch follows this list).
  • Industrial Data Integration: Familiarity with OT data schemas originating from OSIsoft PI, SCADA, MES, and Historian systems.
  • Information Modeling: Experience in defining semantic layers, asset hierarchies, and contextual models.
  • Data Governance: Hands-on experience with data governance frameworks.
  • Data Quality: Ability to implement profiling, cleansing, standardization, and anomaly detection frameworks (see the profiling sketch after this list).
  • Security & Compliance: Knowledge of data privacy, access control, and secure data exchange protocols.
  • MLOps: Experience defining and creating MLOps pipelines.
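
For the ETL/ELT orchestration skill above, a minimal Airflow (2.4+) sketch of a daily extract-transform-load DAG might look like the following; the dag_id and the empty task bodies are placeholder assumptions.

    from datetime import datetime
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        # Placeholder: pull yesterday's historian records into staging.
        pass

    def transform():
        # Placeholder: cleanse and contextualize against the asset model.
        pass

    def load():
        # Placeholder: write curated records to the warehouse.
        pass

    with DAG(
        dag_id="historian_daily_etl",   # hypothetical name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        transform_task = PythonOperator(task_id="transform", python_callable=transform)
        load_task = PythonOperator(task_id="load", python_callable=load)
        extract_task >> transform_task >> load_task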
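
Likewise, the rule-based profiling named under Data Quality can be sketched in a few lines of pandas; the column names, validity ranges, and sample values below are invented for illustration.

    import pandas as pd

    # Hypothetical validity rules per sensor column.
    RULES = {
        "temperature_c": lambda s: s.between(-40, 150),
        "pressure_kpa": lambda s: s.gt(0),
    }

    def profile(df: pd.DataFrame) -> dict:
        # Report null rate and rule-violation rate per monitored column.
        report = {}
        for col, rule in RULES.items():
            non_null = df[col].notna()
            violations = non_null & ~rule(df[col])
            report[col] = {
                "null_pct": round(df[col].isna().mean() * 100, 1),
                "violation_pct": round(violations.mean() * 100, 1),
            }
        return report

    df = pd.DataFrame({
        "temperature_c": [21.5, None, 999.0],   # 999.0 violates the range rule
        "pressure_kpa": [101.3, 98.7, -5.0],    # -5.0 violates the positivity rule
    })
    print(profile(df))

Running this prints per-column null and violation percentages, the kind of signal a fuller framework would feed into cleansing and anomaly-detection steps.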

Good to Have Skills
  • GenAI Exposure: Experience with LLMs, LangChain, HuggingFace, synthetic data generation, and prompt engineering (see the snippet after this list).
  • Digital Twin Integration: Familiarity with NVIDIA Omniverse, AWS IoT TwinMaker, Azure Digital Twins, or similar platforms and concepts.
  • Visualization Tools: Power BI, Grafana, or custom dashboards for operational insights.
  • DevOps & Automation: CI/CD tools (Jenkins, GitHub Actions), infrastructure‑as‑code (Terraform, CloudFormation).
  • Industry Standards: ISA‑95, Unified Namespace (UNS), FAIR data principles, and DataOps methodologies.
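
As one concrete flavor of the GenAI exposure above, the snippet below uses the Hugging Face transformers pipeline API to draft a continuation of a downtime note; the model (distilgpt2) and prompt are purely illustrative, and a governed, properly evaluated model would be required in practice.

    from transformers import pipeline

    # Small open model chosen only to keep the example runnable;
    # not a recommendation for production use.
    generator = pipeline("text-generation", model="distilgpt2")

    prompt = "Summarize the downtime event: motor overheated on line 3 at 02:14."
    result = generator(prompt, max_new_tokens=40, do_sample=False)
    print(result[0]["generated_text"])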