We are looking for a Data Engineer to design and implement large-scale, real-time data architectures that power AI, predictive maintenance, and industrial automation in mission-critical environments. In this role, you will ensure the high availability, accuracy, and efficiency of real-time industrial data processing.
Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
- 5+ years of experience in real-time data engineering, cloud computing, or industrial IoT.
- Strong expertise in data streaming, event-driven processing, and distributed computing frameworks.
- Hands-on experience applying big data best practices with Kafka, Apache Flink, Spark Streaming, or similar technologies.
- Deep knowledge of ETL design, data warehousing, and real-time analytics.
- Experience with SQL and NoSQL databases, including TimescaleDB, InfluxDB, Cassandra, or MongoDB.
- Proficiency in Python, Scala, or Java for data processing and automation.
- Understanding of IoT protocols (MQTT, OPC UA, Modbus) and industrial data standards.
- Familiarity with DevOps and MLOps best practices, including CI/CD pipelines for data workflows.
- Knowledge of cloud-native data solutions (AWS Kinesis, Google Pub/Sub, Azure Event Hubs).
- Familiarity with workflow orchestration platforms like Apache Airflow.
Preferred Qualifications:
- Experience in mining, oil & gas, or large-scale industrial automation projects.
- Knowledge of machine learning model deployment in real-time production environments.
- Understanding of GIS data processing, geospatial analytics, and digital twin integrations.
- Experience with cybersecurity frameworks for industrial data environments.
Soft Skills:
- Strong problem-solving mindset and the ability to optimize large-scale data architectures.
- Ability to collaborate across teams, including AI, cloud, and industrial operations.
- Effective communication skills for translating technical data insights into actionable business strategies.
- Passion for real-time data processing, AI-driven automation, and industrial innovation.
Responsibilities:
- Design and implement real-time data pipelines for ingesting, processing, and visualizing industrial sensor data.
- Develop low-latency, high-throughput streaming architectures using Kafka, Apache Flink, or Spark Streaming.
- Build and optimize secure, scalable data lakes and ETL pipelines for mining, oil & gas, and heavy industry applications.
- Implement event-driven architectures and data transformation workflows for AI-powered automation.
- Ensure data governance, security, and compliance with applicable cybersecurity standards and data protection regulations (IEC 62443, ISO 27001, GDPR).
- Work with AI engineers, cloud architects, and automation teams to enable real-time AI decision-making.
- Optimize time-series data processing using TimescaleDB, InfluxDB, or OpenTSDB.
- Deploy cloud-based data solutions with AWS Kinesis, Google Pub/Sub, Snowflake, or Grafana.
- Monitor data pipeline performance with modern observability frameworks (e.g., Prometheus) to ensure fault tolerance, scalability, versioning, and efficiency.
- Collaborate with industrial automation and IoT teams to integrate edge-to-cloud data flows.