Data Engineer

Mi-C3 International Ltd

Pretoria

On-site

ZAR 600 000 - 800 000

Full time


Job summary

A data solutions company in Pretoria is seeking a Data Integration Engineer to design and implement data integration solutions for real-time streaming data. The successful candidate will work with Big Data technologies like Apache NiFi and Kafka, develop custom applications using Java and Python, and integrate IoT/IIoT protocols. Strong analytical skills and experience with data quality assurance are essential.

Qualifications

  • Strong experience in designing and implementing data integration solutions for real-time streaming data.
  • Proficiency in using Big Data technologies such as Apache NiFi, Apache Spark, and Kafka.
  • Hands-on experience with IoT/IIoT protocols such as MQTT, SNMP, CoAP, TCP, and WebSockets.

Responsibilities

  • Collaborate with cross-functional teams to design efficient data integration pipelines.
  • Develop components using Java and Python for specific business requirements.
  • Implement ETL processes for streaming data from various sources.

Skills

Data integration design
Java programming
Python programming
Real-time data processing
Big Data technologies
IoT/IIoT protocols
Troubleshooting

Tools

Apache NiFi
Apache Spark
Kafka
RabbitMQ

Job description

As a Data Integration Engineer, you will be responsible for designing, implementing, and maintaining data integration solutions to handle real‑time streaming data from various sources like IoT / IIoT protocols, third‑party APIs, or even raw files.

Your main objective will be to process data in real‑time and provide valuable insights for our organization.

You will work with a diverse range of Big Data tools and technologies.

The successful candidate will have experience in embedded systems bring‑up, requirements engineering management, systems integration, and developing programs that drive HW and SW planning while articulating the big picture.

Additionally, you will be involved in the development of a Data Streaming platform using NiFi.

Responsibilities
  • Data Integration Design: Collaborate with cross‑functional teams to understand data requirements, source systems, and data formats, and design efficient data integration pipelines for real‑time data streaming from multiple sources.
  • Programming Languages: Develop custom data processing components and applications using Java and Python to meet specific business requirements.
  • ETL Development: Implement Extract, Transform, Load (ETL) processes to ingest and transform data from various streaming sources into a format suitable for analysis and storage.
  • Real‑time Data Processing: Develop and optimize data processing workflows to ensure timely handling of streaming data, maintaining low‑latency and high‑throughput capabilities.
  • Big Data Tools: Utilize and maintain various Big Data tools such as Apache NiFi, Spark, Kafka, etc. to build scalable and robust data integration solutions.
  • Message Broker Configuration: Set up and configure message brokers such as RabbitMQ and Kafka, along with messaging protocols such as AMQP, to enable efficient data exchange between different systems and applications.
  • IoT / IIoT Protocols Integration: Integrate and work with IoT / IIoT protocols such as MQTT, SNMP, CoAP, TCP, and WebSockets to capture data from edge devices and industrial systems.
  • Data Quality and Validation: Implement data validation checks and data quality measures to ensure the accuracy and reliability of the integrated data.
  • Performance Monitoring: Monitor the performance and health of data integration pipelines, making necessary adjustments to optimize data flow and resource utilization.
  • Troubleshooting and Issue Resolution: Diagnose and resolve issues related to data integration, ensuring smooth and uninterrupted data streaming.
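The validate‑and‑transform step described in the responsibilities above might look like the following minimal Python sketch. The field names (`device_id`, `timestamp`, `value`) are illustrative only, not part of the job spec:

```python
from datetime import datetime, timezone

def validate_and_transform(record: dict):
    """Validate one streaming record and normalise it for storage.

    Returns a cleaned dict, or None if the record fails validation.
    Field names here are hypothetical examples.
    """
    required = ("device_id", "timestamp", "value")
    if any(key not in record for key in required):
        return None  # reject incomplete records
    try:
        value = float(record["value"])
        # Normalise epoch seconds to a timezone-aware UTC timestamp
        ts = datetime.fromtimestamp(int(record["timestamp"]), tz=timezone.utc)
    except (TypeError, ValueError):
        return None  # reject malformed fields
    return {
        "device_id": str(record["device_id"]),
        "event_time": ts.isoformat(),
        "value": value,
    }
```

In a real pipeline this logic would typically run inside a NiFi processor or a Kafka consumer, with rejected records routed to a dead‑letter queue rather than silently dropped.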
Technical Requirements
  • Strong experience in designing and implementing data integration solutions for real‑time streaming data.
  • Proficiency in using Big Data technologies such as Apache NiFi, Apache Spark and Kafka.
  • Familiarity with message brokers like RabbitMQ and Kafka, and messaging protocols such as AMQP, for data exchange and event‑driven architectures.
  • Hands‑on experience with IoT / IIoT protocols such as MQTT, SNMP, CoAP, TCP, and WebSockets.
  • Proficiency in programming languages such as Java and Python for developing custom data processing components.
  • Knowledge of data quality assurance and validation techniques to ensure reliable data.
  • Ability to troubleshoot and resolve issues related to data integration and streaming processes.
  • Strong analytical and problem‑solving skills, with a keen eye for detail.
  • Excellent communication and teamwork skills to collaborate effectively with cross‑functional teams.
  • Experience with cloud‑based platforms and distributed systems is advantageous.
  • A curious mindset and an eagerness to learn and work with new tools and technologies.
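As a flavour of the MQTT work mentioned in the requirements, here is a simplified sketch of MQTT topic‑filter matching, where `+` matches exactly one topic level and `#` matches all remaining levels (topic names are made up; edge cases such as a `#` filter matching its parent level are omitted):

```python
def topic_matches(filter_: str, topic: str) -> bool:
    """Check an MQTT topic against a subscription filter (simplified).

    '+' matches exactly one level; '#' matches the rest of the topic.
    """
    f_parts = filter_.split("/")
    t_parts = topic.split("/")
    for i, part in enumerate(f_parts):
        if part == "#":
            return True  # multi-level wildcard swallows the rest
        if i >= len(t_parts):
            return False  # topic is shorter than the filter
        if part != "+" and part != t_parts[i]:
            return False  # literal level mismatch
    return len(f_parts) == len(t_parts)
```

In practice a client library such as paho‑mqtt handles this matching; the sketch just shows the level‑by‑level semantics an integration engineer relies on when designing topic hierarchies.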