You will be designing, developing, testing, and documenting the data collection framework.
The data collection consists of complex data pipelines that carry data from IoT sensors and low- and high-level control components to our Data Science platform.
You will build a monitoring solution for these data pipelines that enables data quality improvement.
You will develop scalable data pipelines to transform and aggregate data for business use, following software engineering best practices.
For these data pipelines, you will use industry-standard data processing frameworks and tools such as Spark and Splunk.
You will develop our data services for customer sites into a product, using test and deployment automation, componentization, templates, and standardization to reduce the delivery time of our customer projects.
The product provides insights into the performance of our material handling systems at customer sites around the globe.
You will design and build a CI/CD pipeline, including integration test automation for the data pipelines.
In this process, you will strive for an ever-increasing degree of automation.
You will work with an infrastructure engineer to extend storage capabilities and types of data collection (e.g. streaming).
You have experience in developing APIs.
You will coach and train the junior data engineer in state-of-the-art big data technologies.
What do we expect from you?
Bachelor's or Master's degree in computer science, IT, or an equivalent field, with at least 7 years of relevant work experience.
Programming experience in Python, Scala, or Java.
Experience with scalable data processing frameworks (e.g., Spark).
Familiarity with event processing tools like Splunk or the ELK stack.
Experience in deploying services as containers (e.g., Docker and Kubernetes).
Knowledge of streaming and/or batch storage (e.g., Kafka, Oracle).
Experience working with cloud services (preferably Azure).