Job Description
The Data Engineer will be responsible for designing, developing, and maintaining our centralised data repository, as well as all data streams in and out of it. They will provide strategic and technical guidance for all measurement, digital transformation and innovation.
They are responsible for leading the Department of Programs and Technical Services (DPTS) in producing and making available quality basic and disaggregated data, and in strengthening information systems and digital transformation for health as a fundamental pillar. They will guide the data analysts, data scientists, and Strategic Information (SI) team in the ethical use of data to produce health intelligence through analysis, modelling, forecasting, and data science that informs decisions and actions in m2m’s programmes.
This position is based at m2m’s Head Office in Cape Town, is part of the Data Analytics Unit (a sub-unit within DPTS), and reports to the Data Analytics Lead.
Key Performance Areas
- Design and build data pipelines for multiple health information systems including CommCare, DHIS2, VMMP, and CHARM.
- Implement robust ETL processes for data extraction, transformation, and loading.
- Ensure all client-level data is stored in GCP in clean, structured, SQL-queryable formats.
- Automate pipeline processes to minimize manual intervention and improve reliability.
- Monitor pipeline performance and troubleshoot issues as they arise.
Cloud Infrastructure Management & Optimization
- Review and optimize GCP environment architecture for cost-effectiveness and performance.
- Configure and maintain user roles, permissions, and security protocols within cloud platforms.
- Implement cloud data storage, processing, and backup strategies.
- Monitor resource utilization and implement cost optimization measures.
- Ensure compliance with data security and privacy requirements.
System Integration & Automation
- Develop and maintain scripts for seamless data integration between multiple systems.
- Create and optimize data integration workflows between various health information platforms.
- Ensure data consistency, quality, and integrity across integrated systems.
- Troubleshoot integration issues and implement corrective measures.
- Build applications that facilitate client-server communication via network protocols.
Database Management & Reporting Support
- Create and maintain databases for clean client-level and aggregated data.
- Automate data loading processes and trait generation.
- Support business intelligence tool implementation and reporting process optimization.
- Collaborate on dashboard development and reporting enhancements.
- Ensure database performance optimization and maintenance using SQL and related technologies.
Documentation & Knowledge Management
- Develop comprehensive technical documentation for all data processes, scripts, and workflows.
- Create and maintain ETL pipeline architecture documentation and transformation logic.
- Document automation procedures, scheduling, and maintenance requirements.
- Maintain cloud environment configuration and troubleshooting guides.
- Provide technical training and knowledge transfer to team members.