Role Purpose
The Data Engineer is responsible for designing, building, and maintaining scalable data pipelines, integrations, and data warehouse structures that support organizational analytics and reporting. The role ensures high-quality, reliable, and secure data flow across systems, enabling business units to make data-driven decisions aligned with Kun Sports’ strategic objectives.
Key Accountabilities & Activities
Data Pipeline Development & Management
- Design, build, and maintain scalable ETL/ELT pipelines for ingestion, cleaning, transformation, and loading of structured and unstructured data.
- Build data workflows and supporting services in Python, using tools such as FastAPI, Redis, and RabbitMQ.
- Automate and orchestrate data workflows using Airflow, dbt, Prefect, or cloud-native schedulers.
- Implement automated job monitoring, error logging, and alerting mechanisms; a minimal Airflow sketch covering both duties follows this list.
- Optimize pipeline performance to reduce latency, processing time, and resource consumption.
- Develop real-time streaming pipelines using Kafka, Kinesis, or similar technologies.
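To make the orchestration, retry, and alerting duties above concrete, here is a minimal sketch of a daily batch DAG in Airflow. The DAG id, task bodies, and alert hook are hypothetical placeholders, not an actual Kun Sports pipeline.

```python
# Minimal Airflow ETL sketch: three chained tasks with retries and a
# failure-alert callback. All names are illustrative.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def notify_on_failure(context):
    # Placeholder alerting hook; in practice this would post to Slack,
    # PagerDuty, or email via an Airflow provider package.
    print(f"Task {context['task_instance'].task_id} failed")


def extract():
    ...  # pull raw records from the source system

def transform():
    ...  # clean and reshape records

def load():
    ...  # write results to the warehouse

with DAG(
    dag_id="sales_daily_etl",           # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                  # `schedule_interval` on older Airflow
    catchup=False,
    default_args={
        "retries": 2,
        "retry_delay": timedelta(minutes=5),
        "on_failure_callback": notify_on_failure,  # automated alerting
    },
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t3 = PythonOperator(task_id="load", python_callable=load)
    t1 >> t2 >> t3
```

In practice the failure callback would route to a real alerting channel rather than print, and the task bodies would call the actual extract/transform/load logic.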
Data Warehouse & Database Architecture
- Build and maintain enterprise data warehouse schemas (star, snowflake, normalized).
- Model data structures to support dashboards, analytics, and predictive models.
- Set up database indexing, partitioning, and clustering to enhance query performance (illustrated in the sketch after this list).
- Maintain metadata repositories, data catalogs, and lineage tracking.
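As one illustration of the schema and partitioning work above, the following sketch applies star-schema DDL with monthly range partitioning and an index from Python via psycopg2. The table names, columns, and connection string are hypothetical.

```python
# Star-schema DDL sketch for PostgreSQL: one dimension, one partitioned
# fact table, and an index on the common join key. Names are placeholders.
import psycopg2

DDL = """
CREATE TABLE IF NOT EXISTS dim_store (
    store_id   INT PRIMARY KEY,
    store_name TEXT NOT NULL
);

-- Fact table partitioned by month so date-filtered queries prune partitions.
CREATE TABLE IF NOT EXISTS fact_sales (
    sale_id   BIGINT,
    store_id  INT REFERENCES dim_store (store_id),
    sale_date DATE NOT NULL,
    amount    NUMERIC(12, 2)
) PARTITION BY RANGE (sale_date);

CREATE TABLE IF NOT EXISTS fact_sales_2024_01
    PARTITION OF fact_sales
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

-- Index the common join/filter key to speed dashboard queries.
CREATE INDEX IF NOT EXISTS idx_fact_sales_store ON fact_sales (store_id);
"""

with psycopg2.connect("dbname=warehouse") as conn:  # placeholder DSN
    with conn.cursor() as cur:
        cur.execute(DDL)
```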
Data Quality, Governance & Compliance
- Establish validation rules, anomaly detection, and data quality scoring frameworks (see the sketch after this list).
- Ensure compliance with internal data governance, security policies, and data retention rules.
- Implement RBAC (role-based access control) and least-privilege access policies for all environments.
- Monitor and maintain data accuracy, completeness, timeliness, and integrity.
- Document data sources, transformation logic, and quality standards.
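A minimal sketch of rule-based quality checks with a simple scoring roll-up, assuming incoming records land in a pandas DataFrame; the column names and the 99% pass threshold are illustrative.

```python
# Rule-based data quality checks: completeness, validity, and timeliness,
# summarized into a single score. Columns and thresholds are hypothetical.
from datetime import datetime, timedelta

import pandas as pd

def quality_report(df: pd.DataFrame) -> dict:
    checks = {
        # Completeness: key identifiers must not be null.
        "member_id_complete": df["member_id"].notna().mean(),
        # Validity: transaction amounts must be non-negative.
        "amount_valid": (df["amount"] >= 0).mean(),
        # Timeliness: records should be no older than 2 days.
        "fresh": (df["loaded_at"] > datetime.now() - timedelta(days=2)).mean(),
    }
    # Simple quality score: share of checks meeting a 99% pass rate.
    checks["score"] = sum(v >= 0.99 for v in checks.values()) / len(checks)
    return checks

# Tiny synthetic batch to demonstrate the report.
df = pd.DataFrame({
    "member_id": [1, 2, None],
    "amount": [10.0, -5.0, 7.5],
    "loaded_at": [datetime.now()] * 3,
})
print(quality_report(df))
```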
System Integration & API Development
- Develop and maintain robust integrations with HRMS (Menaitech), CRM, ERP, POS, Sentinel, fitness systems, and external vendors.
- Build and maintain RESTful APIs and data endpoints for cross-department usage (a minimal FastAPI sketch follows this list).
- Troubleshoot API failures, authentication issues, and versioning conflicts.
- Create scalable integration architectures for future system expansion.
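Since FastAPI is already part of the stack named above, here is a minimal sketch of a read-only data endpoint; the route, response model, and in-memory data are hypothetical stand-ins for a warehouse-backed service.

```python
# Minimal FastAPI data endpoint with a typed response model and 404 handling.
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="Data Platform API")

class MemberStats(BaseModel):
    member_id: int
    visits_30d: int

# Placeholder store; a real endpoint would query the warehouse.
_FAKE_STATS = {1: MemberStats(member_id=1, visits_30d=12)}

@app.get("/members/{member_id}/stats", response_model=MemberStats)
def member_stats(member_id: int) -> MemberStats:
    stats = _FAKE_STATS.get(member_id)
    if stats is None:
        raise HTTPException(status_code=404, detail="member not found")
    return stats
```

Served with, for example, `uvicorn app:app` if the file is named `app.py`.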
Analytics, Reporting & Business Support
- Enable the BI team with optimized datasets for Power BI, Tableau, Looker, or internal dashboards.
- Support advanced analytics, machine learning pipelines, and predictive modeling.
- Translate business and department requirements into data architecture solutions.
- Build reusable data models for Finance, HR, Operations, Sales, and Marketing.
- Provide deep-dive analysis and root cause investigations for data inconsistencies.
Cloud Environment & Infrastructure
- Manage cloud storage, compute instances, networking, and access permissions (AWS/Azure/GCP).
- Set up CI/CD pipelines for data engineering deployments.
- Manage infrastructure provisioning through Terraform or other infrastructure-as-code (IaC) tooling.
- Ensure cost optimization for cloud-based workloads and data storage (see the sketch after this list).
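One concrete cost-optimization lever, sketched with boto3: an S3 lifecycle rule that moves aged raw data to cheaper storage tiers. The bucket name and prefix are placeholders, and Azure/GCP offer equivalent policies.

```python
# S3 lifecycle rule: transition aged objects under a raw-landing prefix to
# cheaper storage classes. Bucket and prefix are hypothetical.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="kun-sports-data-lake",  # placeholder bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-raw-landing",
                "Status": "Enabled",
                "Filter": {"Prefix": "raw/"},
                "Transitions": [
                    # Infrequently read after 30 days; rarely after 90.
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
            }
        ]
    },
)
```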
Maintenance, Documentation & Continuous Improvement
- Maintain detailed technical documentation for data pipelines, schemas, integration flows, APIs, and architecture.
- Conduct regular system maintenance, patch updates, and optimization.
- Create runbooks for troubleshooting common issues and recovery procedures.
- Conduct capacity planning and scalability assessments.
- Recommend and implement new tools, technologies, and best practices.
Requirements for Role
Experience & Qualifications
- Bachelor’s Degree in Computer Science, Information Systems, Data Engineering, or a related field.
- 4–6 years of experience in data engineering, data warehousing, or backend data infrastructure.
Knowledge & Skills
- Strong SQL proficiency and experience with relational and NoSQL databases (e.g., PostgreSQL, MySQL, MongoDB).
- Hands-on experience with ETL tooling and data orchestration frameworks (Airflow, AWS Glue, dbt, etc.).
- Experience with cloud data environments (AWS, Azure, or GCP).
- Strong proficiency in Python.
- Solid understanding of APIs, data modeling, and data architecture principles.
- Experience connecting BI tools (Power BI, Tableau, Looker) to data platforms.
- Strong troubleshooting, debugging, and problem-solving skills.
- Strong documentation, communication, and stakeholder management skills.