The Data Engineer is responsible for designing, building, and maintaining scalable, reliable, and high-quality data pipelines and platforms that enable analytics, reporting, and data-driven decision-making. The role focuses on transforming raw data into trusted, accessible datasets while ensuring performance, security, and operational excellence across the data ecosystem.
Key Roles & Responsibilities
- Design, develop, and maintain scalable, reliable data pipelines for batch and real-time processing
- Ingest, transform, and curate data from multiple internal and external sources
- Build and optimize data models and datasets for analytics, reporting, and downstream consumption
- Ensure data quality, completeness, and accuracy through validation, monitoring, and reconciliation checks
- Implement and maintain data orchestration, scheduling, and automation workflows
- Optimize data processing performance and cloud resource utilization
- Collaborate with data architects to align implementations with enterprise data architecture standards
- Work closely with analysts, data scientists, and business teams to understand data requirements
- Support BI, analytics, and AI/ML use cases by delivering well-documented and trusted datasets
- Implement data security, access controls, and privacy requirements within data pipelines
- Troubleshoot and resolve data pipeline failures and performance issues
- Contribute to DevOps and CI/CD practices for data solutions
- Document data pipelines, transformations, and operational procedures
- Participate in code reviews and promote data engineering best practices
Qualifications & Experience
- Bachelor’s degree in Computer Science, Engineering, Information Systems, Data Science, or a related field
- A Master’s degree is an advantage but not required
- 8+ years of experience in data engineering, analytics engineering, or backend engineering roles
- Strong hands-on experience building and maintaining ETL/ELT pipelines
- Proven experience working with relational and NoSQL databases; data warehouses and data lakes; and structured, semi-structured, and unstructured data
- Experience with cloud data platforms (e.g., Azure, AWS, GCP)