Architect and implement robust data integration pipelines to extract, transform, and load data from various sources (e.g., databases, SaaS applications, APIs, and flat files) into a centralized data platform.
Design and develop complex ETL (Extract, Transform, Load) processes to ensure data quality, consistency, and reliability.
Optimize data transformation workflows to improve performance and scalability.
Implement and maintain data ingestion, processing, and storage solutions to support the organization's data and analytics requirements.
Ensure the reliability, security, and availability of the data infrastructure through effective monitoring, troubleshooting, and disaster recovery planning.
Collaborate with the data governance team to establish data policies, standards, and procedures.
Develop and maintain a comprehensive metadata management system to ensure data lineage, provenance, and traceability.
Implement data quality control measures and data validation processes to ensure the integrity and reliability of the data.
Desired Candidate Profile
5-6 years of experience as a Data Engineer or a related role in a data-driven organization.
Proficient in designing and implementing data integration and ETL pipelines using tools such as Apache Airflow, Airbyte, or any cloud-based data integration services.
Strong experience in setting up and managing data infrastructure, including data lakes, data warehouses, and real-time streaming platforms (e.g., Elastic, Google BigQuery, MongoDB).
Expertise in data modeling, data quality management, and metadata management.
Proficient in a programming language such as Python or Java, with working experience in SQL.
Familiarity with cloud computing platforms (e.g., AWS, Google Cloud) and DevOps practices.
Excellent problem-solving skills and the ability to work collaboratively with cross-functional teams.
Strong communication and presentation skills, with the ability to explain technical concepts to business stakeholders.
Preferred Qualifications
Familiarity with data visualization and business intelligence tools (e.g., Tableau, Qlik).
Knowledge of machine learning and artificial intelligence concepts and their application in data-driven initiatives.