Job Responsibilities
- Architect and implement robust data integration pipelines to extract, transform, and load data from various sources (e.g., databases, SaaS applications, APIs, and flat files) into a centralized data platform.
- Design and develop complex ETL (Extract, Transform, Load) processes to ensure data quality, consistency, and reliability.
- Optimize data transformation workflows to improve performance and scalability.
- Implement and maintain data ingestion, processing, and storage solutions to support the organization's data and analytics requirements.
- Ensure the reliability, security, and availability of the data infrastructure through effective monitoring, troubleshooting, and disaster recovery planning.
- Collaborate with the data governance team to establish data policies, standards, and procedures.
- Develop and maintain a comprehensive metadata management system to ensure data lineage, provenance, and traceability.
- Implement data quality control measures and data validation processes to ensure the integrity and reliability of the data.
Desired Candidate Profile
- 5-6 years of experience as a Data Engineer or a related role in a data-driven organization.
- Proficient in designing and implementing data integration and ETL pipelines using tools such as Apache Airflow, Airbyte, or cloud-based data integration services.
- Strong experience in setting up and managing data infrastructure, including data lakes, data warehouses, and real-time streaming platforms (e.g., Elastic, Google BigQuery, MongoDB).
- Expertise in data modeling, data quality management, and metadata management.
- Proficient in a programming language such as Python or Java, with working experience in SQL.
- Familiarity with cloud computing platforms (e.g., AWS, Google Cloud) and DevOps practices.
- Excellent problem-solving skills and the ability to work collaboratively with cross-functional teams.
- Strong communication and presentation skills to explain technical concepts clearly to business stakeholders.
Preferred Qualifications
- Familiarity with data visualization and business intelligence tools (e.g., Tableau, Qlik).
- Knowledge of machine learning and artificial intelligence concepts and their application in data-driven initiatives.