Overview
Title: Lead Data Engineer
Key skills: Azure Databricks, Azure Synapse, Python, SQL (Azure skills).
Key Responsibilities
- Data Pipeline Development: Develop & deploy data pipelines using Azure OneLake, Azure Data Factory, and Apache Spark, ensuring efficient, scalable, and secure data movement across systems.
- ETL: Implement ETL (Extract, Transform, Load) workflows, optimizing the process for data ingestion, transformation, and storage in the cloud.
- Data Integration: Build and manage data integration solutions that connect multiple data sources (structured and unstructured) into a cohesive data ecosystem. Use SQL, Python, Scala, and R to manipulate and process large datasets.
- Azure OneLake Expertise: Leverage Azure OneLake and Azure Synapse Analytics to design and implement scalable data storage and analytics solutions that support big data processing and analysis.
- Collaboration with Teams: Work closely with Data Scientists, Data Analysts, and BI Engineers to ensure that the data infrastructure supports analytical needs and is optimized for performance and accuracy.
- Data Governance & Security: Implement best practices for data governance, data security, and compliance within the Azure ecosystem, ensuring data privacy and protection.
- Automation & Monitoring: Automate data engineering workflows, job scheduling, and monitoring to ensure smooth operations. Use tools like Azure DevOps, Airflow, and other relevant platforms for automation and orchestration.
- Documentation & Best Practices: Document data pipeline architecture, data models, and ETL processes, and contribute to the establishment of engineering best practices, standards, and guidelines.
- Innovation: Stay current with industry trends and emerging technologies in data engineering, cloud computing, and big data analytics, driving innovation within the team.