Overview
This role designs and implements end-to-end data engineering pipelines, from data ingestion through processing to consumption, to support advanced analytics and business use cases.
Responsibilities
- Design and implement end-to-end data engineering pipelines, including data ingestion, processing, and consumption, to support advanced analytics and business use cases.
- Automate resource deployment and code pipelines using Terraform and AWS-native CI/CD services to deliver efficient and scalable solutions.
- Leverage platforms such as Databricks and Redshift, along with other data tools, for advanced data processing, analytics, and machine learning preparation.
- Integrate data from diverse sources (e.g., SAP ECC, Google Drive) and prepare datasets for machine learning, regulatory reporting, and enterprise-level projects.
- Apply DevOps practices to manage data operations and deploy scalable, enterprise-grade solutions while ensuring high performance and reliability.
Qualifications
- Expertise in designing and implementing end-to-end data engineering solutions, including data ingestion, processing, and consumption pipelines.
- Proficient in automating resource deployment and code pipelines using Terraform and AWS-native CI/CD services.
- Hands-on experience with Databricks, Redshift, and other data platforms for advanced data processing and analytics.
- Skilled in integrating data from diverse sources (e.g., SAP ECC, Google Drive) and preparing datasets for machine learning and business use cases.
- Strong background in DevOps practices, including managing data operations and deploying scalable solutions for enterprise-level projects.
Preferred (Good-to-Have) Skills
- Knowledge of Generative AI and its integration into workflows.
- Experience with Google Cloud Platform (GCP) in addition to AWS.
- Advanced knowledge of Retrieval-Augmented Generation (RAG) and AI Refinery.
- Strong understanding of financial data extraction and regulatory reporting.
- Familiarity with Python and other programming languages for data engineering.