We are looking for an experienced Data Engineer to design, develop, and manage high-performance data pipelines and workflows. The ideal candidate will possess advanced-level expertise in SQL and Apache Airflow, with a strong background in Python, Snowflake, and ETL/ELT processes. You will work closely with data analysts, scientists, and other stakeholders to ensure the timely availability, integrity, and performance of data across systems.
Key Responsibilities:
Design, develop, and maintain scalable ETL/ELT pipelines to process large datasets, both structured and unstructured.
Write advanced, efficient SQL queries and transformations within Snowflake to optimize data storage, retrieval, and performance.
Leverage Apache Airflow for workflow orchestration, automating complex data processing tasks and ensuring smooth data pipeline management (see the sketch after this list).
Create reusable and scalable Python scripts to automate data extraction, transformation, and loading (ETL) processes.
Monitor and optimize the performance of data pipelines, ensuring high availability and minimal downtime.
Collaborate with cross-functional data teams to implement data modeling best practices and maintain a clean, well-organized data warehouse.
Integrate and manage cloud-based data infrastructure on platforms like AWS, GCP, or Azure, ensuring data availability and scalability.
Uphold data security, governance, and compliance standards across all data operations.
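Because the responsibilities above center on orchestrating Python-based ETL with Apache Airflow, here is a minimal illustrative sketch of such a DAG. The DAG id, schedule, and the extract/transform/load callables are hypothetical placeholders rather than details of this role, and the sketch assumes Airflow 2.4+ (where `schedule` replaces `schedule_interval`).

```python
# Minimal Airflow DAG sketch: a hypothetical daily extract -> transform -> load flow.
# Task names, schedule, and helper logic are illustrative assumptions only.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**context):
    # Pull raw records from a source system (placeholder data).
    return [{"id": 1, "amount": 42.0}]


def transform(**context):
    # Reshape the extracted records for warehouse loading.
    rows = context["ti"].xcom_pull(task_ids="extract")
    return [{"id": r["id"], "amount_usd": r["amount"]} for r in rows]


def load(**context):
    # Load transformed rows into the warehouse, e.g. via the Snowflake connector.
    rows = context["ti"].xcom_pull(task_ids="transform")
    print(f"Loading {len(rows)} rows")


with DAG(
    dag_id="daily_orders_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> transform_task >> load_task
```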
Required Skills & Qualifications:
Expert-level proficiency in SQL for data querying, transformation, optimization, and performance tuning.
Strong hands-on experience with Apache Airflow, specifically for scheduling and orchestrating complex data pipelines.
Expertise in building and optimizing ETL/ELT solutions and familiarity with data pipeline orchestration.
Proficient in Python for writing scalable and reusable scripts to automate data workflows.
Experience with Snowflake, including schema design, performance tuning, and leveraging Snowflake features such as Snowpipe, Streams, and Tasks (see the sketch after this list).
Familiarity with cloud platforms (AWS, GCP, or Azure) and their data services for integrating data sources.
A solid understanding of data modeling techniques (e.g., Star Schema, Snowflake Schema) and data warehouse concepts.
Experience with Git for version control and CI/CD pipeline integration.
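The Snowflake and Python requirements above pair naturally, so the following is a minimal illustrative sketch of running a transformation query against Snowflake from Python with the snowflake-connector-python package. The environment variables, warehouse, database, schema, and table names are hypothetical placeholders, not details from this posting.

```python
# Minimal sketch: execute a transformation query in Snowflake from Python.
# Connection details and table names below are illustrative assumptions.
import os

import snowflake.connector


def run_daily_aggregation() -> None:
    conn = snowflake.connector.connect(
        account=os.environ["SNOWFLAKE_ACCOUNT"],
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
        warehouse="ANALYTICS_WH",  # hypothetical warehouse
        database="ANALYTICS",      # hypothetical database
        schema="MARTS",            # hypothetical schema
    )
    cur = conn.cursor()
    try:
        # Aggregate raw fact rows into a daily summary table (illustrative SQL).
        cur.execute(
            """
            CREATE OR REPLACE TABLE daily_order_totals AS
            SELECT order_date, SUM(amount_usd) AS total_usd
            FROM raw.orders
            GROUP BY order_date
            """
        )
    finally:
        cur.close()
        conn.close()


if __name__ == "__main__":
    run_daily_aggregation()
```

In practice, a script like this would typically be invoked from an orchestration task such as the Airflow DAG sketched earlier.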
Preferred Skills:
Familiarity with big data processing frameworks like Spark or Databricks.
Knowledge of real-time data streaming tools such as Kafka or Kinesis.
Experience with containerization tools like Docker and Kubernetes for deploying data pipelines.
Understanding of Data Governance, Quality, and Security best practices.
Education & Experience:
Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
6-10+ years of experience, with at least 3 years in a dedicated data engineering role.