The Data Engineer will be responsible for developing, managing, and optimizing ETL pipelines and data flows to and from the data warehouse. This role involves working closely with the ETL architect, designing and automating ETL processes, and ensuring high-quality data delivery to end users. The position will focus on utilizing Azure Data Factory, Snowflake, and various other data tools to create efficient and reliable ETL pipelines for processing both structured and unstructured data.
Key Responsibilities:
- Develop and maintain effective working relationships across departments to ensure smooth coordination and data flow.
- Communicate effectively with the ETL architect to understand business requirements and transform data accordingly.
- Assist with ETL architecture and contribute to designing, implementing, and automating ETL flows.
- Investigate and identify potential issues in ETL pipelines, proposing appropriate solutions.
- Develop and manage ETL pipelines into and out of the data warehouse using Azure Data Factory and Snowflake tooling.
- Design and implement idempotent ETL processes so that failed or interrupted loads can be re-run without errors or duplicated data (see the idempotent-load sketch following this list).
- Work with Snowflake Virtual Warehouses and automate data pipelines using Snowpipe.
- Capture changes to dimension data, maintain versioning, and schedule updates using Snowflake Streams and Tasks (see the change-capture sketch following this list).
- Optimize the performance of data both in transit and at rest in the database.
- Build and manage efficient orchestration systems for job scheduling, workflow execution, and data quality checks.
- Conduct testing on ETL system code, data design, and pipelines, and resolve production issues.
- Document implementations, test cases, and deployment details for CI/CD processes.
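To illustrate the idempotent-load responsibility above, the following is a minimal sketch of a re-runnable daily load issued through snowflake-connector-python. The table names (STG_CLAIMS, DW_CLAIMS), column names, and connection parameters are illustrative assumptions, not part of this posting; the point is the delete-then-insert pattern keyed on the batch date, which lets a failed run be repeated safely.

```python
# Sketch only: hypothetical staging table STG_CLAIMS and target table DW_CLAIMS,
# both keyed by LOAD_DATE. Credentials below are placeholders.
import snowflake.connector


def load_batch(load_date: str) -> None:
    conn = snowflake.connector.connect(
        account="my_account",   # placeholder
        user="etl_user",        # placeholder
        password="***",         # placeholder
        warehouse="ETL_WH",     # hypothetical virtual warehouse
        database="EDW",
        schema="PUBLIC",
    )
    try:
        cur = conn.cursor()
        # Delete-then-insert keyed on the batch date makes the load idempotent:
        # re-running the same LOAD_DATE replaces that slice instead of duplicating it.
        cur.execute("DELETE FROM DW_CLAIMS WHERE LOAD_DATE = %s", (load_date,))
        cur.execute(
            "INSERT INTO DW_CLAIMS (CLAIM_ID, AMOUNT, LOAD_DATE) "
            "SELECT CLAIM_ID, AMOUNT, %s FROM STG_CLAIMS WHERE LOAD_DATE = %s",
            (load_date, load_date),
        )
        conn.commit()
    finally:
        conn.close()


if __name__ == "__main__":
    load_batch("2024-01-31")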
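The change-capture responsibility (Snowpipe plus Streams and Tasks) can likewise be sketched as deployable DDL. The object names (MEMBER_PIPE, MEMBER_STREAM, DIM_MEMBER, and so on), the schedule, and the merge columns are assumptions for illustration; only the Snowflake constructs themselves (CREATE PIPE, CREATE STREAM, CREATE TASK, ALTER TASK ... RESUME) are standard.

```python
# Sketch only: wiring Snowpipe ingestion into a Stream/Task pipeline that
# applies dimension changes. All object names are hypothetical.
import snowflake.connector

DDL_STATEMENTS = [
    # Snowpipe: auto-ingest staged files into a raw landing table.
    """CREATE PIPE IF NOT EXISTS MEMBER_PIPE AUTO_INGEST = TRUE AS
       COPY INTO RAW_MEMBERS FROM @MEMBER_STAGE FILE_FORMAT = (TYPE = CSV)""",
    # Stream: tracks inserts/updates/deletes on the raw table since last consumption.
    """CREATE STREAM IF NOT EXISTS MEMBER_STREAM ON TABLE RAW_MEMBERS""",
    # Task: on a schedule, merge pending stream rows into the dimension table.
    """CREATE TASK IF NOT EXISTS APPLY_MEMBER_CHANGES
       WAREHOUSE = ETL_WH
       SCHEDULE = '15 MINUTE'
       WHEN SYSTEM$STREAM_HAS_DATA('MEMBER_STREAM')
       AS
       MERGE INTO DIM_MEMBER d
       USING MEMBER_STREAM s ON d.MEMBER_ID = s.MEMBER_ID
       WHEN MATCHED THEN UPDATE SET d.NAME = s.NAME, d.UPDATED_AT = CURRENT_TIMESTAMP()
       WHEN NOT MATCHED THEN INSERT (MEMBER_ID, NAME, UPDATED_AT)
            VALUES (s.MEMBER_ID, s.NAME, CURRENT_TIMESTAMP())""",
    # Tasks are created suspended; resume to start the schedule.
    """ALTER TASK APPLY_MEMBER_CHANGES RESUME""",
]


def deploy() -> None:
    conn = snowflake.connector.connect(
        account="my_account", user="etl_user", password="***",  # placeholders
        warehouse="ETL_WH", database="EDW", schema="PUBLIC",
    )
    try:
        cur = conn.cursor()
        for stmt in DDL_STATEMENTS:
            cur.execute(stmt)
    finally:
        conn.close()


if __name__ == "__main__":
    deploy()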
Required Qualifications:
- 5+ years of experience in data engineering, specifically focusing on data warehousing.
- 2+ years of experience creating pipelines in Azure Data Factory (ADF).
- 5+ years of experience in ETL development using tools such as Informatica PowerCenter, SSIS, or similar.
- 5+ years of experience with relational databases like Oracle, Snowflake, and SQL Server.
- 3+ years of experience writing stored procedures using Oracle PL/SQL, SQL Server T-SQL, or Snowflake SQL.
- 2+ years of experience with source control tools such as Git/GitHub or SVN.
- 2+ years of experience processing both structured and unstructured data.
- Experience with HL7 and FHIR standards and processing files in these formats.
- 3+ years of experience analyzing project requirements and developing detailed ETL specifications.
- Excellent problem-solving and analytical skills with a focus on optimizing data pipelines.
- Ability to adapt to new technologies and changing business needs.
- Bachelor’s or Advanced Degree in Information Technology, Computer Science, Mathematics, Statistics, Analytics, or Business.
Preferred Qualifications:
- 2+ years of experience with batch or PowerShell scripting.
- 2+ years of experience with Python scripting.
- 3+ years of data modeling experience in a data warehouse environment.
- Familiarity with Informatica Intelligent Cloud Services (Data Integration).
- Experience designing and building APIs in Snowflake and ADF (e.g., REST, RPC).
- Experience with State Medicaid, Medicare, or Healthcare applications.
Certifications:
Azure certifications related to data engineering or data analytics are preferred.