Toronto
On-site
CAD 70,000 - 85,000
Full-time
Job summary
A leading technology firm in Toronto is seeking a Data Quality Analyst. The role involves developing test strategies and validating ETL/ELT pipelines in Azure and Databricks environments. The ideal candidate has strong SQL and data validation skills and experience with large datasets. The role also includes collaborating with engineering teams and integrating data testing into CI/CD workflows.
Qualifications
- Proven experience in cloud-based data platforms.
- Strong knowledge of ETL/ELT processes.
- Hands-on experience with large datasets.
Skills
Data QA/validation
Azure Data Factory
Databricks
SQL
Python
Data profiling
Data reconciliation
Schema validation
DevOps
Performance testing
Tools
Azure DevOps
GitHub Actions
Responsibilities
- Develop and implement test strategies, test cases, and automation scripts to validate data pipelines in Azure and Databricks environments.
- Perform data validation, reconciliation, and comparative analysis between source and target systems.
- Validate ETL/ELT pipelines built using ADF and Databricks.
- Collaborate with Data Engineers and Product Owners to understand Source-to-Target Mappings (STM) and ensure transformation logic is correctly implemented.
- Monitor and validate data quality across Delta tables and Data Warehouses.
- Identify data anomalies, document defects, and drive them to resolution with the engineering team.
- Support CI/CD pipelines by integrating data testing into DevOps workflows.
- Contribute to test data management, metadata validation, and regression testing.
- Provide regular reporting on test execution results, defect metrics, and QA health.
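To illustrate the reconciliation and comparative-analysis responsibilities above, here is a minimal sketch of a source-to-target check. The table names, columns, and in-memory SQLite stand-in are all hypothetical; a real implementation would query the actual source system and target Delta tables.

```python
import sqlite3

# Hypothetical in-memory stand-ins for a source system and a target table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src (id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE tgt (id INTEGER PRIMARY KEY, amount REAL);
    INSERT INTO src VALUES (1, 10.0), (2, 20.0), (3, 30.0);
    INSERT INTO tgt VALUES (1, 10.0), (2, 25.0);
""")

def reconcile(conn):
    """Compare row counts and flag rows that differ between source and target."""
    src_count = conn.execute("SELECT COUNT(*) FROM src").fetchone()[0]
    tgt_count = conn.execute("SELECT COUNT(*) FROM tgt").fetchone()[0]
    mismatches = conn.execute("""
        SELECT s.id, s.amount, t.amount
        FROM src s LEFT JOIN tgt t ON s.id = t.id
        WHERE t.id IS NULL OR s.amount <> t.amount
    """).fetchall()
    return src_count, tgt_count, mismatches

src_count, tgt_count, mismatches = reconcile(conn)
print(f"source rows: {src_count}, target rows: {tgt_count}")
for row in mismatches:
    print("mismatch:", row)  # row missing from target, or values differ
```

In practice the same join-based comparison would be expressed as Spark SQL against Delta tables, with the mismatch output logged as defects.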
Required Skills
- Proven experience in data QA/validation in cloud-based data platforms.
- Strong knowledge of Azure Data Factory and Databricks.
- Proficiency in SQL and scripting languages such as Python.
- Hands-on experience with data profiling, data reconciliation, and schema validation.
- Understanding of Slowly Changing Dimensions (SCD Type 2) and data transformation logic.
- Familiarity with DevOps tools like Azure DevOps or GitHub Actions for CI/CD integration.
- Experience working with large datasets, performance testing, and data lineage tools.
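As a sketch of the SCD Type 2 understanding the skills list calls for, the check below validates two common invariants on a versioned dimension: each business key has exactly one current record, and validity windows do not overlap. The row layout and sample data are hypothetical.

```python
from datetime import date

# Hypothetical SCD Type 2 dimension rows:
# (business_key, valid_from, valid_to, is_current).
# date.max marks the open-ended current record.
rows = [
    ("C1", date(2023, 1, 1), date(2023, 6, 30), 0),
    ("C1", date(2023, 7, 1), date.max,          1),
    ("C2", date(2023, 1, 1), date.max,          1),
]

def validate_scd2(rows):
    """Return violations: keys with no/multiple current records
    or overlapping validity windows."""
    violations = []
    by_key = {}
    for key, start, end, current in rows:
        by_key.setdefault(key, []).append((start, end, current))
    for key, versions in by_key.items():
        if sum(v[2] for v in versions) != 1:
            violations.append((key, "must have exactly one current record"))
        versions.sort()
        for (s1, e1, _), (s2, _, _) in zip(versions, versions[1:]):
            if s2 <= e1:  # next version starts before the previous one ends
                violations.append((key, "overlapping validity windows"))
    return violations

print(validate_scd2(rows))  # → [] for a clean dimension
```

The same invariants translate directly into SQL assertions (a GROUP BY on the current flag, a self-join on date ranges) that can run as automated tests inside a CI/CD pipeline.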