Senior Specialized IT Consultant

Cynet Systems Inc

Toronto

On-site

CAD 80,000 - 120,000

Full time

30+ days ago

Job summary

An established industry player is seeking a skilled ETL Developer to design, develop, and optimize data pipelines using cutting-edge technologies such as Azure Databricks and Delta Lake. In this pivotal role, you will work closely with data architects and business teams to ensure efficient data movement and transformation, while implementing best practices in data quality and governance. The ideal candidate will have a strong background in SQL, Python, and ETL processes, along with a passion for driving data-driven insights. Join a dynamic team where your expertise will directly impact the success of data initiatives and help shape the future of data management.

Benefits

Health Insurance
Flexible Working Hours
Professional Development Opportunities
Remote Work Options
Paid Time Off
Team Building Activities

Qualifications

  • 7+ years of experience with SQL Server, T-SQL, Oracle, PL/SQL development.
  • 2+ years of experience with Azure Data Factory and Databricks.
  • Strong knowledge of ETL processes and data modeling.

Responsibilities

  • Design and optimize ETL processes in Databricks for data warehousing.
  • Collaborate with teams to ensure efficient data transformation.
  • Conduct data quality checks and performance reviews.

Skills

SQL
Python
ETL Development
Data Modeling
Data Quality Management
Agile Methodologies
Data Warehousing
Data Analysis
Change Data Capture
Performance Optimization

Education

Bachelor's Degree in Computer Science or related field

Tools

Azure Databricks
Delta Lake
Oracle
SQL Server
Azure Data Factory
PySpark
Unity Catalog

Job description

Responsibilities:
  1. Designing, developing, maintaining, and optimizing ETL (Extract, Transform, Load) processes in Databricks for data warehousing, data lakes, and analytics.
  2. Working closely with data architects and business teams to ensure efficient transformation and movement of data to meet business needs, including handling Change Data Capture (CDC) and streaming data (a minimal CDC load sketch follows this list).
  3. Reviewing business requirements, understanding business rules and the transactional data model.
  4. Defining conceptual, logical, and physical models mapping from data source to curated model and data mart.
  5. Analyzing requirements and recommending changes to the physical model.
  6. Developing scripts for the physical model, creating database and/or Delta Lake file structure.
  7. Accessing Oracle DB environments and setting up the necessary tools for developing solutions.
  8. Implementing data design methodologies, historical and dimensional models.
  9. Performing data profiling, assessing data accuracy, and designing data quality and master data management rules.
  10. Conducting Functionality Review, Data Load review, Performance Review, and Data Consistency checks.
  11. Assisting in troubleshooting data mart design issues.
  12. Reviewing ETL performance with developers and suggesting improvements.
  13. Participating in end-to-end integrated testing for Full Load and Incremental Load and advising on issues.
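
For illustration of item 2 above, the following is a minimal PySpark sketch of applying a CDC batch to a curated Delta table with a MERGE on Azure Databricks; the landing path, table names, key column, and _change_type convention are hypothetical assumptions rather than the project's actual model.

  # Minimal sketch: apply a batch of CDC records to a curated Delta table.
  # All paths, tables, and columns are hypothetical.
  from delta.tables import DeltaTable
  from pyspark.sql import SparkSession, functions as F

  spark = SparkSession.builder.getOrCreate()

  # Change records landed by the ingestion layer (assumed to carry a
  # _change_type column with values insert / update / delete).
  changes = (
      spark.read.format("delta")
      .load("/mnt/raw/patient_changes")
      .filter(F.col("_change_type").isin("insert", "update", "delete"))
  )

  target = DeltaTable.forName(spark, "curated.patient")

  # A single MERGE applies inserts, updates, and deletes, keeping reruns of
  # the same batch idempotent.
  (
      target.alias("t")
      .merge(changes.alias("s"), "t.patient_id = s.patient_id")
      .whenMatchedDelete(condition="s._change_type = 'delete'")
      .whenMatchedUpdateAll(condition="s._change_type = 'update'")
      .whenNotMatchedInsertAll(condition="s._change_type <> 'delete'")
      .execute()
  )
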
Tools Used and Detailed Responsibilities:
  1. Azure Databricks, Delta Lake, Delta Live Tables, and Spark to process structured and unstructured data.
  2. Azure Databricks/PySpark (good Python/PySpark knowledge required) to build transformations of raw data into the curated zone of the data lake (see the raw-to-curated sketch after this list).
  3. Azure Databricks/PySpark/SQL (good SQL knowledge required) to develop and/or troubleshoot transformations of curated data into FHIR.
  4. Understand requirements and recommend changes to models to support ETL design.
  5. Define primary keys, indexing strategies, and relationships to enhance data integrity and performance.
  6. Define initial schemas for each data layer.
  7. Assist with data modeling and updates to source-to-target mapping documentation.
  8. Document and implement schema validation rules to ensure incoming data conforms to expected formats and standards.
  9. Design data quality checks within the pipeline to catch inconsistencies, missing values, or errors early in the process.
  10. Communicate with business and IT experts on changes required to conceptual, logical, and physical models.
  11. Develop ETL strategy and solution for different data modules.
  12. Understand tables and relationships in the data model.
  13. Create low-level design documents and test cases for ETL development.
  14. Implement error-catching, logging, retry mechanisms, and handling data anomalies.
  15. Create workflows and pipeline design.
  16. Develop and test data pipelines with Incremental and Full Load.
  17. Develop high-quality ETL mappings, scripts, and notebooks.
  18. Maintain the pipeline from the Oracle data source to Azure Delta Lake and FHIR.
  19. Perform unit testing.
  20. Ensure performance monitoring and improvement.
  21. Review ETL performance and troubleshoot performance issues.
  22. Log activity for each pipeline and transformation.
  23. Optimize ETL performance.
  24. Conduct end-to-end integrated testing for Full Load and Incremental Load.
  25. Plan for Go-Live and Production Deployment.
  26. Create production deployment steps.
  27. Configure parameters and scripts for go-live testing and review.
  28. Create release documents and help build and deploy code across servers.
  29. Provide Go Live Support and post-deployment reviews.
  30. Review existing ETL process, tools, and recommend performance improvements.
  31. Review infrastructure and remediate issues for process improvement.
  32. Knowledge Transfer to Ministry staff, including developing documentation and sharing ETL end-to-end design, troubleshooting steps, configuration, and scripts.
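
As referenced in item 2 of the list above, a raw-to-curated transformation might look like the minimal PySpark sketch below; the raw.encounters and curated.encounters tables, column names, and cleansing rules are illustrative assumptions only.

  # Minimal sketch: promote a raw-zone Delta table to the curated zone.
  # Table and column names are hypothetical.
  from pyspark.sql import SparkSession, Window, functions as F

  spark = SparkSession.builder.getOrCreate()

  raw = spark.read.table("raw.encounters")

  # Standardize keys and dates, then keep only the latest record per key.
  latest_first = Window.partitionBy("encounter_id").orderBy(F.col("ingested_at").desc())

  curated = (
      raw.withColumn("encounter_id", F.trim("encounter_id"))
         .withColumn("encounter_date", F.to_date("encounter_date", "yyyy-MM-dd"))
         .withColumn("rn", F.row_number().over(latest_first))
         .filter("rn = 1")
         .drop("rn")
  )

  # Persist to the curated zone as a managed Delta table.
  curated.write.format("delta").mode("overwrite").saveAsTable("curated.encounters")
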
Experience:
  1. 7+ years working experience with SQL Server, T-SQL, Oracle, PL/SQL development or similar relational databases.
  2. 2+ years working experience with Azure Data Factory, Databricks, and Python development.
  3. Experience building data ingestion and change data capture using Oracle GoldenGate.
  4. Experience in designing, developing, and implementing ETL pipelines using Databricks and related tools to ingest, transform, and store large-scale datasets.
  5. Experience in leveraging Databricks, Delta Lake, Delta Live Tables, and Spark to process structured and unstructured data.
  6. Experience working with data warehouses, delta, and full loads.
  7. Experience in data modeling and tools (e.g., SAP PowerDesigner, Visio, or similar).
  8. Experience developing in an Agile environment.
  9. Experience with SQL Server SSIS or other ETL tools.
  10. Solid knowledge and experience with SQL scripting.
  11. Experience developing, testing, and documenting ETL pipelines.
  12. Experience analyzing, designing, and implementing data validation techniques (a validation sketch follows this list).
  13. Ability to utilize SQL to perform DDL tasks and complex queries.
  14. Good knowledge of database performance optimization techniques.
  15. Ability to assist with requirements analysis and subsequent development, conduct unit testing, and assist with test preparation to ensure data integrity.
  16. Work closely with Designers, Business Analysts, and other Developers; liaise with Project Managers, Quality Assurance Analysts, and Business Intelligence Consultants.
  17. Design and implement technical enhancements of Data Warehouse as required.
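
As a minimal illustration of item 12 above, data validation within a pipeline can start with count reconciliation and key checks; the staging.claims and curated.claims tables and the claim_id business key below are hypothetical.

  # Minimal sketch: data-quality checks between a staging extract and its
  # curated Delta target. All table and column names are hypothetical.
  from pyspark.sql import SparkSession, functions as F

  spark = SparkSession.builder.getOrCreate()

  source = spark.read.table("staging.claims")
  target = spark.read.table("curated.claims")

  # Row-count reconciliation between layers.
  src_count, tgt_count = source.count(), target.count()
  assert src_count == tgt_count, f"Row count mismatch: {src_count} vs {tgt_count}"

  # Null and duplicate checks on the business key.
  null_keys = target.filter(F.col("claim_id").isNull()).count()
  dup_keys = target.groupBy("claim_id").count().filter(F.col("count") > 1).count()
  assert null_keys == 0, f"{null_keys} rows with a null claim_id"
  assert dup_keys == 0, f"{dup_keys} duplicate claim_id values"
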
Technical Skills (70 points):
  1. Experience in developing and managing ETL pipelines, jobs, and workflows in Databricks.
  2. Deep understanding of Delta Lake for building data lakes and managing ACID transactions, schema evolution, and data versioning.
  3. Experience automating ETL pipelines using Delta Live Tables, including handling Change Data Capture (CDC) for incremental data loads.
  4. Proficient in structuring data pipelines with the Medallion Architecture to scale processing and ensure data quality.
  5. Hands-on experience developing streaming tables in Databricks using Structured Streaming and readStream to handle real-time data (a streaming ingestion sketch follows this list).
  6. Expertise in integrating CDC tools like GoldenGate or Debezium for processing incremental updates and managing real-time data ingestion.
  7. Experience using Unity Catalog to manage data governance, access control, and ensure compliance.
  8. Skilled in managing clusters, jobs, autoscaling, monitoring, and performance optimization in Databricks environments.
  9. Knowledge of using Databricks Autoloader for efficient batch and real-time data ingestion.
  10. Experience with data governance best practices, including implementing security policies, access control, and auditing with Unity Catalog.
  11. Proficient in creating and managing Databricks Workflows to orchestrate job dependencies and schedule tasks.
  12. Strong knowledge of Python, PySpark, and SQL for data manipulation and transformation.
  13. Experience integrating Databricks with cloud storage solutions such as Azure Blob Storage, AWS S3, or Google Cloud Storage.
  14. Familiarity with external orchestration tools like Azure Data Factory.
  15. Implementing logical and physical data models.
  16. Knowledge of FHIR is an asset.
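
To picture items 5 and 9 above, a streaming bronze-layer ingestion with Databricks Auto Loader (cloudFiles) and Structured Streaming might be sketched as follows; the storage location, schema and checkpoint paths, and target table name are hypothetical assumptions.

  # Minimal sketch: ingest landing files into a bronze Delta table with
  # Auto Loader and Structured Streaming. All paths and names are hypothetical.
  from pyspark.sql import SparkSession

  spark = SparkSession.builder.getOrCreate()

  bronze_stream = (
      spark.readStream.format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.schemaLocation", "/mnt/bronze/_schemas/events")
      .load("abfss://landing@storageacct.dfs.core.windows.net/events/")
  )

  (
      bronze_stream.writeStream
      .format("delta")
      .option("checkpointLocation", "/mnt/bronze/_checkpoints/events")
      .outputMode("append")
      .trigger(availableNow=True)   # process available files, then stop
      .toTable("bronze.events")
  )
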
Design Documentation and Analysis Skills (20 points):
  1. Demonstrated experience in creating design documentation such as: Schema definitions, Error handling and logging, ETL Process Documentation, Job Scheduling and Dependency Management, Data Quality and Validation Checks, Performance Optimization and Scalability Plans, Troubleshooting Guides, Data Lineage, and Security and Access Control Policies applied within ETL.
  2. Experience in Fit-Gap analysis, system use case reviews, requirements reviews, coding exercises and reviews.
  3. Participate in defect fixing, testing support and development activities for ETL.
  4. Analyze and document solution complexity and interdependencies including providing support for data validation.
  5. Strong analytical skills for troubleshooting, problem-solving, and ensuring data quality.
Communication and Leadership Skills (10 points):
  1. Ability to collaborate effectively with cross-functional teams, provide technical guidance and support to other team members on Databricks best practices, and communicate complex technical concepts to non-technical stakeholders.
  2. Strong problem-solving skills and experience working in an Agile or Scrum environment.
  3. Must have previous work experience conducting Knowledge Transfer sessions, ensuring that resources receive the knowledge required to support the system.
  4. Must develop documentation and materials as part of a review and knowledge transfer to other members.