Job Description
This role requires a strong background in data engineering and analytics, with hands-on expertise across data processing and database technologies. The candidate will develop, maintain, and optimize data pipelines and data models that support business intelligence and analytics initiatives.
Required Skills:
- Proficiency in Python, including unit testing and data packages such as Pandas, SQLAlchemy, and Alembic.
- Strong SQL skills, including DDL, DML, window functions, CTEs, subqueries, joins, and performance profiling across platforms such as Spark and PostgreSQL.
- Experience with Spark, including PySpark, Spark SQL, batch and streaming processing, partitioning, Delta tables, and Parquet.
- Knowledge of the Databricks environment, including workflows, clusters, SQL Warehouse, Unity Catalog, and performance profiling.
- Understanding of streaming data solutions such as Azure Event Hubs, with the ability to process and scale streaming data efficiently.
- Experience with PostgreSQL, focusing on query optimization, indexing, JSON columns, and performance tuning.
- Data modeling expertise, particularly dimensional modeling and normalization for BI tools.
- Experience with containerization tools such as Docker.
- Familiarity with Infrastructure as Code and CI/CD tooling such as Kubernetes, Argo, Crossplane, and Terraform.
- Knowledge of database migration tools such as Alembic, Flyway, and Liquibase.
- Understanding of SQL and NoSQL databases, and how to choose the appropriate database for different use cases.
- Ability to query logs using KQL (Azure) or a similar query language.
Desired Skills:
- Experience with SQL Server, including query optimization and maintenance.
- Familiarity with dbt in a Databricks environment.
- Knowledge of Power BI for creating semantic models.
- Experience with MLflow for experiment tracking and model registration.
Qualifications and Experience:
- Degree in Computer Science or equivalent experience.
- Professional experience in data ingestion, ETL, and ELT processes for structured and unstructured data.
- Proficiency in Python and SQL for analytics, database development, and data modeling.
- Experience with DevOps and CI/CD pipelines for data applications.
- Experience working with cloud platforms, preferably Azure.
- Understanding of Agile methodologies and experience working in agile teams.
Responsibilities:
- Support, maintain, optimize, and create ETL/ELT pipelines, both batch and streaming, using Databricks (PySpark, Databricks SQL), Python, SQL, and dbt.
- Design and model data objects, applying dimensional modeling and normalization.
- Write and run tests for data flows.
- Collaborate with cross-functional teams including developers, data scientists, and business analysts to deliver solutions.
- Coordinate with platform teams to use infrastructure efficiently, including CI/CD deployments.
- Work Pacific Time (PST) hours as required.