Senior Data Engineer, AI/ML (Toronto, Hybrid / Remote)

Autodesk

Toronto

Remote

CAD 80,000 - 120,000

Full time

30+ days ago

Job summary

Join a forward-thinking company as a Sr. Data Engineer and play a pivotal role in shaping the future of machine learning. This position offers the opportunity to design and maintain robust data pipelines, ensuring data quality and availability for innovative AI solutions. Collaborate with talented ML engineers and data scientists to create scalable workflows on AWS, contributing to the development of cutting-edge technologies. If you are passionate about data and eager to make a significant impact in a dynamic environment, this role is perfect for you.

Benefits

Comprehensive benefits package

Discretionary annual cash bonuses

Flexible work hours

Professional development opportunities

Qualifications

3+ years of experience as a Data Engineer in ML or AI environments.
Proficiency in Python and SQL with hands-on AWS experience.

Responsibilities

Design and maintain scalable data pipelines for ML model training.
Automate data workflows using orchestration tools like Apache Airflow.

Skills

Python

SQL

ETL/ELT Engineering

Data Governance

Data Quality

Problem Solving

Education

Bachelor’s or Master’s degree in Computer Science

Tools

AWS

Apache Airflow

Spark

Jupyter

Git

Autodesk is seeking a skilled Sr. Data Engineer to join our Machine Learning Engineering team within the Access domain. This role is critical in enabling the development and deployment of scalable ML solutions by building and maintaining robust data pipelines, preparing data, and contributing to the infrastructure that powers them. You will work closely with ML engineers, data scientists, and platform teams to design pipelines, prepare training data, and ensure data quality, security, and availability across the lifecycle of our AI/ML products.

Responsibilities

Data Pipeline Development: Design, implement, and maintain scalable, resilient data pipelines to support ML model training, inference, and analytics

ETL/ELT Engineering: Build and manage ETL/ELT workflows to extract data from various sources, transform it for feature engineering, and load it into cloud data warehouses and model stores

Data Preparation for ML: Collaborate with ML engineers to gather, clean, and curate datasets for training and evaluation. Implement processes for labeling, versioning, and partitioning datasets

Data Infrastructure: Contribute to the development of data platforms and infrastructure on AWS (e.g., S3, Glue, Redshift, Athena, EMR) to support ML workflows and high-throughput data processing

Data Quality & Governance: Monitor data integrity, freshness, and availability; enforce data validation checks and lineage tracking to ensure reliable ML model behavior

Automation & Orchestration: Automate data workflows using orchestration tools such as Apache Airflow and AWS Step Functions to enable reproducible and scheduled data operations

Collaboration & Integration: Work closely with ML engineers, product managers, and analytics teams to understand data needs, optimize schemas, and expose data via APIs or query interfaces. You will work closely with business stakeholders to understand and maintain focus on their analytical needs, including identifying critical metrics and KPIs

Performance Optimization: Optimize data queries, storage, and transfer to reduce cost and latency for ML model training and real-time features

Documentation & Best Practices: Document data sources, pipelines, and workflows, and promote engineering best practices across the ML platform team

Minimum Qualifications

Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or related field

3+ years of experience as a Data Engineer working on large-scale data infrastructure, preferably in ML or AI environments

Proficiency in Python and SQL; experience with Spark, PySpark, or other distributed data frameworks

Hands-on experience with AWS, including S3, Glue, Redshift, Athena, Lambda, and Step Functions

Experience building and maintaining data pipelines and orchestration using tools like Airflow, Luigi, or similar

Experience working with and analyzing data in notebook environments such as Jupyter, EMR Notebooks, or Apache Zeppelin

Familiarity with data warehousing, stream processing, and data modeling principles

Strong understanding of data lifecycle, governance, versioning, and reproducibility in ML contexts

Experience with version control and CI/CD tools like Git and Jenkins

Proactive problem solver with excellent written and interpersonal skills; ability to make sound, complex decisions in a fast-paced, technical environment

Preferred Qualifications

Experience working in cross-functional ML teams and with MLOps teams

Familiarity with ML concepts (e.g., feature stores, model inputs/outputs, retraining triggers)

Experience with tools like Feast, Delta Lake, or data lakehouse architectures

Exposure to containerization (Docker) and infrastructure-as-code tools (Terraform, CloudFormation)

Learn More

About Autodesk
Welcome to Autodesk! Amazing things are created every day with our software – from the greenest buildings and cleanest cars to the smartest factories and biggest hit movies. We help innovators turn their ideas into reality, transforming not only how things are made, but what can be made.

We take great pride in our culture here at Autodesk – our Culture Code is at the core of everything we do. Our values and ways of working help our people thrive and realize their potential, which leads to even better outcomes for our customers.

When you’re an Autodesker, you can be your whole, authentic self and do meaningful work that helps build a better future for all. Ready to shape the world and your future? Join us!

Salary transparency

Salary is one part of Autodesk’s competitive compensation package. Offers are based on the candidate’s experience and geographic location. In addition to base salaries, we also have a significant emphasis on discretionary annual cash bonuses, commissions for sales roles, stock or long-term incentive cash grants, and a comprehensive benefits package.

Job Requisition ID: 25WD88438

Diversity & Belonging
We take pride in cultivating a culture of belonging and an equitable workplace where everyone can thrive. Learn more here: https://www.autodesk.com/company/diversity-and-belonging

Are you an existing contractor or consultant with Autodesk?

Please search for open jobs and apply internally (not on this external site).

About the company

Autodesk, Inc. is an American multinational software corporation that makes software products and services for the architecture, engineering, construction, manufacturing, media, education, and entertainment industries.
