Python PySpark Data Engineer with AI, Control-M

Astra North Infoteck Inc.

Toronto

On-site

CAD 100,000 - 130,000

Full time

20 days ago

Job summary

A leading data solutions company is seeking a Data Engineer with extensive experience in Python, Spark, and Databricks. The role involves designing scalable data pipelines, managing large-scale Spark workloads on Databricks, and collaborating with data science and business teams to enable data-driven decision-making. Ideal candidates will have a strong background in data engineering and automation practices.

Qualifications

  • 8-10 years of experience in data engineering.
  • Strong experience with Python and Spark (PySpark).
  • Hands-on with Databricks and SQL for complex data transformations.

Responsibilities

  • Design, build, and optimize scalable data pipelines.
  • Deploy and manage large-scale Spark workloads on Databricks.
  • Collaborate with data scientists and business stakeholders.

Skills

Python
Spark (PySpark)
SQL
Databricks
Machine Learning workflows

Tools

Control-M
Snowflake
Airflow

Job description

Keywords: Python and Spark (PySpark), Databricks (Jobs, Workflows, Delta Lake, Unity Catalog), SQL

Role Description:

  • Design, build, and optimize scalable data pipelines (see the illustrative sketch after this list)
  • Develop and operationalize data products across structured and unstructured sources, including alternative data
  • Deploy, manage, and performance-tune large-scale Spark workloads on Databricks, ensuring reliability, scalability, and cost-efficiency
  • Collaborate with data scientists, quant teams, and business stakeholders to enable data-driven decision-making
  • Contribute to automation efforts via CI/CD pipelines, infrastructure-as-code, and reusable data frameworks
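
By way of illustration, a minimal PySpark job of the kind described in this list, reading a raw table, applying basic cleansing, and writing a partitioned Delta table on Databricks, might look like the sketch below; the table and column names (raw_events, events_clean, event_id, event_ts) are hypothetical placeholders.

    # Minimal sketch of a Delta Lake pipeline step on Databricks; all names are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("events-pipeline").getOrCreate()

    # Read a raw source table, apply basic cleansing, and write a partitioned Delta table.
    raw = spark.read.table("raw_events")
    clean = (
        raw.filter(F.col("event_ts").isNotNull())
           .withColumn("event_date", F.to_date("event_ts"))
           .dropDuplicates(["event_id"])
    )
    (clean.write
          .format("delta")
          .mode("overwrite")
          .partitionBy("event_date")
          .saveAsTable("events_clean"))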

Competencies: Python Web Frameworks, Databricks, PySpark, Control-M (Workload Scheduling and Automation, Administration)

Experience (Years): 8-10

  • Strong experience with Python and Spark (PySpark)
  • Hands-on with Databricks (Jobs, Workflows, Delta Lake, Unity Catalog)
  • Proficient in SQL for complex data transformations and optimizations
  • Solid understanding of distributed data processing and production-grade data workflows
  • Exposure to Machine Learning workflows and tools like MLflow
  • Experience working with Alternative Data sources (e.g., web data, geospatial, satellite, social sentiment)
  • Familiarity with Snowflake, Airflow, or similar orchestration and warehousing platforms (a brief orchestration sketch follows this list)
  • Understanding of CI/CD principles, version control, and production deployment best practices
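
For orientation on the orchestration side, a minimal Airflow DAG that schedules a daily Spark submission might look like the sketch below; it assumes Airflow 2.x, and the DAG name, task name, and command are hypothetical.

    # Minimal Airflow DAG sketch; dag_id, task_id, and the command are hypothetical.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="events_pipeline_daily",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",  # Airflow 2.4+; earlier 2.x versions use schedule_interval
        catchup=False,
    ) as dag:
        run_spark_job = BashOperator(
            task_id="run_spark_job",
            bash_command="spark-submit events_pipeline.py",
        )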