Job Description: Data Engineer (Python | SQL | Semi‑Structured Data | ES APIs)
Experience Required: 6–8 years
Skills: Python, SQL, Data Modeling, ETL/ELT, DBeaver, SQLite/Postgres/Dremio, API Integration
Role Summary
We are seeking a highly skilled Data Engineer with strong Python and SQL expertise to build reliable, scalable data pipelines that transform semi‑structured data from ES URLs/APIs into clean, analytics‑ready datasets. You will work primarily in a local environment (Python, DBeaver, SQLite/Postgres/Dremio), establish database connections, flatten and normalize nested JSON from Elasticsearch indices and API responses, and prepare datasets for downstream Power BI reporting. This role requires deep hands‑on engineering ability, strong data modeling skills, and clear communication with business stakeholders.
Key Responsibilities
- Data Ingestion & Transformation
- Extract semi‑structured JSON data from ES URLs, REST API endpoints, and Elasticsearch indices.
- Flatten, normalize, and structure nested JSON into relational tables suitable for analytics.
- Build reproducible ETL/ELT workflows using Python (pandas, NumPy, SQLAlchemy, requests); a minimal sketch appears after this group.
- Implement transformation logic, incremental loads, and schema alignment for downstream use.
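For illustration, a minimal sketch of the ingest, flatten, and load pattern described above. The endpoint URL and table name are hypothetical; real sources, authentication, and schemas would come from the project:

```python
import requests
import pandas as pd
from sqlalchemy import create_engine

API_URL = "https://example.internal/api/orders"  # hypothetical endpoint

# Pull semi-structured records from the API.
resp = requests.get(API_URL, timeout=30)
resp.raise_for_status()
records = resp.json()  # assumed shape: a list of nested JSON objects

# Flatten nested objects into relational columns.
flat = pd.json_normalize(records, sep="_")

# Load into a local SQLite database, queryable from DBeaver.
engine = create_engine("sqlite:///local_warehouse.db")
flat.to_sql("orders_flat", engine, if_exists="replace", index=False)
```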
- Database Engineering
- Design, create, and maintain database schemas in SQLite, Postgres, and Dremio.
- Configure and manage local DB connections through DBeaver.
- Optimize queries using indexing strategies, caching, and partitioning; a tuning sketch appears after this group.
- Implement performance tuning for Python data jobs and SQL queries.
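A minimal tuning sketch against Postgres, reusing the hypothetical orders_flat table from the ingestion example; the connection string, indexed column, and filter are illustrative:

```python
from sqlalchemy import create_engine, text

# Illustrative connection string; real credentials would come from config.
engine = create_engine("postgresql+psycopg2://user:pass@localhost:5432/analytics")

with engine.begin() as conn:
    # Index the column used in frequent range filters.
    conn.execute(text(
        "CREATE INDEX IF NOT EXISTS ix_orders_flat_created_at "
        "ON orders_flat (created_at)"
    ))
    # Read the query plan to confirm the index is actually used.
    for row in conn.execute(text(
        "EXPLAIN ANALYZE SELECT * FROM orders_flat "
        "WHERE created_at >= '2024-01-01'"
    )):
        print(row[0])
```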
- Data Quality & Governance
- Build and maintain validation rules, deduplication logic, and anomaly detection; a sketch appears after this group.
- Establish dataset versioning, lineage tracking, and data contract/documentation.
- Ensure secure handling of API credentials, tokens, and data source endpoints.
- Use Git for version control, perform code reviews, write unit tests, and support CI checks.
- Produce clear documentation, runbooks, and support materials for ad‑hoc data requests.
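A minimal sketch of validation and deduplication rules in pandas; the column names (order_id, amount, updated_at) and the rejected-rows file are hypothetical:

```python
import pandas as pd

def validate_and_dedupe(df: pd.DataFrame) -> pd.DataFrame:
    # Rule 1: required business keys must be present.
    df = df.dropna(subset=["order_id"])

    # Rule 2: flag anomalous amounts for review, then exclude them.
    rejected = df[df["amount"] < 0]
    if not rejected.empty:
        rejected.to_csv("rejected_rows.csv", index=False)
    df = df[df["amount"] >= 0]

    # Rule 3: keep only the latest record per business key.
    return (df.sort_values("updated_at")
              .drop_duplicates(subset=["order_id"], keep="last"))
```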
- Reporting & Downstream Enablement
- Prepare clean, analytics‑ready datasets for use in Power BI dashboards and business reporting; a sketch appears after this group.
- Collaborate with stakeholders to translate business requirements into technical data solutions.
- Ensure accurate, complete, and timely delivery of data to reporting teams.
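A minimal sketch of shaping a reporting table that Power BI can then read through its standard Postgres connector; table and column names are hypothetical:

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql+psycopg2://user:pass@localhost:5432/analytics")

# Read the flattened data and enforce reporting-friendly types and names.
clean = pd.read_sql("SELECT * FROM orders_flat", engine)
clean["created_at"] = pd.to_datetime(clean["created_at"])

fact = clean[["order_id", "customer_id", "created_at", "amount"]]
fact.to_sql("fact_orders", engine, if_exists="replace", index=False)
```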
Required Skills & Experience
- Programming & Data Engineering
- Strong hands‑on experience with Python (pandas, NumPy, SQLAlchemy, requests).
- Ability to work with and transform semi‑structured JSON/ES data.
- Experience integrating with REST APIs, ES endpoints, or similar data sources.
- SQL & Databases
- Advanced SQL proficiency across SQLite, Postgres, and Dremio.
- Understanding of dimensional modeling, normalization, and modeling nested/semi‑structured data.
- Experience with query tuning, indexing, and performance optimization.
- Tools & Pipelines
- Proficient in DBeaver (database connections, schema management).
- Experience building ETL/ELT pipelines with error handling, logging, and recoverability; a sketch appears after this group.
- Familiarity with dataset preparation for Power BI.
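A minimal sketch of the error handling, logging, and recoverability patterns mentioned above; the retry policy, backoff, and checkpoint file are illustrative choices, not a prescribed design:

```python
import json
import logging
import time
import requests

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("pipeline")

STATE_FILE = "pipeline_state.json"  # hypothetical checkpoint location

def load_checkpoint():
    """Return the last processed cursor so a rerun resumes, not restarts."""
    try:
        with open(STATE_FILE) as f:
            return json.load(f).get("last_cursor")
    except FileNotFoundError:
        return None

def save_checkpoint(cursor):
    with open(STATE_FILE, "w") as f:
        json.dump({"last_cursor": cursor}, f)

def fetch_page(url, retries=3):
    """Fetch one page with exponential backoff on transient failures."""
    for attempt in range(1, retries + 1):
        try:
            resp = requests.get(url, timeout=30)
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException as exc:
            log.warning("attempt %d/%d failed: %s", attempt, retries, exc)
            time.sleep(2 ** attempt)
    raise RuntimeError(f"giving up on {url}")
```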
- Collaboration & Delivery
- Strong communication skills; ability to work closely with business stakeholders.
- Experience translating requirements into technical specifications and deliverables.
Preferred / Bonus Skills
- Experience with Elasticsearch, ES endpoints, scroll APIs, or schema‑on‑read engines (e.g., Dremio); a scroll‑API sketch follows this list.
- Familiarity with Docker for reproducing local environments.
- Experience with schedulers such as Airflow, Prefect, or similar orchestration tools.
- Knowledge of performance profiling techniques (EXPLAIN plans, indexing strategies, caching).
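A minimal sketch of paging through an Elasticsearch index with the scroll API over plain HTTP; the host, index name, and page size are hypothetical:

```python
import requests

ES_HOST = "http://localhost:9200"  # hypothetical cluster address
INDEX = "orders"                   # hypothetical index name

# Open a scroll context and fetch the first batch.
resp = requests.post(
    f"{ES_HOST}/{INDEX}/_search?scroll=2m",
    json={"size": 1000, "query": {"match_all": {}}},
    timeout=30,
)
resp.raise_for_status()
body = resp.json()
scroll_id, hits = body["_scroll_id"], body["hits"]["hits"]

docs = []
while hits:
    docs.extend(h["_source"] for h in hits)
    # Keep the scroll context alive and fetch the next batch.
    resp = requests.post(
        f"{ES_HOST}/_search/scroll",
        json={"scroll": "2m", "scroll_id": scroll_id},
        timeout=30,
    )
    resp.raise_for_status()
    body = resp.json()
    scroll_id, hits = body["_scroll_id"], body["hits"]["hits"]
```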