Python ETL Developer / Data Engineer - Remote

ipvisibility

Ottawa

Remote

CAD 70,000 - 100,000

Full time

2 days ago

Job summary

A leading company is seeking a skilled professional for managing and optimizing ETL processes. The role includes designing workflows for data ingestion and writing efficient queries. Candidates with experience in Python and data lake management are encouraged to apply.

Qualifications

  • Experience with ETL jobs and data lakes.
  • Python skills for mapping XML DTD schemas.
  • Ability to tune performance of ETL processes.

Responsibilities

  • Design and develop ETL jobs to ingest and process data.
  • Write queries in Hive or Impala for data analysis.
  • Tune performance of ETL mappings and queries.

Skills

ETL design
Data integration
Python
Hive
Impala
Performance tuning

Job description

  • Review, design, and develop ETL jobs to ingest data into the data lake, load data into data marts, and extract data for integration with various business applications.
  • Parse unstructured and semi-structured data, such as XML.
  • Design and develop efficient mappings and workflows to load data into data marts.
  • Map XML DTD schemas in Python (customized table definitions).
  • Write efficient queries and reports in Hive or Impala to extract data on an ad hoc basis for data analysis.
  • Identify performance bottlenecks in ETL jobs and tune their performance by enhancing or redesigning them.
  • Responsible for performance tuning of ETL mappings and queries.
  • Import tables and all necessary lookup tables to support the ETL process for daily XML files, in addition to processing very large (multi-terabyte) historical XML data files.