
A leading software engineering firm based in Canada is seeking experienced data engineering professionals with strong expertise in the Hadoop ecosystem. Candidates should be proficient in SQL and PySpark and have experience with cloud-native data lakes. The role suits individuals with a solid understanding of data ingestion and transformation in large-scale environments, and it emphasizes collaboration and efficiency in data warehousing and ETL processes.
Technical Expertise
Strong hands-on expertise in the Hadoop ecosystem (HDFS, Hive, Spark, Oozie, YARN, HBase, Kafka, ZooKeeper).
Deep understanding of data ingestion, transformation, and storage patterns in large-scale environments.
Experience with distributed computing, data partitioning, and parallel processing.
Proficiency in SQL, PySpark, Scala, or Java.
Familiarity with cloud-native data lakes on AWS (EMR, Glue, S3), Azure (HDInsight, ADLS, Synapse), or GCP (Dataproc, BigQuery).
Knowledge of data governance tools (Apache Atlas, Ranger, Collibra) and workflow orchestration tools (Airflow, Oozie).
Experience integrating enterprise data lakes (EDL) with modern lakehouse platforms (Databricks, Snowflake, Synapse, BigQuery).
Understanding of machine learning pipelines and real-time analytics use cases.
Exposure to data mesh or domain-driven data architectures.
Certifications in Hadoop, Cloudera, AWS, or Azure data services.
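To give a concrete flavor of the SQL and ETL work described above, here is a minimal, self-contained sketch of a typical warehouse-style aggregation, using Python's built-in sqlite3 in place of a real Hive or Spark SQL engine. The table and column names are hypothetical, not from this posting.

```python
import sqlite3

# In-memory database standing in for a warehouse table (names are hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_date TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("2024-01-01", "ca", 10.0),
        ("2024-01-01", "us", 20.0),
        ("2024-01-02", "ca", 5.0),
    ],
)

# Typical ETL-style aggregation: daily totals per region, the kind of
# GROUP BY query a data engineer would run against a partitioned table.
rows = conn.execute(
    """
    SELECT event_date, region, SUM(amount) AS total
    FROM events
    GROUP BY event_date, region
    ORDER BY event_date, region
    """
).fetchall()

for row in rows:
    print(row)
```

The same query shape ports directly to Hive, Spark SQL, or BigQuery; in production the aggregation would run over partitioned storage (e.g. date-partitioned Parquet on S3 or ADLS) rather than an in-memory table.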