Junior Big Data Engineer (PySpark)

Citigroup Inc.

Mississauga

On-site

CAD 70,000 - 90,000

Full time

Today

Job summary

A leading financial services company in Peel Region is seeking a Junior Big Data Engineer. This role involves designing scalable ETL pipelines and managing big data infrastructure using technologies such as Apache Spark and Hadoop. Candidates should have a strong proficiency in programming languages like Python or Scala and experience with cloud platforms. The ideal candidate holds a Bachelor's degree and possesses a passion for data engineering.

Qualifications

  • Design, build, and maintain scalable ETL/ELT pipelines.
  • Develop and manage large-scale data processing systems.
  • Strong expertise in SQL and various database technologies.

Responsibilities

  • Partner with management teams for seamless function integration.
  • Identify and define necessary system enhancements.
  • Provide in-depth analysis to develop innovative solutions.
  • Serve as advisor or coach to mid-level developers.

Skills

Data Pipeline Development
Big Data Infrastructure
Proficiency in Python or Scala
Strong expertise in data processing frameworks
Expertise in Data Lakehouse technologies
Experience with cloud data platforms (AWS, Azure, GCP)
Expertise in SQL and database technologies
Experience with data orchestration tools
Familiarity with containerization (Docker, Kubernetes)

Education

Bachelor’s degree or equivalent experience
Master’s degree preferred

Tools

Apache Spark
Hadoop
Kafka
Apache Airflow
Docker
Kubernetes

Job description

The Junior Big Data Engineer is a senior level position responsible for establishing and implementing new or revised application systems and programs in coordination with the Technology team. The overall objective of this role is to lead applications systems analysis and programming activities.

Responsibilities:

  • Partner with multiple management teams to ensure appropriate integration of functions to meet goals as well as identify and define necessary system enhancements to deploy new products and process improvements
  • Resolve a variety of high-impact problems/projects through in-depth evaluation of complex business processes, system processes, and industry standards
  • Provide expertise in area and advanced knowledge of applications programming and ensure application design adheres to the overall architecture blueprint
  • Utilize advanced knowledge of system flow and develop standards for coding, testing, debugging, and implementation
  • Develop comprehensive knowledge of how areas of business, such as architecture and infrastructure, integrate to accomplish business goals
  • Provide in-depth analysis with interpretive thinking to define issues and develop innovative solutions
  • Serve as advisor or coach to mid-level developers and analysts, allocating work as necessary
  • Appropriately assess risk when business decisions are made, demonstrating consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency.

Qualifications:

  • Data Pipeline Development: Design, build, and maintain scalable ETL/ELT pipelines to ingest, transform, and load data from multiple sources.
  • Big Data Infrastructure: Develop and manage large-scale data processing systems using frameworks like Apache Spark, Hadoop, and Kafka.
  • Proficiency in programming languages like Python or Scala.
  • Strong expertise in data processing frameworks such as Apache Spark and Hadoop.
  • Expertise in Data Lakehouse technologies (Apache Iceberg, Apache Hudi, Trino)
  • Experience with cloud data platforms like AWS (Glue, EMR, Redshift), Azure (Synapse), or GCP (BigQuery).
  • Expertise in SQL and database technologies (e.g., Oracle, PostgreSQL, etc.).
  • Experience with data orchestration tools like Apache Airflow or Prefect.
  • Familiarity with containerization (Docker, Kubernetes) is a plus

Good to Have Skills:

  • Distributed caching solutions (Hazelcast or Redis)
  • Prior experience with building distributed, multi-tier applications is highly desirable.
  • Experience building highly performant and scalable applications is a plus.

Education:

  • Bachelor’s degree/University degree or equivalent experience
  • Master’s degree preferred

This job description provides a high-level review of the types of work performed. Other job-related duties may be assigned as required.

Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.

If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi. View Citi’s EEO Policy Statement and the Know Your Rights poster.
