Overview
Sr Data Engineer (LLM & Agent Applications) — Toronto, ON — + Months
Project Scope
We are seeking an experienced Senior Data Engineer to join our team and play a pivotal role in building and managing large-scale data pipelines, with a focus on supporting the development of Large Language Models (LLMs) and agent-based applications. In addition to your technical expertise, you will manage and mentor junior data engineers, helping them grow while ensuring high standards of data engineering practices across the team.
Day-to-Day Responsibilities
- Lead and Manage: Oversee the work of junior data engineers, providing mentorship and guidance to drive the successful execution of projects.
- Develop Data Pipelines: Design and implement scalable and reliable data pipelines to handle increasing data complexity and volume for LLM and agent applications.
- LLM & GenAI Application Support: Develop and optimize data infrastructure to meet the needs of predictive modeling, machine learning, and generative AI applications.
- Collaborate Across Teams: Work closely with data scientists, machine learning engineers, and business stakeholders to understand data requirements and deliver high-quality data solutions.
- Data Integration: Extract, transform, and load (ETL) large datasets from a variety of structured and unstructured data sources using APIs and other technologies.
- Documentation & Best Practices: Create and maintain clear, concise technical documentation for data engineering workflows, pipelines, and processes.
- Mentorship & Growth: Foster a collaborative environment by mentoring junior team members in best practices, new technologies, and approaches in data engineering.
Must Haves
- Bachelor’s degree in Computer Science, Engineering, or a related field. An advanced degree is a plus. (Candidates must have been educated at one of the top universities worldwide.)
- years of experience in data or analytics engineering or related roles.
- Proficiency in Data Engineering Technologies: Advanced skills in SQL, Python, and ETL frameworks for building data pipelines.
- Experience with APIs & Data Integration: Strong experience in working with APIs to extract, transform, and load data from multiple sources, including structured and unstructured data formats (JSON, XML).
- Data Storage & Modeling: In-depth knowledge of data modeling and storage solutions for both structured and unstructured data, as well as cloud data technologies like Google BigQuery and Azure Data Lake.
- Leadership & Communication Skills: Strong leadership abilities to mentor and lead junior engineers. Excellent communication and collaboration skills to work cross-functionally with teams.
- Problem Solving: Proven ability to address complex data challenges, with a strong focus on data optimization, performance, and quality assurance.
Plusses
- If a candidate is highly qualified for the role but does not meet the specified educational requirement (listed above), they must instead possess a Master's degree in either Computer Science or Statistics.
- Machine Learning & AI Expertise: Familiarity with machine learning models, including large language models (LLMs) and generative AI techniques, and an understanding of how to build and optimize data pipelines to support these applications.
Mindlance is an equal opportunity employer. We are committed to inclusive, equitable, barrier-free recruitment and selection processes and to a work environment that is in accordance with the Accessibility for Ontarians with Disabilities Act (AODA). We will be happy to work with applicants requesting accommodation at any stage of the hiring process.