Senior Machine Learning Engineer

Klue

Toronto

Hybrid

CAD 80,000 - 130,000

Full time

30+ days ago

Job summary

Join a forward-thinking company as a Senior Machine Learning Engineer, where you'll work on cutting-edge LLM-powered agents that automate workflows. This role offers a unique opportunity to shape the future of insight generation systems while collaborating with a talented team. You'll dive deep into ML applications, optimize retrieval systems, and develop scalable ML tools. With a hybrid working style, you can enjoy the flexibility of remote work while still engaging with your team in the office. If you're passionate about machine learning and eager to make a real impact, this is the perfect opportunity for you to thrive.

Benefits

Extended health & dental benefits
Employee Stock Option Plan
Flexible PTO
Direct access to leadership team

Qualifications

  • 2+ years of experience in building and optimizing retrieval systems.
  • Deep understanding of LLMs and their trade-offs in performance.

Responsibilities

  • Optimize LLM-based agents and deploy ML services to production.
  • Measure and improve retrieval systems and develop evaluation metrics.

Skills

Machine Learning
Natural Language Processing (NLP)
Prompt Engineering
Software Testing
Continuous Integration
Hyperparameter Optimization
Data Analysis
Problem Solving

Education

Master's in Machine Learning
PhD in Machine Learning

Tools

PyTorch
Transformers
spaCy
Elasticsearch
Docker
Kubernetes
GCP
Weights & Biases
MLflow

Job description

Klue Engineering is hiring!

We're looking for a Senior Machine Learning Engineer to join our ML Foundation and Platform team in Toronto, focusing on building and optimizing state-of-the-art LLM-powered agents that can reason, plan and automate workflows for users. You'll be joining us at an exciting time as we reinvent our insight generation systems, making this an excellent opportunity for someone with strong ML and IR fundamentals who wants to dive deep into practical LLM applications.

Responsibilities

As a member of our team, you'll focus on optimizing LLM-based agents, creating a platform that lets other teams use ML capabilities, and deploying ML services to production.

You'll measure and improve retrieval systems across the spectrum from BM25 to semantic search and develop comprehensive evaluation metrics to measure their performance. A key challenge will be developing optimal chunking and enrichment strategies for diverse data sources including news articles, website changes, documents, CRM entries, call recordings and internal communications. You'll explore how different data types and formats impact retrieval performance and develop strategies to maintain high relevance across all sources.

Beyond agents and retrieval, you'll work on building a platform for other teams to effectively utilize LLM tools and take advantage of prompt engineering. This includes developing APIs and scalable tools and services that handle machine learning training and inference for our clients, writing zero-shot and few-shot prompts with structured inputs/outputs, and implementing benchmarking systems for prompts.

You'll also work on training and fine-tuning smaller, more efficient models that can match the performance of LLMs at a fraction of the cost. This includes creating labeled datasets (sometimes using prompts), conducting careful hyperparameter optimization, and building automated training pipelines. You'll also deploy and monitor these models in production, optimize their latency, and implement comprehensive offline/online metrics to track their performance.

Throughout all this work, you'll apply your deep understanding of the latest breakthroughs to build scalable, production-ready systems that turn cutting-edge ML experiments into reliable business value.

Experience Required

  • Master's or PhD in Machine Learning, NLP, or related field
  • 2+ years building and optimizing retrieval systems
  • 2+ years training/fine-tuning transformer models
  • Deep understanding of LLMs, retrieval metrics and their trade-offs
  • Experience implementing memory and tool-use strategies to enhance LLM-based agent capabilities
  • Experience building end-to-end systems as a Platform Engineer, MLOps Engineer, or Data Engineer
  • Strong understanding of software testing, benchmarking, and continuous integration
  • Experience building scalable, production-ready ML pipelines for training, evaluation, deployment and monitoring
  • Experience developing CI/CD pipelines and automating the deployment and monitoring of ML models
  • Knowledge of query augmentation and content enrichment strategies
  • Expertise in automated LLM evaluation, including LLM-as-judge methodologies
  • Skilled at prompt engineering, including zero-shot, few-shot, and chain-of-thought prompting
  • Proven ability to balance scientific rigor with driving business impact

What Makes You Thrive at Klue?

We're looking for builders who:

  • Take ownership and run with ambiguous problems
  • Jump into new areas and rapidly learn what's needed to deliver solutions
  • Bring scientific rigor while maintaining a pragmatic delivery focus
  • See unclear requirements as an opportunity to shape the solution

Technologies We Use

  • LLM platforms: OpenAI, Anthropic, open-source models
  • ML frameworks: PyTorch, Transformers, spaCy
  • Search/Vector DBs: Elasticsearch, Pinecone, PostgreSQL
  • MLOps tools: Weights & Biases, MLflow, Langfuse
  • Infrastructure: Docker, Kubernetes, GCP
  • Development: Python, Git, CI/CD

Working Style at Klue

  • Hybrid. Best of both worlds (remote & in-office). You and your team will be in the office 2 days a week.
  • Our main Canadian hubs are in Vancouver and Toronto, and most of our teams work in the Eastern and Pacific time zones.

Compensation & Benefits

  • Competitive base salary
  • Benefits: Extended health & dental benefits that kick in Day 1
  • Options: Opportunity to participate in our Employee Stock Option Plan
  • Time off: Take what you need. Just ensure the required work gets done and clear it with your team in advance. The average Klue team member takes 2-4 weeks of PTO per year.
  • Direct access to our leadership team, including our CEO

Not ticking every box? That’s okay. We take potential into consideration. An equivalent combination of education and experience may be accepted in lieu of the specifics listed above. If you know you have what it takes, even if that’s different from what we’ve described, be sure to explain why in your application.

At Klue, we're dedicated to creating an inclusive, equitable and diverse workplace as an equal-opportunity employer. Our commitment is to build a high-performing team where people feel a strong sense of belonging, can be their authentic selves, and are able to reach their full potential. If there’s anything we can do to make our hiring process more accessible or to better support you, please let us know; we’re happy to accommodate.
