
Senior Data Scientist, Applied AI

ZoomInfo

Toronto

Hybrid

CAD 100,000 - 130,000

Full time

2 days ago

Job summary

A leading data technology company in Toronto is seeking a Senior Data Scientist to enhance core datasets and solve complex data challenges in an AI-focused environment. The ideal candidate has extensive experience in ML/NLP with a strong track record in modern AI architectures and NER systems. This role offers hybrid work flexibility and competitive compensation with comprehensive benefits.

Benefits

Comprehensive benefits
Holistic wellness programs

Qualifications

  • 6+ years hands-on ML/NLP experience with revenue-impacting products
  • Deep expertise in modern AI architectures including transformer stacks
  • Track record building NER or entity-resolution systems at large scale
  • Proficiency in Python and familiarity with Go/Java

Responsibilities

  • Invent and productionize Transformer / RAG architectures
  • Prototype and launch hybrid dense/sparse retrieval pipelines
  • Own high-recall NER models that tag entities across texts
  • Build cross-dataset entity-resolution frameworks
  • Design agentic workflows with evaluation frameworks

Skills

ML / NLP experience
AI architectures expertise
NER systems building
Data deduplication
Python programming
Executive communication skills

Education

PhD / Master's in relevant field

Tools

PyTorch
TensorFlow

Job description

Join ZoomInfo's mission to build the next-generation go-to-market platform! ZoomInfo is redefining how 40,000+ revenue teams find, engage, and win customers. As a Senior Data Scientist on our Foundation Data team, you'll be the end-to-end owner of critical projects that enhance the quality and reliability of our core datasets. You'll work at the intersection of cutting-edge AI and massive-scale data processing to solve complex entity resolution challenges that directly impact millions of sales and marketing professionals worldwide. You will own core retrieval, NER, and aligned entity-resolution & knowledge-graph initiatives that touch billions of records and serve millions of daily queries.

What You'll Do:

  • Invent and productionize Transformer / RAG architectures that surface the right contact, company, or insight while driving quantization, distillation, and SLM fine-tuning (GTE-Qwen, ModernBERT) so models stay fast and affordable at petabyte scale
  • Prototype and launch hybrid dense / sparse retrieval pipelines on vector DBs to build language-agnostic clustering and classification systems that power our intelligence layer
  • Own high-recall NER models that tag people, orgs, locations, and industry-specific entities across multi-language text, extracting structured insights from web data to improve our signal detection capabilities
  • Build cross-dataset entity-resolution frameworks that dedupe and merge hundreds of millions of fragmented company and person records with sub-second latency, creating enriched, unified entities enhanced with knowledge-graph signals
  • Design and implement agentic workflows with robust evaluation frameworks focused on NER and entity resolution tasks, including large-scale A/B and back-testing plans that close the loop from experiment to KPI uplift
  • Scale ML solutions and drive cross-functional impact by partnering with ML engineers to ensure production reliability, translating product goals into measurable ML KPIs, and influencing roadmap and investment decisions while mentoring junior scientists and engineers
  • Drive end-to-end project ownership from problem definition through deployment, collaborating closely with engineering and product teams to understand business requirements and translate them into scalable ML solutions that enhance foundation data quality across company firmographics, professional demographics, C-suite profiles, and web-extracted signals

What You Bring:

  • 6+ years hands-on ML / NLP experience (or 3+ years post-PhD / Master's) with at least two delivered, revenue-impacting products in production environments
  • Deep expertise in modern AI architectures including transformer stacks (BERT / GPT / T5), RAG systems, vector-based information retrieval, and latency / throughput optimization techniques
  • Proven track record building NER or entity-resolution systems at 100M+ record scale with experience in record linkage, data deduplication, and knowledge-graph integration
  • Strong applied research capabilities (PyTorch or TensorFlow) paired with software-engineering rigor (Python, Go / Java a plus) and familiarity with embedding models and vector search technologies
  • Executive communication skills with ability to persuade technical and non-technical audiences through data-driven storytelling, comfortable owning strategy, budget, and cross-functional collaboration

Actual compensation offered will be based on factors such as the candidate’s work location, qualifications, skills, experience and/or training. Your recruiter can share more information about the specific salary range for your desired work location during the hiring process.

We want our employees and their families to thrive. In addition to comprehensive benefits, we offer holistic mind, body and lifestyle programs designed for overall well-being.
