Senior MLOps

GuruLink

Montreal

Remote

CAD 100,000 - 130,000

Full time

5 days ago

Job summary

A rapidly scaling company, originating from UC Berkeley's AI Research Lab, seeks an MLOps professional to enhance their technology platform. This role focuses on maintaining and developing MLOps infrastructure, ensuring product stability, and supporting machine learning models. Ideal candidates will have extensive experience in AWS and a deep understanding of the machine learning model lifecycle. The position is fully remote, offering a competitive salary and benefits.

Benefits

Paid Medical, Dental, and Vision premiums
Monthly stipends for Education, Well-being, & WFH
Non-accrual PTO
Growth opportunities
Company retreats
Medical & Family/Parental Leave

Qualifications

  • 5–7 years of experience implementing MLOps at scale.
  • Deep expertise in AWS and machine learning model lifecycle.

Responsibilities

  • Manage sensitive data with secure architecture.
  • Deploy models handling live sensor data using Docker and Python.
  • Manage AWS infrastructure with Terraform and Kubernetes.

Skills

AWS
MLOps
Agile
Automation
Data Management

Tools

Docker
Python
PyTorch
Terraform
Kubernetes
Prometheus
Grafana

Job description

Location: REMOTE / Montreal, Quebec

Our client is rapidly scaling, having raised a $45 million USD Series C that brings its total financing to over $100 million. Spun out of UC Berkeley's AI Research Lab, the company develops artificial intelligence solutions to support care for individuals with Alzheimer’s disease, dementia, and other cognitive impairments.

Alzheimer’s disease is the most expensive disease in the US, costing approximately $600 billion annually in direct and indirect costs. It affects 1 in 3 people over 85 and 1 in 9 over 65. Their initial product focuses on reducing falls, the leading cause of hospitalization among those with dementia. Peer-reviewed results show up to 80% fewer falls, with an average reduction of 40%, and an 80% decrease in ER visits due to falls.

Your Role:

Guide and expand the technology platform by maintaining and developing the MLOps infrastructure, including experimentation, training, evaluation systems, and data management.

The ideal candidate will leverage operational skills and an architectural mindset to ensure product stability and accountability, facilitate rapid growth of machine learning models, and troubleshoot issues efficiently.

Responsibilities:
  • Security: Manage sensitive data with secure architecture, including data segmentation, just-in-time access, VPC networking, AWS IAM roles, zero-trust principles, and encryption both in transit and at rest.
  • R&D Enablement: Provide reliable environments for ML researchers using tools like Weights & Biases, NVIDIA/CUDA hardware, Spark or Ray for data processing, and visualization tools, ensuring low cycle times.
  • Model Deployment: Deploy models handling live sensor data using Docker, Python, PyTorch, and GStreamer. Monitor system performance with tools like Prometheus, Loki, Grafana, and OpsGenie.
  • Infrastructure & Stability: Manage AWS infrastructure with Terraform, Kubernetes (Helm), and EC2. Support systems such as Voxel51, MongoDB, Airflow, and vector databases like Qdrant, ensuring stability and observability.
  • Data Management: Securely store, catalog, and manage large volumes of sensor and cloud data, implementing effective data lifecycle and governance practices.
Must Have Skills:
  • 5–7 years of experience implementing MLOps at scale
  • Deep expertise in AWS, with practical knowledge of managed services vs. DIY EC2 solutions
  • Ability to balance infrastructure cost-efficiency with developer productivity
  • Comprehensive understanding of the machine learning model lifecycle
  • Experience maintaining stability across edge devices and server fleets
  • Broad knowledge of ML concepts, including LLM security, vector databases, and hardware optimization
  • Self-motivated, proactive, and autonomous
  • Familiarity with Agile methodologies like Scrum and Kanban
  • Passion for automation and eliminating toil through tooling and processes
  • Understanding of the dual focus of MLOps: experimentation and operational speed/reliability
Benefits:
  • Mission-driven company culture
  • Fully remote work environment
  • Competitive salary & benefits, including paid Medical, Dental, and Vision premiums
  • Monthly stipends for Education, Well-being, & WFH
  • Non-accrual PTO
  • Growth opportunities
  • Company retreats
  • Medical & Family/Parental Leave
