Job Title: DevOps/MLOps (Seniors)
Job ID: 76633
Location: Vancouver, British Columbia
Overview: As a Senior DevOps/MLOps engineer on our Artificial Intelligence team, you will be responsible for the reliability and smooth operation of our environments, and you will build automation that improves the reliability and efficiency of code and machine learning model delivery from build to production. Challenge yourself by learning new technologies and applying your skills across our different projects and application domains. You will work in the field of MLOps, with tools unique to machine learning and artificial intelligence.
We use leading-edge technologies to deploy and manage the infrastructure that delivers highly scalable and available services. The role involves cross-team collaboration and communication; you will work closely with key stakeholders to ensure that product requirements are met. This is an opportunity to influence the design and implementation of systems at a scale that few engineers get to work at.
What you will be doing:
Automation: Develop tools & frameworks to enhance our CI (Continuous Integration) & CD (Continuous Delivery) automation using industry-standard CI/CD practices
Deployments: Use this CI/CD automation to deploy our services to Kubernetes
Operations: Monitor and ensure smooth operation of our services in various environments
Service Reliability: Occasionally provide support and initial troubleshooting, reviewing dashboards and logs so that system issues are addressed promptly
What you must have:
5+ years of experience in a DevOps/MLOps or similar role
Bachelor's degree in computer science or related field, or equivalent work experience
Understanding of computer science fundamentals like threading, OOP, and more
Understanding of software systems concepts such as networking, firewalls, protocols, databases, and more
Understanding of software delivery practices such as Git branching models, configuration management, secret rotation, feature toggling, no-downtime deployments, and more
Experience mentoring junior and intermediate DevOps/MLOps engineers
Strong organizational and communication skills
Experience with:
CI/CD tools such as Jenkins
Containerization and orchestration technologies such as Docker and Kubernetes
Python or other similar scripting languages
Databases such as CockroachDB, PostgreSQL
Distributed event streaming platforms such as Kafka
Instrumentation & Monitoring tools such as Splunk, Loki, Zabbix, or Prometheus
Package managers and artifact repositories such as Artifactory, npm
Nice-to-haves:
Artificial intelligence or machine learning tools such as MLflow, JupyterHub, DVC, TensorFlow
Supporting containerized GPU workloads in Kubernetes