Stratum AI seeks a Machine Learning Ops Engineer for its Infrastructure Team. You will enhance a platform for training and serving AI models in the mining sector, requiring strong Python skills and MLOps expertise. This remote-first role targets applicants based in Canada, offering an opportunity to be part of a cutting-edge venture in mining technology.
We are looking for a high-agency Machine Learning Ops Engineer to join our Infrastructure Team. You will help build and maintain the platform used to train, evaluate, and serve our AI models to clients in the mining industry. Your work will directly support our Technical Services and Platform teams in delivering solutions that create value for mining clients.
This position requires strong expertise in Python and machine learning workflows. You will work alongside a team of three engineers focused on creating robust infrastructure and tooling.
This is a remote-first position, with a preference for applicants based in Canada.
Develop robust and well-tested code for core internal tools:
Create data preprocessing modules for mining data
Implement metrics calculations and evaluation pipelines
Build visualization tools for 3D models and ML performance metrics
Troubleshoot and fix issues in existing metrics code
Build and maintain our custom end-to-end MLOps platform:
Implement experiment tracking systems
Create model registry with versioning and storage
Develop automated testing frameworks
Build interfaces between different components of the ML pipeline
Develop production-grade QA/QC systems for deployed AI models:
Implement input data validation
Create automated alerts for performance issues
Set up monitoring for data drift
Build dashboards for model performance metrics
Create specialized tools for mining data:
Implement spatial data processing utilities
Build visualization tools for 3D geological data
Develop data converters between different mining data formats
Create utilities for coordinate transformations
Refactor and productionize code created by the client services team:
Convert notebooks into modular Python packages
Implement proper error handling and logging
Add comprehensive testing to existing code
Improve performance of data processing pipelines
Provide technical expertise to the client services team
Manage infrastructure for data processing, model training, and serving
Mentor junior engineers, perform code reviews, and write documentation
Proactively identify technical challenges and drive improvement initiatives
Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent experience in software development and ML engineering
3+ years of industry experience
Experience with Kubernetes and PyTorch
Advanced Python programming skills:
Proficiency with data science libraries (numpy, pandas)
Experience with visualization tools
Ability to write modular, robust, and tested Python code
Strong debugging skills for complex ML systems
Deep learning experience:
Implementation of neural network models and training workflows
Understanding of model architecture selection
Knowledge of model evaluation techniques
MLOps expertise:
Creating experiment tracking systems
Building model registries and versioning systems
Implementing model deployment pipelines
Setting up monitoring for model performance
Data engineering capabilities:
Experience with SQL and database principles
Familiarity with database frameworks
Ability to create data processing pipelines
Experience handling common mining data formats and transformations
Infrastructure management:
Experience with cloud services (AWS/Azure)
Understanding of containerization (Docker or Singularity)
Knowledge of compute resources for ML
Testing and quality assurance:
Implementing automated tests for ML systems
Creating QA/QC systems for model predictions
Designing validation steps for data inputs/outputs
Ability to write efficient software following best practices
Proven ability to thrive in startup environments with low structure and high autonomy
Strong technical communication skills and ability to collaborate in a remote team setting
Experience working with machine learning in computer vision, NLP, recommender systems, or scientific applications
Strong background in probability, machine learning, and data science
Strong experience with data analysis/processing libraries such as pandas and numpy
Excellent communication skills for both technical and non-technical audiences
Self-learner and motivated to pick up new skills
Previous experience working at startups
Familiarity with Git, experiment tracking tools (WandB, Comet, etc.)
Experience working on production machine learning using tools such as Kubeflow, MLflow, Airflow, Seldon Core, DVC, Spark, etc.
Written/oral fluency in a language besides English
Experience optimizing data processing pipelines and/or neural network models
Proficiency in a lower-level programming language or GPU programming
Experience with data application frameworks
Full stack development experience
Experience with experiment tracking systems and ML model monitoring
Background in mining or resource modeling
We're Stratum, a mining software company with machine learning models as our core product. Our 3D maps predict how gold, silver, copper, etc. are distributed (and how much!) using only small amounts of data, unconventional data processing, and proprietary ML protocols. Our work directly affects how much money a mine is going to make next week/month/year while reducing waste/cost. We're supported by Founders Fund, Aramco, Builders VC, Y Combinator, and Ilya Sutskever, former Chief Scientist at OpenAI, who have recognized the potential of our industry-disrupting technology.
Our long-term vision is to build a massive AI engine capable of making every decision in a mining operation, down to moving individual rocks. If you’re an exceptional engineer interested in helping make this vision a reality, we look forward to reviewing your application and working together.