A leading AI solutions provider in the UK is seeking a high-caliber MLOps Engineer to design, automate, and scale end-to-end ML model lifecycles in production. You will own the deployment, monitoring, and governance of models, bridging data science and engineering to deliver secure, high-performance AI solutions at enterprise scale. Candidates should have 5-8 years of experience in MLOps, with proficiency in ML lifecycle tools, containerization, and cloud-native deployments. This role offers the opportunity to work with cutting-edge technologies in a collaborative environment.
Core Responsibilities
Architect & manage ML model deployment pipelines using MLflow, Kubeflow, or similar frameworks.
Build CI/CD pipelines tailored for ML workloads (GitHub Actions, Jenkins, GitLab CI).
Containerize and orchestrate ML services using Docker & Kubernetes (EKS, AKS, GKE).
Integrate models into cloud ML platforms (AWS SageMaker, Azure ML, GCP Vertex AI).
Implement model monitoring for accuracy, drift detection, and retraining automation.
Establish feature stores and integrate with data pipelines (Feast, Tecton, Hopsworks).
Ensure compliance with AI governance, security, and responsible AI practices.
Optimize inference performance and reduce serving latency for large-scale deployments.
Collaborate with cross-functional teams to translate ML research into production-grade APIs.
Required Qualifications
5-8+ years of experience in MLOps, ML engineering, or DevOps roles.
Deep knowledge of ML lifecycle management tools (MLflow, Kubeflow, SageMaker Pipelines).
Strong containerization skills (Docker, Kubernetes) with production deployments.
Cloud-native ML deployment experience (SageMaker, Azure ML, Vertex AI).
Proficiency in Python, Bash, and scripting for automation.
Familiarity with IaC tools (Terraform, CloudFormation, ARM).
Strong grasp of CI/CD for ML and data versioning (DVC, Delta Lake).
Understanding of monitoring & observability tools (Prometheus, Grafana, ELK, OpenTelemetry).
Preferred Qualifications
Cloud certifications (AWS, Azure, GCP).
Experience with model explainability tools (SHAP, LIME).
Exposure to deep learning deployment (TensorFlow Serving, TorchServe, ONNX).
Knowledge of API frameworks (FastAPI, Flask) for ML inference services.
Experience with real-time streaming integrations (Kafka, Kinesis, Pub/Sub).