Overview
As a Lead MLOps Platform Engineer at JPMorgan Chase within the Accelerator, you are the heart of this venture, focused on getting smart ideas into the hands of our customers. You have a curious mindset, thrive in collaborative squads, and are passionate about new technology. By your nature, you are also solution-oriented, commercially savvy, and have a head for fintech. You will work in tribes and squads that focus on specific products and projects, and depending on your strengths and interests, you'll have the opportunity to move between them.
Responsibilities
- Design and develop a scalable ML platform to support model training, deployment, and monitoring
- Build and maintain infrastructure for automated ML pipelines, ensuring reliability and reproducibility while supporting different model frameworks and architectures
- Implement tools and frameworks for model versioning, experiment tracking, and lifecycle management
- Develop systems for monitoring model performance, addressing data drift and model drift
- Collaborate with data scientists and engineers to devise model integration/deployment patterns and best practices
- Optimize resource utilization for training and inference workloads
- Design and implement a framework for effective testing strategies (unit, component, integration, end-to-end, performance, champion/challenger, etc.)
- Ensure platform compliance with data privacy, security, and regulatory standards
- Mentor team members on platform design principles, coding practices, and implementation patterns that lead to high-quality, maintainable solutions
Qualifications
- Proficiency in coding in recent versions of Java and/or Python programming languages
- Experience with MLOps tools and platforms (e.g., MLflow, Amazon SageMaker, Google VertexAI, Databricks, BentoML, KServe, Kubeflow)
- Experience with cloud technologies (AWS/Azure/GCP), distributed systems, web technologies, and event-driven architectures
- Understanding of data versioning and ML model lifecycle management
- Hands-on experience with CI/CD tools (e.g., Jenkins, GitHub Actions, GitLab CI)
- Knowledge of infrastructure-as-code tools (e.g., Terraform, Ansible)
- Strong knowledge of containerization and orchestration tools (e.g., Docker, Kubernetes)
- Proficiency in operating, supporting, and securing mission-critical software applications
Preferred qualifications, capabilities and skills
- Exposure to cloud-native microservices architecture
- Familiarity with advanced AI/ML concepts and protocols, such as Retrieval-Augmented Generation (RAG), agentic system architectures, and Model Context Protocol (MCP)
- Familiarity with model serving frameworks (e.g., TensorFlow Serving, FastAPI)
- Exposure to feature stores (e.g., Feast, Databricks, Hopsworks, SageMaker, VertexAI)
- Previous experience deploying and managing ML models is beneficial
- Experience working in a highly regulated environment or industry