Overview
As a Lead MLOps Platform Engineer at JPMorgan Chase within the Accelerator, you are the heart of this venture, focused on getting smart ideas into the hands of our customers. You have a curious mindset, thrive in collaborative squads, and are passionate about new technology. By your nature, you are also solution-oriented, commercially savvy, and have a head for fintech. You will work in tribes and squads that focus on specific products and projects, and depending on your strengths and interests, you'll have the opportunity to move between them.
Responsibilities
- Design and develop a scalable ML platform to support model training, deployment, and monitoring
- Build and maintain infrastructure for automated ML pipelines, ensuring reliability and reproducibility while supporting different model frameworks and architectures
- Implement tools and frameworks for model versioning, experiment tracking, and lifecycle management
- Develop systems for monitoring model performance, addressing data drift and model drift
- Collaborate with data scientists and engineers to devise model integration/deployment patterns and best practices
- Optimize resource utilization for training and inference workloads
- Design and implement a framework for effective testing strategies (unit, component, integration, end-to-end, performance, champion/challenger, etc.)
- Ensure platform compliance with data privacy, security, and regulatory standards
- Mentor team members on platform design principles, coding practices, and implementation patterns that lead to high-quality, maintainable solutions
Qualifications
- Proficiency in coding in recent versions of Java and/or Python programming languages
- Experience with MLOps tools and platforms (e.g., MLflow, Amazon SageMaker, Google VertexAI, Databricks, BentoML, KServe, Kubeflow)
- Experience with cloud technologies (AWS/Azure/GCP), distributed systems, web technologies, and event-driven architectures
- Understanding of data versioning and ML model lifecycle management
- Hands-on experience with CI/CD tools (e.g., Jenkins, GitHub Actions, GitLab CI)
- Knowledge of infrastructure-as-code tools (e.g., Terraform, Ansible)
- Strong knowledge of containerization and orchestration tools (e.g., Docker, Kubernetes)
- Proficiency in operating, supporting, and securing mission-critical software applications
Preferred qualifications, capabilities and skills
- Exposure to cloud-native microservices architecture
- Familiarity with advanced AI/ML concepts and protocols, such as Retrieval-Augmented Generation (RAG), agentic system architectures, and Model Context Protocol (MCP)
- Familiarity with model serving frameworks (e.g., TensorFlow Serving, FastAPI)
- Exposure to feature stores (e.g., Feast, Databricks, Hopsworks, SageMaker, VertexAI)
- Previous experience deploying and managing ML models is beneficial
- Experience working in a highly regulated environment or industry