Lead Software Engineer – Applied AI/ML Infrastructure – Cloud/Platform - (Kubernetes / AWS / Te[...]

J.P. Morgan

City Of London

On-site

GBP 70,000 - 90,000

Full time

Today

Job summary

A leading global bank in London is seeking a Lead Software Engineer to support AI/ML infrastructure. You will design and operate resilient systems, collaborate with engineers and data scientists, and ensure effective automation of operational tasks. Ideal candidates have AWS and Kubernetes experience, along with strong communication skills. Competitive compensation is offered.

Qualifications

  • Bachelor’s degree or higher in Computer Science, Engineering, or related field.
  • Proven hands-on experience with AWS services.
  • Strong experience with Kubernetes.

Responsibilities

  • Design, deploy, and operate application infrastructure using Amazon EKS/ECS.
  • Build and maintain foundational data storage infrastructure.
  • Automate infrastructure provisioning and management.

Skills

AWS services
Kubernetes
Infrastructure as Code (IaC)
Collaboration skills
Automation

Education

Bachelor's degree in Computer Science, Engineering, or related field

Tools

Terraform
Amazon EKS
Amazon S3
Aurora Postgres
OpenSearch

Job description

As a Lead Software Engineer at JPMorgan Chase within the Corporate and Investment Banking Applied Artificial Intelligence and Machine Learning team, you will play a pivotal role in transforming the operations of the world's largest bank. You will collaborate with software engineers, data scientists, and line of business teams to develop and sustain resilient infrastructure for AI/ML solutions.

Job Responsibilities

  • Design, deploy, and operate application infrastructure using Amazon EKS/ECS.
  • Build and maintain foundational data storage infrastructure, including Aurora Postgres, OpenSearch, and Amazon S3.
  • Deploy and operate open-source AI/ML software, ensuring scalability, security, and operational efficiency.
  • Automate infrastructure provisioning and management using Terraform, Helm, Spinnaker, and related tools.
  • Implement and uphold resiliency best practices, including defining and meeting SLAs/SLOs.
  • Monitor and manage controls and hygiene alerts to maintain compliance and operational excellence.
  • Lead initiatives to promote best practices in infrastructure engineering and DevOps.
  • Collaborate closely with SRE and production monitoring teams to ensure system reliability, performance, and rapid incident response.

Required Qualifications, Capabilities, and Skills

  • Bachelor’s degree or higher in Computer Science, Engineering, or a related field, or equivalent formal training/certification.
  • Proven hands-on experience with AWS services (Aurora/RDS, EKS, ECS, VPC, IAM, S3).
  • Strong experience with Kubernetes (Amazon EKS).
  • Proficiency with Infrastructure as Code (IaC) tools, such as Terraform.
  • Experience automating operational tasks and CI/CD workflows.
  • Demonstrated ability to design and operate resilient, scalable, and secure infrastructure in production environments.
  • Excellent communication skills, with the ability to convey technical information clearly and build trust with stakeholders at all levels.

Preferred Qualifications, Capabilities, and Skills

  • Practical experience deploying LLM-based applications into production and an understanding of MLOps.