ML Engineer - Infrastructure

Convergence

London

On-site

GBP 60,000 - 100,000

Full time

3 days ago

Job summary

Convergence is seeking a Machine Learning Engineer to design and implement ML infrastructure. The role involves collaborating with data scientists and ML engineers to build robust solutions that enhance productivity and creativity. It offers opportunities for growth and work on impactful projects as part of a team focused on transforming how AI integrates into daily life.

Benefits

Professional growth opportunities
Collaborative work environment
Competitive salary and benefits

Qualifications

  • 3+ years of experience in ML infrastructure or platform engineering.
  • Strong proficiency in Python for ML pipeline automation.
  • Extensive experience with Slurm cluster management.

Responsibilities

  • Design and maintain ML-focused cloud infrastructure on GCP.
  • Build and manage HPC clusters for distributed ML workloads.
  • Implement monitoring solutions for ML model performance.

Skills

Machine Learning Infrastructure
Python
HPC Cluster Management
Monitoring and Logging
Problem Solving

Tools

Terraform
Slurm
GCP

Job description

At Convergence, we're transforming the way AI integrates into our daily lives. Our team is developing the next generation of AI agents that don't just process information but take actions, learn from experience, and collaborate with humans. By introducing Large Meta Learning Models (LMLMs) that integrate memory as a core component, we're enabling AI to improve continuously through user feedback and acquire new skills during real-time use.

We believe in freeing individuals and businesses from mundane, repetitive tasks, allowing them to focus on innovative and creative work that truly matters. Our personalised AI assistant, Proxy, collaborates with users to enhance productivity and creativity. With $12 million in pre-seed funding from Balderton Capital, Salesforce Ventures, and Shopify Ventures, we're poised to make a significant impact in the AI space. Join us in shaping the future of human-AI collaboration and be part of our mission to transform the AI landscape.

Responsibilities

  • Design, implement, and maintain our ML-focused cloud infrastructure on GCP using Infrastructure as Code (Terraform)
  • Build and manage HPC clusters with Slurm for distributed ML workloads, focusing on GPU/TPU utilization and job scheduling
  • Develop and maintain ML pipeline automation tools and ML-specific CI/CD workflows in Python
  • Design and optimize data storage solutions for ML datasets, model artifacts, and feature stores
  • Implement comprehensive monitoring, logging, and alerting solutions for ML model performance and infrastructure health
  • Collaborate with ML engineers and data scientists to provide robust infrastructure for model training and deployment
  • Lead and implement security best practices for ML systems, including model security and data protection

Requirements

  • 3+ years of experience in ML infrastructure or ML platform engineering
  • Strong proficiency in Python for ML pipeline automation and tooling
  • Extensive experience with Slurm cluster management for large-scale ML workloads
  • Proven track record with Terraform and Infrastructure as Code for ML environments
  • Solid understanding of GCP's ML-specific services (Vertex AI, AI Platform, etc.)
  • Experience with distributed training systems and model serving infrastructure
  • Experience with ML observability tools and performance monitoring
  • Excellent problem-solving skills with a focus on ML system reliability and optimization

Bonus Qualifications

  • Knowledge of ML-specific orchestration tools (e.g., MLflow, Ray)
  • Experience with high-performance computing for ML training
  • Contributions to ML infrastructure-related open-source projects
  • Experience with GPU/TPU cluster management and optimization
  • Background in ML operations (MLOps) or AI reliability engineering
  • Familiarity with vector databases and efficient embedding storage/retrieval

Why Join Us?

  • Be at the cutting edge of AI and LLM technology
  • Work on challenging problems that impact users' daily lives
  • Collaborative and innovative work environment
  • Opportunities for professional growth and learning
  • Competitive salary and benefits package

Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Information Technology
Industries: Research Services

Similar jobs

Member of Technical Staff, Agent Infrastructure Engineer

Cohere

London

Remote

GBP 50,000 - 90,000

30+ days ago

Member of Technical Staff, Training Infra Engineer

Cohere

London

Remote

GBP 50,000 - 90,000

30+ days ago

29136 - 3rd Line Infrastructure Support Engineer

TN United Kingdom

Basingstoke

On-site

GBP 80,000 - 100,000

10 days ago

⚙️ Infrastructure Engineer London, UK

Granola inc

London

On-site

GBP 60,000 - 80,000

28 days ago

Senior Software Engineer (Infrastructure)

ZipRecruiter

London

On-site

GBP 70,000 - 90,000

27 days ago

Lead Machine Learning Engineer (Agentic Infrastructure)

JR United Kingdom

Slough

Hybrid

GBP 70,000 - 100,000

11 days ago

Lead Machine Learning Engineer (Agentic Infrastructure)

JR United Kingdom

London

Hybrid

GBP 70,000 - 100,000

19 days ago

ML Infrastructure Engineer

Millennium

London

On-site

GBP 60,000 - 100,000

30+ days ago

PRINCIPAL MACHINE LEARNING INFRASTRUCTURE ENGINEERS-AEROSPACE AND DEFENSE

Gentrian

London

On-site

GBP 70,000 - 120,000

30+ days ago