Senior MLOps Engineer [UAE Based]

ZipRecruiter

London

On-site

GBP 70,000 - 90,000

Full time


Job summary

A leading company is seeking a Senior MLOps Engineer to lead the development and management of infrastructure for training and deploying ML models. The role involves collaboration with cross-functional teams to integrate machine learning models into scalable production pipelines. Candidates should have a strong background in MLOps and cloud services, particularly AWS.

Benefits

Relocation Package
Competitive Compensation Package
Lifestyle Benefits

Qualifications

  • Minimum 5 years of experience in MLOps or ML infrastructure.
  • Proven experience in deploying large-scale models in production.

Responsibilities

  • Lead the deployment and scaling of LLMs and other deep learning models.
  • Design and maintain automated pipelines for model finetuning and delivery.

Skills

MLOps
Machine Learning Engineering
Collaboration
Programming

Education

Bachelor’s or Master’s degree in Computer Science

Tools

MLflow
Kubeflow
SageMaker
Docker
Kubernetes

Job description

Role: Senior MLOps Engineer

Location: Abu Dhabi, UAE (Full Relocation Provided)

Company: AI71

About Us

AI71 is an applied research team committed to building responsible and impactful AI agents that empower knowledge workers. In partnership with the Technology Innovation Institute (TII), we drive innovation through cutting-edge AI research and development. Our mission is to translate breakthroughs in machine learning into transformative products that reshape industries.

Senior MLOps Engineer

AI71 is seeking a Senior MLOps Engineer to lead the development and management of our infrastructure for training, deploying, and maintaining ML models. This role is critical to operationalizing state-of-the-art systems and ensuring high-performance delivery across research and production environments.

The successful candidate will design and implement infrastructure to support efficient model deployment, inference, monitoring, and retraining. This includes close collaboration with cross-functional teams to integrate machine learning models into scalable and secure production pipelines, enabling the delivery of real-time, data-driven solutions across various domains.

Key Responsibilities

• Model Deployment: Lead the deployment and scaling of LLMs and other deep learning models using inference engines such as vLLM, Triton, or TGI, ensuring optimal performance and reliability.

• Pipeline Engineering: Design and maintain automated pipelines for model fine-tuning, evaluation, versioning, and continuous delivery using tools like MLflow, SageMaker Pipelines, or Kubeflow.

• Infrastructure Management: Architect and manage cloud-based, cost-effective infrastructure for machine learning workloads using AWS (SageMaker, EC2, EKS, Lambda) or equivalent platforms.

• Performance Optimization: Implement monitoring, logging, and optimization strategies to meet latency, throughput, and availability requirements across ML services.

• Collaboration: Work closely with ML researchers, data scientists, and engineers to support experimentation workflows, streamline deployment, and translate research prototypes into production-ready solutions.

• Automation & DevOps: Develop infrastructure-as-code (IaC) solutions to support repeatable, secure deployments and continuous integration/continuous delivery (CI/CD) for ML systems.

• Model Efficiency: Apply model optimization techniques such as quantization, pruning, and multi-GPU/distributed inference to enhance system performance and cost-efficiency.

Qualifications

• Professional Experience: Minimum 5 years of experience in MLOps, ML infrastructure, or machine learning engineering, with a strong record of managing end-to-end ML model lifecycles.

• Deployment Expertise: Proven experience in deploying large-scale models in production environments with advanced inference techniques.

• Cloud Proficiency: In-depth expertise in cloud services (preferably AWS), including infrastructure management, scaling, and cost optimization for ML workloads.

• Programming Skills: Strong programming proficiency in Python, with additional experience in C/C++ for performance-sensitive applications.

• Tooling Knowledge: Proficiency in MLOps frameworks such as MLflow, Kubeflow, or SageMaker Pipelines; familiarity with Docker and Kubernetes.

• Optimization Techniques: Hands-on experience with model performance optimization techniques and distributed training frameworks (e.g., DeepSpeed, FSDP, Accelerate).

• Educational Background: Bachelor’s or Master’s degree in Computer Science, Machine Learning, Data Engineering, or a related technical field.

Why Join AI71?

• Advanced Technology Stack: Work with some of the most capable large models and cutting-edge ML infrastructure.

• High-Impact Work: Contribute directly to the deployment of AI solutions that deliver measurable business value across industries.

• Collaboration-Driven Environment: Engage with a high-performing, interdisciplinary team focused on continuous innovation.

• Robust Infrastructure: Access high-performance compute resources to support experimentation and scalable deployment.

• Relocation Package: Full support for relocation to Abu Dhabi, with a competitive compensation package and lifestyle benefits.
