Senior MLOps Engineer

Entrust Datacard

United Kingdom

Hybrid

GBP 70,000 - 90,000

Full time

Job summary

A leading identity security solutions provider seeks an experienced MLOps Engineer in the UK to enhance its identity-verification platforms. The role involves evolving a multi-tenant ML compute layer on AWS and Kubernetes and engaging with users to improve the developer experience. Candidates should have strong Python skills and experience building data pipelines. This hybrid position offers a dynamic environment for innovation.

Benefits

Annual leave
Volunteering days
Health insurance

Qualifications

  • Must value developer experience and engage with users.
  • Experience operating GPU workloads is required.
  • Solid understanding of performance and concurrency.

Responsibilities

  • Evolve the ML compute layer on Kubernetes for multi-tenant workloads.
  • Operate Argo Workflows and Dask Gateway as self-service tools.
  • Build GitOps-native delivery for ML jobs with fast rollouts.

Skills

Developer experience
Production experience with AWS
Kubernetes (EKS) experience
Proficiency in Python
Building data pipelines
Networking and security knowledge

Tools

Terraform
Docker
GitLab CI/CD

Job description

Join us at Entrust

At Entrust, we’re shaping the future of identity-centric security solutions. From our comprehensive portfolio of solutions to our flexible, global workplace, we empower careers, foster collaboration, and build solutions that help keep the world moving safely.

Get to Know Us

Headquartered in Minnesota, Entrust is an industry leader in identity-centric security solutions, serving over 150 countries with cutting-edge, scalable technologies. But our secret weapon? Our people. It’s the curiosity, dedication, and innovation that drive our success and help us anticipate the future.

Senior MLOps Engineer

About the position

We’re looking for an experienced MLOps Engineer to build and operate the platform and tooling that powers our identity-verification products. You’ll join a team supporting Applied Scientists and Machine Learning Engineers across multiple countries. Our mission is to accelerate the path from ML research to production by building intuitive platform abstractions that let engineers focus on model innovation rather than infrastructure complexity.

Responsibilities:

  • Run and evolve our ML compute layer on Kubernetes/EKS (CPU/GPU) for multi-tenant workloads, and make workloads portable across regions (region-aware scheduling, cross-region data access, and artifact portability).
  • Operate Argo Workflows and Dask Gateway as reliable, self-serve services used by engineers and researchers to orchestrate data prep, training, evaluation, and large-scale batch compute (installation, upgrades, security, quotas, autoscaling); see the sketch after this list.
  • Build GitOps-native delivery for ML jobs and platform components (GitLab CI, Helm, FluxCD) with fast rollouts and safe rollbacks.
  • Design and maintain our data platform built on LakeFS to enable experiment reproducibility, data lineage tracking, and automated governance processes.
  • Own developer experience and enablement by creating clear APIs/CLIs and minimal UIs, and maintaining comprehensive templates and documentation.
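
For illustration only: a minimal sketch of the self-serve Dask Gateway pattern described in the second bullet above, assuming a gateway endpoint exposed by the platform team. The address and worker count are invented for the example and are not taken from this posting.

  from dask_gateway import Gateway

  # Connect to the gateway operated by the platform team (address is hypothetical).
  gateway = Gateway("https://dask-gateway.example.internal")

  # Request an on-demand cluster and scale it to a few workers; real options
  # (worker size, GPU flags) depend on what the platform exposes.
  cluster = gateway.new_cluster()
  cluster.scale(4)

  # Get a Dask client and run a small batch computation on the cluster.
  client = cluster.get_client()
  total = client.submit(sum, range(10)).result()
  print(total)

  # Release the resources when the job is done.
  cluster.shutdown()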

Requirements:

  • You value developer experience and enjoy talking to users (engineers/scientists), removing friction, and treating the platform like a product.
  • Production experience with AWS and Kubernetes (EKS), including GPU workloads.
  • Proficiency in Python (e.g., FastAPI/Django) and solid CS fundamentals (performance, concurrency, data structures).
  • Experience building/operating data pipelines (idempotency, retries, backfills, reproducibility); see the sketch after this list.
  • Working knowledge of Terraform, Helm, Docker, Git, and GitLab CI/CD.
  • Observability experience with Prometheus/Grafana and logs (e.g., Loki/Promtail or Splunk/Sentry), plus sensible alerting.
  • Good grasp of networking and security concepts and Linux systems administration.
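
For illustration only: a small sketch of the idempotency, retry, and backfill ideas named in the data pipelines bullet above. The run_step, process_partition, and output_exists names are hypothetical stand-ins, not part of any real pipeline described here.

  import time

  def run_step(process_partition, output_exists, partition_date, max_retries=3):
      """Run one pipeline step idempotently: skip work whose output already exists
      (safe re-runs and backfills) and retry transient failures with backoff."""
      if output_exists(partition_date):
          return  # output already produced for this partition; nothing to do
      for attempt in range(1, max_retries + 1):
          try:
              process_partition(partition_date)  # writes the output for this partition
              return
          except Exception:
              if attempt == max_retries:
                  raise
              time.sleep(2 ** attempt)  # exponential backoff before retrying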

Nice to have:

  • Experience with distributed compute frameworks such as Dask, Spark, or Ray.
  • Familiarity with NVIDIA Triton or other inference servers.
  • FinOps best practices and cost attribution for multi-tenant ML infrastructure.
  • Exposure to multi-region designs (dataset replication strategies, compute placement, and latency optimization).

Tech stack & environment:

  • Container Orchestration: Kubernetes (EKS)
  • Compute: Argo Workflows for orchestration and Dask for distributed computing
  • ML Experiment Tracking: Weights & Biases
  • Data (Lakehouse & Versioning): Apache Iceberg + AWS Athena, LakeFS, Snowflake
  • CI/CD & GitOps: GitLab CI, Helm, FluxCD
  • Infrastructure as Code: Terraform
  • Observability: Prometheus/Grafana, Loki/Promtail, Datadog, Sentry
  • Languages & Libraries: Python (Django, FastAPI, Pydantic, boto3)
  • AWS Services: S3, EC2, RDS/PostgreSQL, ECR, IAM, Lambda, Step Functions

Locations:

  • Paris, France - Hybrid or Remote
  • London, UK - Hybrid

Benefits:

Entrust offers a range of benefits, including annual leave, volunteering days, meal vouchers, health insurance, and more. For more information, please visit our website.

Entrust is an EEO/AA/Disabled/Veterans Employer

Entrust values diversity and inclusion and we are committed to building a diverse workforce with wide perspectives and innovative ideas. We welcome applications from qualified individuals of all backgrounds, and we strive to provide an accessible experience for candidates of all abilities.
