Senior MLOps Platform Architect (AWS | Kubernetes | Terraform)

theHRchapter

Málaga

Remote

EUR 50,000 - 70,000

Full-time

Posted today

Vacancy description

A leading HR solutions provider is seeking a Senior MLOps / DevOps engineer to design and build the infrastructure for AI platforms. The ideal candidate will have over 5 years of experience, particularly with AWS, Kubernetes, and CI/CD pipelines, and will work in a dynamic remote-first environment, focusing on innovative AI solutions.

Benefits

Competitive fixed compensation
20+ days paid time off
Apple gear
Training & development budget

Qualifications

  • 5+ years in a Senior DevOps, SRE, or MLOps Engineering role.
  • Strong experience with Kubernetes in production.
  • Hands-on expertise with Terraform for cloud infrastructure.

Responsibilities

  • Design and build AWS-based AI/ML infrastructure.
  • Architect and operate production Kubernetes clusters.
  • Build automated training and deployment pipelines.

Skills

AWS infrastructure management
Kubernetes cluster management
Terraform
CI/CD pipeline development
Python programming
ML model deployment

Tools

GitLab
Jenkins
Docker
Helm
MLflow
Grafana
Prometheus

Job description

Your Strategic Partner for HR, Payroll & Headhunting Solutions

🚀 We are hiring a senior MLOps/DevOps/SRE hybrid who can build an entire AI platform infrastructure end-to-end. This is not a research role and not a standard ML Engineer role. If you haven’t designed production-grade MLOps infrastructure, built CI/CD for ML, or deployed ML workloads on Kubernetes at scale, this role is not a fit.

You will design, build, and own the AWS-based infrastructure, Kubernetes platform, CI/CD pipelines, and observability stack that support our AI models (Agentic AI, NLU, ASR, Voice Biometrics, TTS). You will be the technical owner of MLOps infrastructure decisions, patterns, and standards.

Location: Remote - Europe (PL / ES / PT / CZ / CY)

Key Responsibilities
MLOps Platform Architecture (from scratch)
  • Design and build AWS-based AI/ML infrastructure using Terraform (required).
  • Define standards for security, automation, cost efficiency, and governance.
  • Architect infrastructure for ML workloads, GPUs/accelerators, scaling, and high availability.
Kubernetes & Model Deployment
  • Architect, build, and operate production Kubernetes clusters.
  • Containerize and productize ML models (Docker, Helm).
  • Deploy latency-sensitive and high-throughput models (ASR/TTS/NLU/Agentic AI).
  • Ensure GPU and accelerator nodes are properly integrated and optimized.
CI/CD for Machine Learning
  • Build automated training, validation, and deployment pipelines (GitLab/Jenkins).
  • Implement canary, blue-green, and automated rollback strategies.
  • Integrate MLOps lifecycle tools (MLflow, Kubeflow, SageMaker Model Registry, etc.).
Observability & Reliability
  • Implement full observability (Prometheus + Grafana).
  • Own uptime, performance, and reliability for ML production services.
  • Establish monitoring for latency, drift, model health, and infrastructure health.
Collaboration & Technical Leadership
  • Work closely with ML engineers, researchers, and data scientists.
  • Translate experimental models into production-ready deployments.
  • Define best practices for MLOps across the company.
Qualifications and Skills

We’re looking for a senior engineer with a strong DevOps / SRE background who has worked extensively with ML systems in production. The ideal candidate brings a combination of infrastructure, automation, and hands-on MLOps experience.

  • 5+ years in a Senior DevOps, SRE, or MLOps Engineering role supporting production environments.
  • Strong experience designing, building, and maintaining Kubernetes clusters in production.
  • Hands-on expertise with Terraform (or similar IaC tools) to manage cloud infrastructure.
  • Solid programming skills in Python or Go for building automation, tooling, and ML workflows.
  • Proven experience creating and maintaining CI/CD pipelines (GitLab or Jenkins).
  • Practical experience deploying and supporting ML models in production (e.g., ASR, TTS, NLU, LLM/Agentic AI).
  • Familiarity with ML workflow orchestration tools such as Kubeflow, Apache Airflow, or similar.
  • Experience with experiment tracking and model registry tools (e.g., MLflow, SageMaker Model Registry).
  • Exposure to deploying models on GPU or specialized hardware (e.g., Inferentia, Trainium).
  • Solid understanding of cloud infrastructure on AWS, including networking, scaling, storage, and security best practices.
  • Experience with deployment tooling (Docker, Helm) and observability stacks (Prometheus, Grafana).
Ways to Know You’ll Succeed
  • You enjoy building platforms from the ground up and owning technical decisions.
  • You’re comfortable collaborating with ML engineers, researchers, and software teams to turn research into stable production systems.
  • You like solving performance, automation, and reliability challenges in distributed systems.
  • You bring a structured, pragmatic, and scalable approach to infrastructure design.
  • You're energetic and proactive, with a natural drive to take initiative and move things forward.
  • You enjoy working closely with people: researchers, ML engineers, cloud architects, and product teams.
  • You're comfortable sharing ideas openly, challenging assumptions, and contributing to technical discussions.
  • You have a collaborative mindset: you like to build together, not work in isolation.
  • You have a strong ownership mentality and enjoy taking responsibility for systems end-to-end.
  • You're curious, hands-on, and motivated by solving complex technical challenges.
  • You're a clear communicator who can translate technical work into practical recommendations.
  • You thrive in a fast-paced environment where you can experiment, improve, and shape how things are done.
What we offer
  • Competitive fixed compensation based on experience and expertise.
  • Work on cutting-edge AI systems used globally.
  • Dynamic, multi-disciplinary teams engaged in digital transformation.
  • Remote-first work model
  • Long-term B2B contract
  • 20+ days paid time off
  • Apple gear
  • Training & development budget
Our core values at TheHRchapter
  • Transparency: We believe in transparent and smooth recruitment processes. You will get feedback from us.
  • Candidate experience: A blend of automated and humanized recruitment processes. Don't hesitate to ask us for feedback at any time.
  • Talent pool: We bring highly skilled, motivated candidates to our clients. Our candidates match their company values and management style.
  • Diversity and inclusion: There is no place for discrimination and intolerance. We care about diversity awareness and respect for all differences.