About the role
As a Senior ML Platform Engineer, you will take ownership of designing, building, and maintaining the infrastructure that powers our Generative AI framework. You will architect scalable, secure ML environments across development, testing, and production, while automating the entire ML lifecycle—from data ingestion and model training to deployment and monitoring.
Your Responsibilities
- Design, build, and maintain scalable and secure ML infrastructure across development, testing, and production environments
- Automate and optimize the ML lifecycle, from data ingestion and model training to deployment and monitoring
- Architect and manage the continuous integration and deployment pipelines and release processes using tools such as Kubeflow, MLflow, or custom Kubernetes solutions
- Implement monitoring systems for data drift, model performance, and infrastructure health
- Build and enhance ML engineering tooling for model development, model workbench, model training, model monitoring, and model serving
- Work closely with data scientists and ML engineers to ensure reproducibility, scalability, and production-readiness of models
- Design and maintain pipelines for feature extraction, transformation, and storage using a feature store or custom solutions
- Ensure data quality, consistency, and lineage throughout the ML pipeline
- Ensure responsible use of data, model explainability, and auditability in line with organizational and legal standards
Qualifications
- Bachelor’s or Master’s degree in Computer Science, Engineering, or related field
- 5+ years in DevOps/software engineering/infrastructure, with 2–3 years in MLOps or production ML systems
- Proficiency in Python; Bash or Go is a plus
- Experience with model versioning, monitoring, and lifecycle management
- Strong experience with ML tooling (MLflow, Kubeflow, Airflow, SageMaker, etc.)
- Experience with Infrastructure-as-Code (Terraform; Pulumi is a plus) and CI/CD pipelines (GitLab CI, Jenkins, ArgoCD)
- Cloud platforms (AWS, GCP, or Azure)
- Proven track record with GPU-accelerated systems at scale
- Experience with cluster/cloud compute technologies (SLURM, Lustre, k8s)
- Benchmarking expertise (software & hardware)
- Strong problem-solving and debugging skills, with the ability to lead infrastructure initiatives independently
- Excellent communication and cross‑functional collaboration
What we offer
- A dynamic and diverse team in which your contributions are reflected directly in our products and used by our international customer base
- Flat hierarchies and short decision‑making processes
- Exciting and varied tasks across our product portfolio
- Excellent working environment, modern office space, and flexible working hours with the option of mobile working
We are proud of our diversity and welcome your application regardless of gender and sexual identity, nationality, ethnicity, religion, age, or disability.