Senior DevOps Engineer – Remote

Replika

Berlin

Remote

EUR 70,000 - 100,000

Full-time

10 days ago

Summary

A leading company in AI technology seeks a Senior DevOps Engineer for its remote-first team. This high-impact role focuses on building reliable cloud infrastructure to support a rapidly growing AI platform. Responsibilities include designing scalable systems and collaborating across engineering teams. Ideal candidates will have a passion for AI and extensive experience in DevOps tools and practices.

Benefits

Competitive compensation
Freedom to work remotely
Team offsites in various countries
High-responsibility environment

Qualifications

  • 5+ years of hands-on experience in DevOps and cloud infrastructure.
  • Strong expertise in multi-cloud and hybrid environments.
  • Experience with MLOps tooling and CI/CD pipelines.

Responsibilities

  • Design and maintain scalable infrastructure for AI applications.
  • Automate deployment and monitoring using modern DevOps tools.
  • Collaborate with AI teams to streamline CI/CD pipelines.

Skills

DevOps
Cloud Infrastructure
Site Reliability Engineering
MLOps
CI/CD Pipelines
Containerization
Orchestration
Monitoring AI Systems
GPU Clusters
Communication

Tools

AWS
GCP
Docker
Kubernetes
MLflow
Kubeflow
DataRobot

Job Description

An AI companion who is eager to learn and would love to see the world through your eyes. Replika is always ready to chat when you need an empathetic friend.

About Replika

Replika is an AI companion loved by 35M+ users worldwide. We're redefining what it means to connect with technology - emotionally, intelligently, and personally. From mobile to VR, we're building an experience that feels less like software and more like someone who gets you. Our team is mission-first, future-facing, and here to create something wonderful. We value agency, room for magic, and a relentless pursuit of good.

About the Role

We're looking for a Senior DevOps Engineer to join our globally distributed, remote-first team. This is a hands-on, high-impact role for someone who thrives in a fast-paced environment and is passionate about building scalable, reliable, and secure infrastructure for cutting-edge AI applications. You'll work closely with engineering, AI, and analytics teams to ensure our platform is robust, performant, and ready to support millions of users around the world.

What You'll Be Doing

  • Design, build, and maintain scalable infrastructure across cloud, on-premises, and hybrid environments to support our rapidly growing AI platform.
  • Support AI teams and MLOps workflows by implementing specialized tooling, monitoring, and deployment pipelines for machine learning models.
  • Automate deployment, monitoring, and scaling of services using modern DevOps tools and practices across diverse infrastructure environments.
  • Ensure high availability, reliability, and security of production and staging environments in multi-cloud and hybrid setups.
  • Collaborate with AI and backend engineers to streamline CI/CD pipelines optimized for ML workflows and bring new features to production.
  • Monitor system performance and troubleshoot issues proactively, implementing solutions to prevent downtime across distributed infrastructure.
  • Drive infrastructure as code (IaC) initiatives to improve repeatability and reduce manual intervention across all deployment environments.
  • Implement and maintain monitoring, logging, and alerting systems specifically designed for AI workloads and model performance tracking.
  • Participate in on-call rotations and respond to production incidents with a deep understanding of AI system requirements.

Who You Are

  • 5+ years of hands-on experience in DevOps, cloud infrastructure, or site reliability engineering.
  • Strong expertise in multi-cloud and hybrid infrastructure including AWS, GCP, and on-premises environments.
  • Experience with MLOps tooling such as MLflow, Kubeflow, DataRobot, or similar platforms for ML lifecycle management.
  • Experience with containerization and orchestration (Docker, Kubernetes) specifically for ML workloads and GPU clusters.
  • Deep understanding of CI/CD pipelines for machine learning applications and model deployment automation.
  • Experience with specialized monitoring tools for AI systems including model performance tracking, data drift detection, and ML-specific alerting.
  • Understanding of GPU clusters, HPC environments, and specialized AI hardware deployment and management.
  • Excellent communication skills in English (B2 or higher preferred), with the ability to translate technical concepts for stakeholders.
  • Passion for AI and technology, with deep curiosity about machine learning infrastructure and emerging AI technologies.

Bonus Points

  • Background in supporting data science teams and understanding of ML experimentation workflows.
  • Experience with edge computing and distributed AI inference infrastructure.
  • Previous startup experience building and scaling AI infrastructure from the ground up.
  • Knowledge of AI compliance and governance frameworks for production AI systems.

What You’ll Get

  • Competitive compensation
  • A chance to build a product that actually matters to millions of people
  • Freedom to work remotely with a globally distributed team
  • Offsites in different countries with people who actually like each other
  • A trustworthy, high-responsibility environment where your ideas really matter