Senior Site Reliability Engineer - AI Platform

ZipRecruiter

Berlin

On-site

EUR 70,000 - 100,000

Full-time

Posted 5 days ago

Summary

A leading company in digital banking is looking for a Senior Site Reliability Engineer to enhance their AI Platform infrastructure. This role involves technical leadership in cloud solutions, collaboration across teams, and a focus on mentoring. The ideal candidate will have substantial experience in cloud infrastructure and a passion for innovative AI solutions. Benefits include career growth opportunities, workplace flexibility, and support for diversity and inclusion.

Benefits

Career growth opportunities
Personal development budgets
Work-from-home allowances
Wellness discounts
Relocation support
Visa assistance

Qualifications

  • Hands-on experience in cloud infrastructure, specifically AWS.
  • Expertise in AI/ML workload orchestration.
  • Familiarity with tools like AWS SageMaker and GitHub Actions.

Responsibilities

  • Design and implement platform solutions for reliability and scalability.
  • Provide technical leadership for AI and MLOps workloads.
  • Mentor team members and drive incident management.

Skills

Cloud infrastructure design
AI/ML orchestration
Infrastructure as code (Terraform, CloudFormation)
Python
Networking best practices
MLOps tools
CI/CD pipelines

Job Description

About the opportunity

We are seeking a Senior Site Reliability Engineer to join the Platform Engineering Domain in the AI Platform Team.

The mission of Platform Engineering is to provide trusted, performant, self-service platforms that empower product teams to build 'the bank the world loves to use.' The AI Platform team contributes to this mission by creating scalable, secure, and compliant infrastructure solutions that support MLOps and GenAI capabilities.

The ideal candidate is a seasoned SRE expert ready to apply their skills to AI infrastructure challenges and an enthusiastic learner eager to grow with a team pioneering cutting-edge platform solutions. If you thrive where expertise meets curiosity, mentorship, and innovation, we'd love to hear from you.

Who we are

N26 has reimagined banking for today's digital world. Technology and design power everything we do as we build the global banking platform the world loves to use.

We've eliminated physical branches, paperwork, and hidden fees to offer an elegant digital experience and savings. Giving people the power to live and bank their way inspires our work.

Headquartered in Berlin with offices across Europe, including Vienna and Barcelona, our 1,500-strong team is diverse and innovative.

Responsibilities

  1. Design, develop, and implement platform solutions to enhance reliability, security, and scalability of the AI Platform infrastructure.
  2. Provide technical leadership in cloud infrastructure, networking, CI/CD, and security for AI and MLOps workloads.
  3. Collaborate with Data Scientists, ML Engineers, and Product Teams for seamless model deployment and operational efficiency.
  4. Mentor and coach team members, fostering knowledge sharing and continuous improvement.
  5. Shape team strategy, roadmap, and architecture.
  6. Drive incident management and troubleshooting to ensure a stable AI environment.
  7. Improve observability and monitoring to meet performance and compliance standards.

Qualifications

  • Hands-on experience in designing, implementing, and maintaining cloud infrastructure, especially in AWS.
  • Expertise in orchestration for AI/ML workloads.
  • Experience with infrastructure as code (Terraform, CloudFormation, etc.).
  • Proficiency in Python or similar programming languages.
  • Knowledge of networking and security best practices in cloud environments.
  • Familiarity with MLOps tools (e.g., AWS SageMaker, Kubeflow, MLflow).
  • Experience with CI/CD pipelines (GitHub Actions, Jenkins, ArgoCD).

Nice to have

  • Experience with AI/ML production systems and scaling challenges.
  • Understanding of compliance and governance in AI/ML platforms.
  • Familiarity with observability tools (DataDog, Prometheus, Grafana).

Traits

  • Excellent collaboration and communication skills.
  • Strong ownership and proactive problem-solving.
  • Passion for building scalable, secure AI infrastructure.
  • Eagerness to learn and contribute to AI platform evolution.

Additional benefits include career growth opportunities, personal development budgets, work-from-home allowances, wellness discounts, and a supportive, diverse team environment. We also offer relocation support and visa assistance if needed.

Equal Opportunities

We value diversity and are committed to creating an inclusive environment for all applicants and employees. We encourage applications from all backgrounds and abilities and are dedicated to ensuring a respectful, harassment-free workplace.

Learn more about our commitment to equity and inclusion on our website.