SRE Engineer

RingCentral

Valencia

On-site

EUR 50,000 - 70,000

Full-time

Posted 24 days ago

Job summary

A global tech company is looking for a Site Reliability Engineer to manage cloud infrastructure and ensure reliable platform performance. Key responsibilities include overseeing AWS and EKS infrastructure, participating in software performance analysis, and implementing security best practices. Ideal candidates should have experience with Kubernetes, Terraform, and CI/CD pipelines. This position requires on-site presence at the office four days a week and offers a comprehensive benefits package including health insurance and vacation days.

Benefits

Additional Health and Life Insurance Package
Employee Assistance Program
25 vacation days

Qualifications

  • Experience with cloud-native services and architectures.
  • Hands-on experience with Kubernetes and IaC using Terraform.
  • Strong problem-solving skills in distributed systems.

Responsibilities

  • Manage cloud infrastructure on AWS and EKS.
  • Participate in service capacity planning and software performance analysis.
  • Implement security best practices and controls.

Skills

AWS
Kubernetes
Terraform
CI/CD pipelines
Monitoring tools
Problem-solving

Tools

GitLab CI
Redis
PostgreSQL

Job description

Position Overview

As a Site Reliability Engineer for RingCentral Events, you're not just an infrastructure owner: you're a crucial part of our mission to deliver flawless, high-scale experiences for global audiences. Your role is central to our ability to deliver a reliable and performant platform. You will be a key contributor to our software delivery flow, ensuring that changes move from development to production with speed, safety, and consistency. Additionally, you will proactively eliminate observability gaps and build self-healing infrastructure so that our system performs under pressure.

Responsibilities
  • Manage cloud infrastructure on AWS and EKS, leveraging IaC and GitOps to ensure scalability

  • Participate in service capacity planning, software performance analysis, and system tuning

  • Design, advise on, re-platform, and refactor observability for the current cloud infrastructure

  • Participate in release management, working closely with engineering teams to bring GitOps principles to our release process and manage CI/CD pipelines using GitLab CI

  • Take part in 24/7 on-call responsibilities (~2 days/month based on rotation schedule) to ensure continuous availability and quick response to issues in production

  • Conduct blameless post-mortems to learn from incidents and prevent future ones

  • Develop and test disaster recovery plans and runbooks to ensure business continuity

  • Implement security best practices and controls within the infrastructure to meet compliance standards and prepare for audits

Requirements
  • Familiarity with cloud-native services and architectures, plus experience with cloud providers (our infrastructure is built on AWS)

  • Experience running mission-critical services at scale without disruption

  • Hands-on experience with Kubernetes and infrastructure as code (IaC) using Terraform, focusing on scalability and efficient infrastructure management

  • Proficiency in designing and maintaining CI/CD pipelines, with a preference for GitLab CI

  • Experience with monitoring, APM, logging, and analytics tools

  • Strong problem-solving skills with the ability to analyze and debug complex distributed systems, tracing requests and data flows from the kernel to the web to identify root causes

  • Ability to spot, address, and optimize performance bottlenecks

  • Proactive approach, favoring iterative action over waiting for things to happen or to be perfect

  • Familiarity with incident, problem and change management processes and best practices

Nice to have
  • A reliability-oriented mindset with a focus on designing and building resilient architectures

  • Previous SRE experience or knowledge, giving you a heightened awareness of what data to collect, how to display it, and how users can benefit from it

  • Knowledge of scripting languages such as Python or Go

  • Familiarity with GitOps principles and tools like ArgoCD

  • Knowledge of caching mechanisms, such as Redis

  • Experience with messaging queues like MSK Kafka, SQS or RabbitMQ

  • Familiarity with database management systems like AWS Aurora and PostgreSQL

We offer
  • Well-coordinated professional team

  • Cutting-edge technologies, interesting and challenging tasks, a dynamic project, and great opportunities for self-realization and for professional and career growth

  • Additional Health and Life Insurance Package

  • Employee Assistance Program

  • 25 vacation days

  • ReBenefit Platform Account

  • This role requires on-site presence at our office 4 days a week to support effective collaboration and teamwork.
