Site Reliability Developer 4

Oracle

Central Region

On-site

MXN 1,102,000 - 2,206,000

Full-time

Posted today

Job Description

A leading cloud service provider is seeking a Site Reliability Engineer to join their team in Mexico. In this role, you'll enhance system reliability through automation and collaboration. Ideal candidates have strong Linux administration and Python skills, and experience with CI/CD processes. Join a vital team impacting thousands of users globally.

Qualifications

Advanced knowledge of Linux systems required.
Proficiency in Python focused on automation.
Familiarity with CI/CD processes essential.

Responsibilities

Collaborate to ensure reliability across services.
Design and deploy automation tools for availability.
Lead post-incident reviews and capacity planning.

Skills

Advanced Linux systems administration

Strong coding skills in Python

Intermediate experience with Bash/Shell scripting

Familiarity with networking principles

Basic knowledge of databases

Understanding of unit testing

Experience with CI/CD pipelines

Comfortable in Agile environments

Overview

As part of the Site Reliability Engineering (SRE) team, you’ll contribute to designing, automating, and evolving mission-critical systems. You'll combine deep systems expertise with modern software engineering practices to reduce operational toil and build resilient, self-healing services.

This is a high-impact role where your work directly affects the reliability of cloud services used by thousands of customers around the world.

Responsibilities

What You’ll Do:

Collaborate with SRE and development teams to ensure end-to-end reliability across a wide range of services and technology stacks.
Design, write, and deploy software and automation tools that enhance availability, observability, and scalability.
Own and evolve metrics, SLOs, SLAs, KPIs, and dashboards that track system health and customer experience.
Build tooling to reduce manual operations and eliminate sources of toil.
Improve CI/CD pipelines, deployment processes, and validation frameworks for reliability and efficiency.
Review and influence architectural designs for distributed systems with a focus on resilience, performance, and fault tolerance.
Lead and participate in post-incident reviews, capacity planning, and production-readiness assessments.
Provide on-call support on a rotational basis (12-hour shifts, 7-day coverage).

What We’re Looking For

Advanced Linux systems administration
Strong coding skills in Python (automation-focused)
Intermediate experience with Bash/Shell scripting
Familiarity with networking principles and distributed systems behavior
Basic to intermediate knowledge of databases (e.g., SQL, NoSQL)
Understanding of unit testing and modern software engineering practices
Experience with CI/CD pipelines and deployment automation
Comfortable working in Agile development environments

Nice to Have

Exposure to monitoring/observability tools (e.g., Prometheus, Grafana, New Relic)
Experience building internal tools for operational efficiency
Participation in SRE culture: blameless postmortems, runbooks, and service design reviews
