Technology Lead SRE

Infosys

Mexico City

On-site

MXN 200,000 - 400,000

Full-time

Posted 30+ days ago

Job Summary

Infosys is seeking an SRE Operations specialist in Mexico City. The role focuses on providing support for B2B applications, implementing automation, and ensuring the resilience of business processes. Candidates should have strong IT operations experience, analytical skills, and a solid understanding of SRE principles and observability tools.

Experience

  • 3-4+ years of related experience in IT Operations SRE.
  • Understanding of different application architecture types.
  • Experience working as part of an SRE Operations team.

Responsibilities

  • Diagnose any anomalies in production environments.
  • Identify gaps in proactive health checks.
  • Collaborate with the SRE orchestration team.

Skills

Proactive issue identification
Analytical skills
Troubleshooting
Proficiency in scripting

Education

Bachelor’s degree in computer science or related field

Tools

AppDynamics
ELK Stack
FullStory
Prometheus
Grafana

Job Description

The SRE Operations specialist supports B2B applications around the clock, identifying opportunities for self-healing automation and proactive health checks. The role requires specialization in the Site Reliability Engineering (SRE) mode of operations and helps onboard applications to an SRE orchestration framework for higher business resiliency. The specialist needs strong IT operations experience, analytical skills, and a mindset of proactive issue identification.

The specialist champions Site Reliability Engineering and collaborates with the customer and the business to troubleshoot issues, identify root causes, and find opportunities for automation and proactive health checks, investigating application code as needed. The role calls for a good understanding of different architecture types (legacy and modern applications) and their logging mechanisms, plus exposure to observability tools such as AppDynamics (APPD), ELK, and FullStory.

As a proactive support engineer, the specialist diagnoses anomalies and drives the necessary remediations across the teams involved. The specialist works with the existing L2 support team to understand production issues and participates in and contributes to root cause analysis (RCA). The specialist also identifies gaps in proactive health checks, automates and implements self-healing mechanisms wherever needed, and works with the SRE orchestration team to prepare applications for onboarding to the SRE orchestration framework.

The specialist applies SRE practices, including proactive and diagnostic operations, adheres to SRE principles, and implements new remediations according to industry-accepted best practices. In general, the specialist serves as a subject matter expert, using technical expertise to onboard applications to the SRE orchestration framework and improve business process resiliency.

Location: Mexico City, Mexico, or another single location shared by all three resources.

Basic Qualifications

  • Bachelor’s degree in computer science or a related field, with 3-4+ years of related experience in IT Operations SRE platform/Service Cloud operations

Responsibilities:
  • Work with the existing L2 support team to understand production issues and actively participate in and contribute to Root Cause Analyses (RCAs).
  • Identify gaps in proactive health checks and implement new checks to detect potential issues before they impact production.
  • Automate and implement self-healing mechanisms wherever needed to minimize manual intervention and improve system resilience.
  • Collaborate with the SRE orchestration team to onboard and operationalize the SRE orchestration framework.
  • Diagnose anomalies in production environments and drive the necessary remediations across the teams involved.

Mandatory Skills:
  • Proven IT operations experience, with a focus on production support.
  • Strong analytical and problem-solving skills, with the ability to troubleshoot complex issues to identify root causes and opportunities for automation and proactive health checks.
  • A mindset of proactive issue identification and prevention.
  • Ability to investigate application code as needed (e.g., debugging, log analysis) to understand system behavior.
  • Understanding of different application architecture types (legacy and modern) and their logging mechanisms.
  • Exposure to observability tools such as AppDynamics (APPD), ELK Stack (Elasticsearch, Logstash, Kibana), FullStory, Prometheus, and Grafana.
  • Proficiency in scripting.
  • Knowledge of version control.

Nice-to-Have Skills:
  • Knowledge of cloud platforms (Azure/GCP).
  • Knowledge of Kubernetes, Spring Boot, Python, and Angular/React.
  • Knowledge of SQL.
  • Knowledge of networking concepts for diagnosing issues.
  • Experience with SRE (Site Reliability Engineering) principles and practices.
  • Experience with SRE orchestration frameworks.
  • Knowledge of scripting languages (e.g., Python, Bash, PowerShell) for automation.
  • Experience with containerization technologies (e.g., Docker, Kubernetes).
  • Knowledge of infrastructure-as-code tools (e.g., Terraform, Ansible).
  • Experience with CI/CD pipelines.
  • Excellent communication and collaboration skills.

Other Relevant Experience
  • Proficiency in English communication.
  • Experience working as part of an SRE Operations team that uses an SRE orchestration framework.
  • Experience with, and desire to work in, a global delivery environment.
  • Ability to work in a team in a diverse, multi-stakeholder environment.

EEO/About Us

About Us
Infosys is a global leader in next-generation digital services and consulting. We enable clients in more than 50 countries to navigate their digital transformation. With over four decades of experience in managing the systems and workings of global enterprises, we expertly steer our clients through their digital journey. We do it by enabling the enterprise with an AI-powered core that helps prioritize the execution of change. We also empower the business with agile digital at scale to deliver unprecedented levels of performance and customer delight. Our always-on learning agenda drives their continuous improvement through building and transferring digital skills, expertise, and ideas from our innovation ecosystem.

EEO
Infosys provides equal employment opportunities to applicants and employees without regard to race; color; sex; gender identity; sexual orientation; religious practices and observances; national origin; pregnancy, childbirth, or related medical conditions; status as a protected veteran or spouse/family member of a protected veteran; or disability.

Country

Mexico

State / Region / Province

Mexico

Work Location

Mexico City

Interest Group

Infy Mexico

Domain

Retail, CPG, and Logistics

Skillset

Technology|Infra_ToolAdministration-PerformanceManagement|AppDynamics

Company

Infosys Mexico

Role Designation

835ATECHLD Technology Lead
