Site Reliability Engineer (Middle) ID38916

AgileEngine, LLC

Argentina

On-site

ARS 12.500.000 - 15.000.000

Full-time

Posted 22 days ago

Vacancy summary

A leading software development company in Argentina is seeking an AWS Cloud Engineer to manage and support a critical SaaS infrastructure. The ideal candidate will have 2+ years of experience, strong skills in Datadog, EKS, Terraform, and Docker, and excellent communication abilities. Join a dynamic team that values a people-first culture and offers opportunities for growth and impact.

Qualifications

  • 2+ years of professional experience.
  • Good understanding of AWS IAM roles and policies.
  • Experience logging and monitoring AWS resources using CloudWatch logs.
  • Working experience with monitoring solutions, such as Grafana and Prometheus.
  • Upper-Intermediate English level.

Responsibilities

  • Manage alerts daily, check systems, and escalate issues as needed.
  • Be part of a team that provides 24x7 on-call support.
  • Document issues and remediation steps.
  • Deploy to EKS/K8s cluster using Terraform and Helm.
  • Improve existing infrastructure health.

Skills

  • Experience working with Datadog
  • AWS Cloud Engineer
  • EKS/Terraform/Helm
  • Docker and Docker Swarm
  • Bash and/or Python scripting
  • REST APIs
  • Monitoring solutions (Grafana, Prometheus)
  • Excellent communication skills

Job description

What you will do

  • Shift: Monday - Thursday 8AM - 7PM PST (11AM - 10PM EST) with rotating on-call;
  • Manage alerts daily, check systems, and escalate issues as needed;
  • Be part of a team that provides 24x7 on-call support for critical SaaS events;
  • Be available in case of emergencies when team members are not available or need help;
  • Document issues and remediation steps;
  • Proactively create appropriate monitors in the EKS/K8s ecosystem;
  • Deploy to EKS/K8s cluster using Terraform and Helm;
  • Learn and maintain existing infrastructure running under Docker Swarm;
  • Improve existing infrastructure health by implementing checks and scripts to correct known issues;
  • Maintain and develop deployment code;
  • Automate manual tasks;
  • Implement/integrate new technologies in our Cloud Infrastructure;
  • Collaborate with other teams and departments to provide the highest level of support and assistance;
  • Apply a real customer focus when planning deployments/updates, keeping the customer front of mind and considering the impact on them before making changes;
  • Work closely on solutions with Support, Customer Success, Migration, and Professional Services teams to provide best-in-class SaaS service to our customers;
  • Perform RCA and take necessary corrective actions to prevent the recurrence of issues;
  • Create and assign alert-related actions to the appropriate team after the investigation;
  • Handle support requests for environment-specific actions;
  • Identify and provide automation requirements to improve RCA.

Must haves

  • 2+ years of professional experience;
  • Experience working with Datadog;
  • Hands-on experience as an AWS Cloud Engineer;
  • Working knowledge of EKS/Terraform/Helm;
  • Working experience with Docker and Docker Swarm;
  • Good understanding of AWS IAM roles and policies;
  • Experience logging and monitoring AWS resources using CloudWatch logs;
  • Experience working in a Linux environment;
  • Proficient in Bash and/or Python scripting;
  • A strong understanding of web technologies such as REST APIs;
  • Working experience with monitoring solutions, such as Grafana and Prometheus;
  • Excellent oral and written communication skills;
  • Customer-facing communication skills to effectively explain issues and RCAs to customers;
  • Experience in Product/Application Support for SaaS-based products;
  • Understanding of APIs, Databases, Systems Architecture, and Design;
  • Experience designing, implementing, and operating in a DevSecOps environment;
  • Excellent communication skills, both written and verbal;
  • Ability to work independently as well as within a collaborative environment;
  • A technical aptitude with the desire to learn new and evolving technologies;
  • Upper-Intermediate English level.

AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work awards.

If you're looking for a place to grow, make an impact, and work with people who care, we'd love to meet you! :)
