

Senior Site Reliability Engineer, Feegow (100% remote-friendly in Brazil)

Docplanner

Remote

BRL 120,000 - 160,000

Full-time

Today

Job summary

A digital healthcare platform in Brazil is searching for a Senior Site Reliability Engineer to lead initiatives aimed at improving platform scalability, reliability, and security. The role requires optimizing infrastructure and ensuring the operational excellence of systems. Candidates should have strong experience in DevOps and leadership skills, specifically in AWS, Kubernetes, and CI/CD practices. The position offers full-time employment with benefits such as medical insurance and compensatory time off.

Benefits

Food / Market Voucher
Medical, Dental, and Group Life Insurance
Pet Plan
Gympass for you and up to 3 people
Stock Options after 6 months
Birthday Day Off

Qualifications

  • Strong hands-on experience in DevOps, SRE, or Platform Engineering.
  • Proven ability to lead infrastructure and reliability improvements in complex systems.
  • Advanced skills in Kubernetes, AWS, Terraform, CI/CD, and observability tooling.
  • Experience with service ownership, incident management, and root cause analysis.
  • Ability to influence teams and drive DevOps best practices.

Responsibilities

  • Lead and guide the evolution of our infrastructure and deployment pipelines.
  • Support the engineering teams with architectural decisions and practices.
  • Manage infrastructure using AWS, Kubernetes, ArgoCD, and Terraform.
  • Improve the monitoring and alerting culture using tools like Datadog.
  • Champion DevSecOps practices and ensure security compliance.

Skills

Kubernetes
CI/CD
AWS
Terraform
Observability tooling
Security best practices
Incident management
Docker
Problem solving
Communication

Education

Proficiency with infrastructure as code tools (Terraform)

Tools

Datadog

Job description

We are looking for a Senior Site Reliability Engineer to play a key role in the evolution of our platform. You will be responsible for leading initiatives that improve the scalability, reliability, observability, and security of our systems. This role goes beyond just maintaining infrastructure: we're looking for someone to raise the engineering bar, proactively identify bottlenecks, and unlock the autonomy of the entire engineering team.

By optimizing our infrastructure and maintaining system reliability, you will ensure that our digital healthcare platform operates smoothly and effectively. This will contribute to the overall user experience, which is vital to our mission of making healthcare accessible and efficient. Your role will involve implementing security policies and ensuring that our users' data is safe and protected, which is crucial to maintaining trust in our services.

You'll act as a technical reference, working closely with the DevOps Manager at Feegow and the global PMS Platform Team at Doctoralia, contributing not only to day‑to‑day operations but also to the strategic vision of our DevOps practices.

In this role, you will:
  • Lead and guide the evolution of our infrastructure and deployment pipelines.
  • Support the engineering teams with architectural decisions and production‑readiness practices.
  • Work on complex and high‑impact projects, including our journey to a global infrastructure.
  • Act as a driver of change, fostering a culture of observability, automation, and operational excellence.
Collaboration & Leadership
  • Act as the go‑to person for DevOps topics across engineering squads.
  • Support and mentor engineers on platform engineering principles, CI/CD, observability, incident response, and production reliability.
  • Partner with SRE, platform, and product teams to unlock delivery and reliability goals.
  • Contribute to architectural discussions and ensure systems are designed with scalability, security, and operational readiness in mind.
  • Actively support and mentor other team members in platform‑related topics (CI/CD, automation, dockerization, etc.) to increase teams' autonomy.
  • Proactively cooperate with SREs during investigations on the efficiency and reliability of production systems.
Infrastructure & Automation
  • Manage infrastructure using AWS, Kubernetes (EKS), ArgoCD, and Terraform.
  • Evolve and maintain CI/CD pipelines to support fast and safe deployments.
  • Automate repetitive tasks and proactively improve system resilience.
Observability & Incident Management
  • Improve the monitoring and alerting culture within engineering teams using tools like Datadog.
  • Lead post‑mortems and drive follow‑ups from incidents, ensuring continuous improvement.
  • Ensure SLAs, SLOs, and system health indicators are well defined and visible.
Monitoring
  • Monitor system performance and troubleshoot issues to ensure high availability and reliability.
  • Ensure there's the necessary alerting around your team's systems.
Security & Compliance
  • Champion DevSecOps practices, proactively identifying and mitigating risks.
  • Support the enforcement of security baselines and compliance across systems.
Expectations
  • Strong hands‑on experience in DevOps, SRE, or Platform Engineering.
  • Proven ability to lead infrastructure and reliability improvements in complex systems.
  • Advanced skills in Kubernetes, AWS, Terraform, CI/CD, and observability tooling.
  • Experience with service ownership, incident management, and root cause analysis.
  • Ability to influence teams and drive DevOps best practices in a growing organization.
  • Proactive mindset, sense of urgency, and strong communication skills.
Qualifications
  • Proficiency with infrastructure as code tools like Terraform.
  • Experience with containerization and orchestration tools, particularly Docker and Kubernetes.
  • Understanding of AWS and its services.
  • Familiarity with CI/CD tools such as ArgoCD or similar.
  • Hands‑on experience with Datadog (analyzing systems running in production, understanding its metrics and monitoring capabilities).
  • Excellent problem‑solving and troubleshooting skills.
  • Good communication in English (B2‑level) to cooperate with worldwide peers.
  • Understanding of security best practices and compliance requirements.
Nice to Have
  • Experience in regulated environments (e.g., healthcare, finance).
  • Experience mentoring junior engineers and fostering DevOps culture across teams.
  • Exposure to multi‑region, multi‑cloud, or hybrid infrastructure scenarios.
Additional Information
  • Working hours are Monday to Friday, from 9 am to 6 pm;
  • We have compensatory time off (Banco de Horas);
  • Food / Market Voucher;
  • Medical, Dental, and Group Life Insurance;
  • Pet Plan;
  • iFeel app for emotional comfort;
  • Gympass for you and up to 3 people;
  • Creditas: payroll loan services, eligible after 6 months of employment;
  • Stock Options: eligible after 6 months of employment (5 years grace period);
  • Birthday Day Off;
  • Daycare Assistance;
  • Partnership Club with discounts at educational institutions such as colleges and language learning services;
  • Referral Program offers up to R$600 per person who stays with us for more than 6 months;
  • Leave of Absence / Time‑off: in the event of the passing of a loved one, we offer 10 days off; if your pet passes away, we offer 2 days. Got married? 7 days of rest! Did the baby arrive? We offer 30 days for dads and 6 months for moms.

Remote Work: Yes

Employment Type: Full-time

Key Skills
  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting

Experience : years

Vacancy: 1
