SaaS Resilience Manager - Brazil

buscojobs Brasil

São Paulo

On-site

BRL 80.000 - 120.000

Full-time

Posted 3 days ago

Job summary

A leading SaaS company in São Paulo is seeking a SaaS Resilience Manager to implement service resilience strategies and enhance cloud infrastructure. The ideal candidate will have a Bachelor's degree in Computer Science, proficiency in .NET, and experience with AWS infrastructure. This role includes building resilience teams, monitoring service disruptions, and ensuring compliance with industry standards, while offering health insurance and a commitment to diversity.

Benefits

Health insurance

Qualifications

  • Minimum 3 years of coding experience required.
  • Interest in technology processes.
  • Willingness to learn and adapt to new technologies.

Responsibilities

  • Develop and implement a comprehensive service resilience strategy.
  • Design disaster recovery and business continuity plans.
  • Monitor production environments with the VP of Development.

Skills

Proficiency in .NET framework
Strong knowledge of AWS and Azure
Collaborative skills
Effective communication skills
Problem-solving skills

Education

Bachelor's degree in Computer Science or related field

Tools

Git

Job description

D.Engage is a leading SaaS company dedicated to delivering innovative solutions that drive digital engagement and enhance customer experiences. Our team is passionate about technology and committed to fostering an environment where talent can thrive and grow. We are currently looking for a SaaS Resilience Manager to join our technology team: someone who is agile, results-driven, customer-obsessed, and loves learning!

This position offers a valuable opportunity for an engineer to enhance their expertise and contribute to impactful projects. Here are the responsibilities for this position:

Key Responsibilities:
  • Resilience Planning and Strategy:
    • Participate in developing and implementing a comprehensive service resilience strategy for all SaaS products.
    • Design and maintain disaster recovery and business continuity plans.
    • Conduct regular risk assessments and impact analyses to identify vulnerabilities and mitigate risks.
  • Ownership of Production Environment:
    • Take ownership and responsibility for the production environment, including cloud and on-premise infrastructure.
    • Monitor production environments in collaboration with the VP of Development.
    • Work with the VP of Security to ensure the security of the production environment.
  • Team Building and Improvement:
    • Build and lead a high-performing resilience team, continuously improving its quality.
    • Train and enhance the skills of technical support teams, including preparing training materials.
    • Provide feedback to teams on problem detection and troubleshooting steps (logging, monitoring, health checks).
  • Service Monitoring and Incident Management:
    • Establish and manage robust monitoring systems to detect and respond to service disruptions promptly.
    • Lead incident response efforts, including root cause analysis, resolution, and post-incident reviews.
    • Develop and maintain incident response playbooks and procedures.
  • Infrastructure and Performance Optimization:
    • Collaborate with IT and engineering teams to design resilient infrastructure and applications.
    • Implement redundancy, failover, and load balancing strategies to ensure high availability.
    • Continuously monitor and optimize system performance, capacity, and scalability.
  • Collaboration and Communication:
    • Assist product and development teams with analysis when necessary.
    • Analyze large-scale bugs, hand them off to the relevant teams, and coordinate their resolution.
    • Troubleshoot server problems with teams when necessary.
    • Provide regular updates on service resilience status, metrics, and improvements to stakeholders.
    • Fix small-scale bugs (minimum 3 years of coding experience required).
  • Compliance and Documentation:
    • Ensure compliance with relevant industry standards and regulations.
    • Maintain comprehensive documentation of resilience strategies, processes, and incident responses.
    • Participate in audits and reviews as required.
Requirements:
  • Bachelor's degree in Computer Science, Software Engineering, or a related field.
  • Proficiency in .NET framework.
  • Strong knowledge of server environments such as AWS, Azure, and on-premises servers.
  • Familiarity with version control tools like Git.
  • Experience with complex L3 queries and solutions related to server scalability.
  • Interest and enthusiasm for technology processes.
  • Collaborative skills and a team-oriented mindset.
  • Accountability and commitment to the job.
  • Willingness to learn and adapt to new technologies.
  • Fast learning ability and problem-solving skills.
  • Effective communication skills and analytical thinking.

We provide:

  • Health insurance

D.Engage is an equal opportunity employer committed to diversity and creating an inclusive workplace.
