Site Reliability Engineer (SRE)

MetaCTO

Taboão da Serra

Remote

BRL 160,000 - 200,000

Full-time

Posted today

Job Summary

A leading tech firm is seeking a Site Reliability Engineer (SRE) for a remote position. The ideal candidate will have 5-10 years of experience in SRE or Cloud Engineering, with expertise in AWS cloud services and strong knowledge of Docker and Kubernetes. Responsibilities include maintaining cloud infrastructure and optimizing database performance to keep innovative applications reliable and secure.

Key Qualifications

5-10 years of experience in Site Reliability Engineering, DevOps, or Cloud Engineering roles.
Expertise in AWS cloud services including EC2, RDS, S3, Lambda, EKS.
Hands-on experience with Docker and Kubernetes.

Key Responsibilities

Architect, build, and maintain cloud infrastructure on AWS.
Manage and optimize databases for performance and security.
Implement monitoring and logging solutions for system health.

Skills

AWS cloud services

Containerization (Docker)

Orchestration (Kubernetes)

Relational databases (MySQL, PostgreSQL)

Infrastructure-as-Code (Terraform)

CI/CD pipelines

Monitoring tools (Zabbix)

Scripting (Python, Bash)

Django-based applications

About Us

At MetaCTO, we specialize in helping startups and growing companies turn visionary ideas into successful digital products through expert app development and fractional CTO services.

As a Site Reliability Engineer (SRE), you will play a critical role in ensuring the reliability, scalability, and security of the backend infrastructure that powers innovative applications for our clients.

This role will involve managing cloud environments, optimizing databases, automating deployments, and improving system observability.

Job Description

As a Site Reliability Engineer (SRE) at MetaCTO, you will be responsible for designing, implementing, and maintaining highly available, scalable, and secure infrastructure solutions.

You will collaborate with software engineers to improve system performance, automate operations, and ensure the smooth functioning of critical backend services.

You’ll work extensively with cloud platforms like AWS, leveraging technologies such as Terraform, Docker, Kubernetes, and CI/CD pipelines to enhance system reliability.

Responsibilities

Architect, build, and maintain cloud infrastructure on AWS (Lambda, EC2, RDS, S3, EKS, SQS, CloudWatch).
Manage and optimize databases (MySQL, PostgreSQL) for performance, reliability, and security.
Implement monitoring, alerting, and logging solutions to ensure system health and performance, with specific experience using Zabbix and Elastic Logging.
Design and maintain CI/CD pipelines for automated deployment and scaling of applications.
Work with containerization and orchestration tools such as Docker and Kubernetes.
Develop and enforce security best practices for cloud environments and infrastructure.
Automate operational processes using Infrastructure-as-Code (Terraform, CloudFormation) and scripting languages like Python or Bash.
Troubleshoot and resolve infrastructure-related incidents and optimize system performance.
Collaborate with backend engineers to ensure high availability, fault tolerance, and scalable system design, with a strong focus on Django-based applications.

Qualifications

5-10 years of experience in Site Reliability Engineering (SRE), DevOps, or Cloud Engineering roles.
Strong expertise in AWS cloud services (EC2, RDS, S3, Lambda, CloudFront, EKS, SQS, IAM).
Hands‑on experience with containerization (Docker) and orchestration (Kubernetes, ECS, or EKS).
Deep knowledge of relational databases (MySQL, PostgreSQL), including performance tuning, query optimization, monitoring, and migration management.
Proficiency in Infrastructure-as-Code tools such as Terraform, CloudFormation, or Pulumi.
Strong experience with CI/CD pipelines and automation tools (GitHub Actions, Jenkins, CircleCI, or GitLab CI/CD).
Proficiency in monitoring tools, specifically Zabbix, and logging solutions like Elastic Logging.
Scripting experience with Python, Bash, or Go for automating operational tasks.
Experience working with Django-based applications in a cloud environment.
Experience implementing security best practices for cloud-based applications.
Knowledge of distributed systems and microservices architecture.

Preferred Skills

AWS certifications (Solutions Architect, DevOps Engineer) are a plus.
Experience with serverless computing and event‑driven architectures.
Familiarity with message queue services (SQS, RabbitMQ, Kafka).
Understanding of zero‑downtime deployments and disaster recovery strategies.

Position Details

Type: Full‑Time

Location: 100% Remote

Hours: US Pacific Time

How to Apply

If you are passionate about scalability, automation, and reliability, and thrive in a collaborative, fast‑paced environment, we’d love to hear from you.

Please submit your resume and an optional brief cover letter outlining your relevant experience.

MetaCTO is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.