Lead SRE Engineer

Screening Eagle Technologies

Málaga

On-site

EUR 50,000 - 70,000

Full-time

Posted 30+ days ago

Job summary

An innovative firm is seeking a Site Reliability Engineer Lead to guide a talented team in ensuring the stability and scalability of cloud services. This role offers the opportunity to leverage cutting-edge technologies and practices in AWS, Terraform, and Kubernetes, while leading efforts in automation, testing, and engineering. You will play a crucial role in enhancing monitoring systems, optimizing resources, and driving process improvements. If you are passionate about technology and thrive in a collaborative environment, this is the perfect opportunity to make a significant impact on the company's success.

Qualifications

  • 5+ years of experience in AWS cloud infrastructure development.
  • Expert-level proficiency in Terraform and Kubernetes.

Responsibilities

  • Lead a team of SREs to ensure service stability and scalability.
  • Design and implement cloud infrastructure while optimizing costs.

Skills

AWS Cloud Infrastructure
Terraform
Kubernetes
DevOps Practices
Non-Functional Testing
Git and GitOps
Monitoring Tools (ELK, Prometheus, Grafana)
MLOps
Cost Optimization
Agile Methodologies

Tools

AWS (EC2, S3, VPC, IAM)
CI/CD Pipelines
Logging and Monitoring Tools

Job description

The Site Reliability Engineer Lead (SRE Lead) at Screening Eagle will lead a team of SREs to ensure the stability, resilience, and scalability of our services through automation, testing, and engineering. This role involves leveraging expertise from product systems and operations, cloud infrastructure (AWS), build and release engineering, software development, and stress and load testing to guarantee our services are available, cost-efficient, and fit for purpose from the early stages of development. The role requires 5+ years of experience developing AWS cloud infrastructure and 7+ years of experience leading teams.

What will you do

Cloud Infrastructure Management and Networking

  • Design, develop, and implement cloud infrastructure using Terraform.
  • Optimize resources for cost-efficiency and performance.
  • Ensure infrastructure security and implement service control policies (e.g., Control Tower).
  • Configure AWS VPC flow logs, load balancer logging, Direct Connect, AWS VPN, Transit Gateway (TGW), etc.

Monitoring, Support, and Prototyping

  • Implement robust monitoring and alerting systems.
  • Set up and monitor CI/CD pipelines both on-premises and in the cloud.
  • Enhance monitoring, logging, and alerting practices.
  • Use tagging and cost categorization for cost analysis.
  • Create prototypes and lead development teams in implementing solutions.

Team Leadership, Collaboration, and Documentation

  • Lead the SRE team, ensuring technical quality and best practices.
  • Guide the team through the software development lifecycle.
  • Collaborate with developers and operations to integrate infrastructure changes.
  • Document DevOps changes, technical partnerships, design, integration, testing, and deployment.

Innovation, Quality Assurance, and Process Improvement

  • Evaluate risks, customize applications, and lead quality practices.
  • Focus on agile methodologies, test automation, and continuous integration.
  • Simplify and automate complex processes to ensure quality and operational excellence.
  • Improve the DevOps toolchain and streamline software delivery processes.
  • Stop projects or products if solutions are not technically acceptable.

What do we expect

  • Extensive experience in implementing and evolving DevOps practices across multi-disciplinary teams and business frameworks.
  • Strong background in leading technology change programs and managing projects.
  • In-depth knowledge and experience with AWS services (EC2, S3, VPC, IAM, etc.).
  • Expert-level proficiency in Terraform, including writing reusable modules and leveraging best practices.
  • Highly skilled with Kubernetes, Terraform, serverless technologies, and AWS in general.
  • Proficient in non-functional testing, including performance, security, and cost optimization.
  • Experience working with advanced architectures such as ARM and AWS Graviton, optimizing for performance, cost-efficiency, and scalability.
  • Knowledge of Kubernetes operator programming, including operators for GPU-based architectures.
  • Competent with build tools and practices for multiple CPU architectures.
  • Expertise in Git and GitOps philosophy.
  • Expert in logging and monitoring tools like ELK, Prometheus, and Grafana.
  • Demonstrable MLOps experience.
  • Ability to quickly gain domain knowledge.
  • Operational experience in maintaining applications.