Senior Site Reliability Engineer - (Remote - Europe)
Jobgether offers ALL remote jobs globally. We match you to roles where you're most likely to succeed and provide feedback on every application to help you learn. No more guesswork, application black holes, or recruiter ghosting in your job search.
We are looking for a Senior Site Reliability Engineer to work remotely from Europe for one of our clients.
As a Senior SRE, you will design, maintain, and optimize reliable and scalable systems. Your responsibilities include tracking performance metrics, automating operations to improve system reliability, and applying best practices for incident management. Your expertise in cloud services, container orchestration, and system performance will drive initiatives to enhance infrastructure efficiency and robustness, and you will collaborate closely with engineering teams to build high-availability systems. This role is ideal for someone passionate about maintaining resilient systems that ensure seamless operations at scale.
Accountabilities:
- Develop and maintain reliable, scalable, and efficient systems
- Define and monitor Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to ensure system performance
- Conduct blameless post-incident reviews, identify root causes, and implement preventive measures
- Automate operational tasks and incident responses, and optimize system performance
- Collaborate with engineering teams to design for reliability, scalability, and maintainability
- Continuously evaluate and improve system performance, capacity, and cost efficiency
- Participate in on-call rotations, troubleshooting and resolving critical issues
Requirements:
- Bachelor's degree in Computer Engineering or a related field
- 5+ years of experience as a Site Reliability Engineer or in a similar role
- 3+ years of experience with AWS services and container orchestration tools
- 2+ years of Kubernetes experience
- Strong knowledge of observability tools (monitoring, logging, tracing)
- Hands-on experience with Terraform for infrastructure as code
- Proficiency in at least one programming language (e.g., Python, Go, Java)
- Experience with incident management, postmortem analysis, and risk mitigation
- Familiarity with messaging systems such as SNS and SQS, and with CI/CD tools
- Fluent in English with strong communication skills
Benefits:
- Fully remote role with flexible working locations
- Competitive salary and performance incentives
- Health insurance coverage
- Annual wellness and learning credits for professional growth
- Work-from-anywhere stipend
- Annual company retreat to an exciting destination
- Inclusive, diverse, and collaborative work environment
- Seniority level: Mid-Senior level
- Employment type: Full-time
- Job function: Information Technology
- Industries: Non-profit Organizations, Education