Overview
We are looking for a Service Reliability Engineer & Application Maintenance Specialist to ensure the reliability, scalability, and cost optimization of our cloud platforms.
The ideal candidate will have strong experience in automation, proactive monitoring, and performance management, with a mindset focused on continuous improvement.
Responsibilities
- Support the DevOps team in implementing Infrastructure as Code (IaC) pipelines
- Define recovery automations (self-healing, blue/green deployments) through resilient and automated processes
- Define and monitor SLIs/SLOs to ensure service quality, create intelligent alerts in New Relic, and act proactively to prevent incidents
- Design autoscaling and capacity planning strategies for AWS cloud-native environments
- Design and execute disaster recovery strategies for AWS cloud-native environments
- Design load, performance, endurance, and long-running tests, and lead their execution
- Design chaos engineering practices to improve overall system reliability
- Ensure optimal performance during peak scenarios without resource waste
- Lead structured post-mortems and technical retrospectives
- Turn every failure into an opportunity to strengthen the infrastructure
- Apply rightsizing and cloud cost optimization best practices
- Ensure incidents are managed properly and promptly, in line with Group SLAs
- Perform application maintenance activities, including troubleshooting, patching, and ensuring compliance with security and performance standards
- Collaborate with development teams to address application-level issues impacting reliability and scalability
- Monitor and optimize application performance across environments to prevent degradation and ensure SLAs are met
Requirements
Our ideal candidate will meet the following requirements:
- Degree in Computer Science, Information Technology, or a related field
- At least 5 years of experience in digital technology development and architecture, and in the management of complex IT projects
- AWS Certification (mandatory)
- Proven experience with Cloud Platforms (AWS)
- Strong knowledge of IaC (Terraform) and GitLab CI/CD pipelines
- Expertise in monitoring and observability tools (e.g., New Relic)
- Familiarity with SLI/SLO/SLA concepts and SRE methodologies
- Experience with container orchestration
- Knowledge of FinOps and cloud cost optimization
- Experience working in teams, preferably ones with diverse skills and backgrounds
- Familiarity with Agile/Scrum methodologies
- Effective communication skills in both English and Italian
- Strong analytical and problem-solving skills, with a continuous improvement mindset
Nice to Have
- Additional cloud certifications (Azure)
- Additional IaC certifications (Terraform)
- Experience with scripting languages (Python, Go, Bash)
- Background in DevOps or Platform Engineering
Company Profile
Generali is a major player in the global insurance industry, a strategic and highly important sector for the growth, development and welfare of modern societies. Over almost 200 years, we have built a multinational Group that is present in more than 60 countries, with 470 companies and nearly 80,000 employees.
GOSP (Generali Operations Service Platform) is a joint venture between Generali and Accenture that provides IT and Procurement services to Generali Group companies. Our purpose is to accelerate the Group's innovation and digitization strategy through the Cloud and shared platforms. Based in Italy, it has 6 branches across Europe and employs about 1,200 people.