Job Description
We are looking for a Lead DevOps/SRE Engineer with proven DevSecOps skills to support our transformation and the future growth of the Canada Life UK business. We are on a journey of simplifying our IT estate, removing legacy systems, and delivering innovative solutions through Cloud adoption, AI, and expanding digital services.
As a member of the CLUK Product Engineering team, you will be involved in every stage of the product lifecycle, from conception, design, implementation, and testing through to operational support. With automation at the core of our work, you should have a strong hands-on understanding of the CI/CD tools used to build secure and reliable applications.
We prefer candidates with Azure experience, though strong experience in any cloud platform is welcome.
We understand everyone has different work and life responsibilities. We offer flexible working arrangements with a hybrid work model, requiring onsite presence for team meetings and events. Our office locations include London, Potters Bar, Bristol, and the Isle of Man.
Key responsibilities
- Design, implement, and maintain applications in cloud and on-prem environments through CI/CD pipelines.
- Develop automation scripts for deployment and maintenance of applications.
- Monitor system performance and health; identify opportunities for optimization.
- Ensure compliance with all security policies and standards.
- Troubleshoot and resolve issues; identify root causes.
- Participate in technical discussions and provide recommendations for improvements.
- Document system configurations and processes.
- Provide support to other teams as needed.
- Guide less experienced Product Engineering team members to develop their skills.
Essential skills needed
- Experience of application support within a Financial Services context (SaaS, PaaS, cloud, on-prem).
- Relevant experience as a DevOps or SRE Engineer.
- Proficiency in scripting languages such as PowerShell, Python, Ruby, and Bash, and in C# programming for automation.
- Strong experience with cloud environments such as Azure, AWS, or GCP.
- Experience with containerization technologies like Docker and Kubernetes.
- Good engineering practices, especially automation and CI/CD with tools such as Azure DevOps, Jenkins, or GitHub Actions.
- Excellent API and interfacing skills.
- Experience working within Agile frameworks and mentoring others.
- Understanding of networks, firewalls, load balancers, and related infrastructure.
- Strong problem-solving and troubleshooting skills.
- Effective communication and collaboration skills.
- Ability to work independently and in teams.
Desirable skills
- Experience migrating on-prem infrastructure to the cloud.
- Knowledge of cloud security best practices, IAM, and encryption.
- Microsoft Azure certifications are advantageous.
Observability
- Designing and using logging and monitoring tools like DataDog or Application Insights.
- Building applications and infrastructure for observability, security, and reliability.
Networking & Security
- Monitoring and enhancing network performance with security and scalability in mind.
- Implementing security best practices in AKS, including network policies, RBAC, and Azure AD integration.
Core Services
- Experience with Azure services such as Storage, VMs, Load Balancers, Azure SQL, and Container Instances, as well as Kubernetes and Docker.
- Development experience, especially in .NET.
- SQL skills for database management and interaction.
CI/CD
- Designing and managing CI/CD pipelines using Azure DevOps and Octopus Deploy, particularly for containerized applications.
Containerization
- Implementing containerization strategies using AKS, Terraform, and ARM templates.
- Managing deployment, scaling, and performance of containerized applications in AKS.
- Managing Azure Kubernetes Service (AKS) clusters effectively.
Other Responsibilities
- Leading migration of infrastructure and applications to Azure with minimal downtime.
- Automating operational processes in collaboration with development and operations teams.
- Providing architectural guidance and supporting knowledge sharing.
- Collaborating on deployment strategies, troubleshooting, and performance support.
- Creating and maintaining documentation; providing training and support.