Business Function
Group Technology and Operations (T&O) enables and empowers the Bank with an efficient, nimble, and resilient infrastructure through a strategic focus on productivity, quality & control, technology, people capability, and innovation. In Group T&O, we manage the majority of the Bank's operational processes and aim to delight our business partners through our multiple banking delivery channels.
The Regulatory Reporting team leads the overall front-to-back strategy and development of the regulatory reporting landscape and drives business changes for the Group Finance Platform (GFP).
Responsibilities:
As a DevSecOps Site Reliability Engineer (SRE), you'll help build a meaningful engineering discipline, combining software and systems to develop creative engineering solutions to operations problems. Your work will focus on optimizing existing systems, building infrastructure, and reducing manual work through automation. You'll join a team of curious problem solvers with diverse perspectives, thinking big and taking risks. In this environment, you'll lead relevant projects with support from an organization that fosters learning and growth. Your focus as an SRE will be on running better production applications and systems.
- Develop, test, and debug automated tasks (Apps, Systems, Infrastructure)
- Troubleshoot priority incidents and facilitate blameless post-mortems
- Work with development teams throughout the software lifecycle to ensure sustainable software releases
- Perform analytics on previous incidents and usage patterns to better predict issues and take proactive actions
- Build and promote adoption of greater self-healing and resiliency patterns
- Lead and participate in performance tests; identify bottlenecks, opportunities for optimization, and capacity demands
- Adhere to architecture standards, risk management, and security policies
- Collaborate with global teams, product owners, and business teams to develop, build, and support applications
- Communicate and coordinate on development items with global teams and resolve issues impacting development
- Provide post-production application support
- Participate in quality assurance, peer reviews, and code reviews
Requirements:
- Strong technical skills and the ability to devise innovative solutions that benefit our global customers
- Thought leadership in end-to-end solutioning
- Up-to-date knowledge of industry technologies
- Active engagement in developer and tech meetups
- Strong understanding of DevOps practices, tools, and techniques
- Experience deploying applications in containerized environments (Docker, Kubernetes, PCF, OpenShift, AWS)
- In-depth OS experience (RHEL, Ubuntu, Windows Server) with troubleshooting skills
- Experience in site reliability engineering using languages like Python, Java, PowerShell, Shell scripting, or Go
- Hands-on experience with cloud technologies and with monitoring and observability tools such as Prometheus, Splunk, Elasticsearch, and Grafana
- Proficiency with modern development tools and methodologies (Agile, CI/CD, Git, Terraform, Jenkins)
- Good understanding of networking protocols and cybersecurity in cloud environments
- Advanced SQL knowledge and experience with RDBMS, Hadoop, and NoSQL databases
- Experience in root cause analysis and data processing to improve business processes
- Knowledge of message queuing, stream processing, and scalable big data stores
- Experience collaborating with cross-functional teams in dynamic environments