Platform Engineering & Infrastructure
- Design, build, and maintain cloud infrastructure on Azure (or AWS/GCP), leveraging cloud-native services and patterns
- Own the complete CI/CD pipeline strategy, from code commit to production deployment, with zero-downtime releases
- Implement Infrastructure as Code using Terraform, ARM templates, or similar tools for repeatable, version-controlled infrastructure
- Build and maintain containerization and orchestration platforms (Docker, Kubernetes/AKS)
- Design and implement comprehensive observability: metrics, logging, tracing, and alerting across all systems
- Establish security best practices, compliance controls, and automated security scanning in the deployment pipeline
Reliability & Performance
- Own platform SLAs and continuously improve system reliability, availability, and performance
- Build automated testing and validation into deployment pipelines to catch issues before production
- Implement disaster recovery strategies and conduct regular DR testing
- Tune the performance of infrastructure and applications, identifying and resolving bottlenecks
- Create and maintain runbooks for incident response and system operations
Developer Productivity & Tooling
- Build self-service platforms and tools that empower development teams to deploy and manage their services
- Create and maintain developer documentation, including onboarding guides and infrastructure usage patterns
- Automate the provisioning of environments (dev, staging, production) with consistent configuration
- Establish golden paths and platform standards that make the right way the easy way
Technical Leadership & Mentorship
- Mentor and develop an expert DevOps engineering team, providing technical guidance and career development support
- Conduct regular 1:1s and create an individual development plan for your direct report
- Lead by example: write production code, participate in on-call rotations, and debug complex issues
- Evangelize DevOps best practices across engineering teams through documentation, demos, and pair programming
- Collaborate closely with the Lead Engineer and other technical leads on cross-platform initiatives
Stakeholder Management
- Communicate infrastructure roadmap and technical initiatives to engineering leadership
- Translate business requirements into infrastructure capabilities and timelines
- Partner with security, compliance, and infrastructure teams on enterprise integrations
- Present platform metrics and improvements to senior leadership
Requirements:
Technical Expertise
Must demonstrate high proficiency in at least 3 of the following areas:
- Cloud Platforms: 7+ years of hands-on experience with Azure (preferred) or AWS/GCP, including deep knowledge of cloud-native services
- CI/CD & Automation: Expert-level experience building and maintaining complex CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, Azure DevOps)
- Infrastructure as Code: Strong experience with Terraform, ARM templates, CloudFormation, or similar IaC tools
- Container Orchestration: Production experience with Docker and Kubernetes/AKS, including cluster management and service mesh
- Scripting & Automation: Proficiency in Python, Bash, PowerShell, or Go for automation and tooling
- Observability: Experience implementing comprehensive monitoring, logging, and tracing solutions (Prometheus, Grafana, ELK, Datadog, Application Insights)
- Security & Compliance: Understanding of cloud security best practices, secrets management, and compliance frameworks
- Networking: Strong grasp of cloud networking, VPNs, load balancers, API gateways, and DNS
Leadership & Management
- 5+ years of technical leadership experience with direct reports or as a senior DevOps/platform engineer
- Demonstrated ability to mentor and develop engineering talent in DevOps practices
- Experience managing performance and providing career guidance to team members
- Proven ability to balance significant hands-on work (60%+) with leadership responsibilities
Professional Qualities
- Strong systems thinking: able to see how components interact and to design for reliability
- Excellent troubleshooting skills: able to quickly diagnose and resolve complex infrastructure issues
- Automation-first mindset: a strong bias toward automating repetitive tasks
- Clear communicator who can explain complex infrastructure concepts to developers and non-technical stakeholders
- Growth mindset, with a habit of continuously learning new cloud services and DevOps practices
- Comfortable being on call and responding to production incidents
Preferred Qualifications
- Previous experience in banking, financial services, or highly regulated industries
- Experience with compliance requirements common in banking (PCI DSS, SOC 2)
- Understanding of infrastructure requirements for digital channels in corporate and institutional banking or in wealth management and private banking
- Experience building platform engineering teams from early stages
- Contributions to open source DevOps tools or active participation in DevOps communities
- Experience with GitOps practices and tools (ArgoCD, Flux)