An innovative firm is seeking a Lead Technical SME to enhance capacity and observability controls across its technology estate. This role requires a blend of hands-on engineering and architectural oversight, focusing on performance, resilience, and control effectiveness. The ideal candidate will have a strong background in SRE principles, capacity planning, and operational control frameworks. Join a forward-thinking company where you can drive scalable solutions and collaborate with diverse teams to shape the future of technology operations. This is an exciting opportunity for those passionate about systems thinking and operational excellence.
Location: Birmingham / Sheffield (Hybrid - 3 days a week)
Lead Technical SME - Capacity Planning & Control Uplift
Role Overview:
We are seeking a Lead Technical Subject Matter Expert (SME) with strong systems thinking and a solid grasp of SRE principles to drive the technical uplift of capacity and observability controls across our technology estate. This role blends hands-on engineering depth with architectural oversight and focuses on enhancing performance, resilience, and control effectiveness across services and platforms.
The ideal candidate brings both operational sensibility and the ability to drive scalable solutions — aligning technical capabilities with internal control frameworks and regulatory expectations.
Key Responsibilities:
• Lead the design and technical evaluation of capacity management, utilisation monitoring, and observability controls across platforms.
• Apply SRE-aligned practices to identify control gaps, performance risks, and areas for automation.
• Assess existing tooling, data flows, and operational practices, and propose remediation strategies for the gaps found.
• Collaborate with engineering, infrastructure, architecture, and risk teams to validate technical designs and implementation plans.
• Define reusable technical patterns and tooling strategies that enhance operational readiness and control sustainability.
• Support roadmap shaping, tooling assessment, and documentation for governance and operational readiness.
Required Skills & Experience:
• 10+ years in engineering, infrastructure, or technical architecture roles in complex technology environments.
• Solid understanding of compute, storage, and network capacity planning across mixed deployment models.
• Familiarity with SRE disciplines such as observability, service-level indicators/objectives (SLIs/SLOs), and automation of operational tasks.
• Demonstrated ability to interpret and apply control requirements in technical design contexts.
• Hands-on experience with performance monitoring, alerting systems, and diagnostic tooling (e.g., Geneos, Prometheus, Grafana, AppDynamics, or similar tools).
• Strong communication skills — able to convey technical concepts to senior stakeholders and control partners.
• Experience in implementing or uplifting operational controls (capacity, performance, availability).
• Exposure to internal risk frameworks or external regulatory requirements (e.g., DORA, EBA, PRA).
• Background in service reliability, system diagnostics, or incident response.
ID: 1464721
Date Posted: 1 day ago
Expiration Date: 03/06/2025
Location: Birmingham
Competitive