Senior Director - Operations and Reliability Engineering

ZipRecruiter

London

Hybrid

GBP 90,000 - 150,000

Full time

19 days ago

Job summary

An established industry player is seeking a Senior Director of Operations and Reliability Engineering to lead the integration of Site Reliability Engineering, DevOps, and traditional operations. This pivotal role focuses on ensuring end-to-end automation, operational excellence, and high availability across global IT infrastructure. The ideal candidate will drive strategic planning and optimization while embedding compliance with IT Service Management processes. If you're an innovative leader passionate about leveraging automation and AI to enhance operational efficiency, this opportunity is perfect for you.

Qualifications

  • 15+ years in IT operations, SRE, DevOps, or platform engineering.
  • 5+ years in senior leadership managing large-scale IT environments.

Responsibilities

  • Define and execute a modern Reliability Engineering strategy.
  • Oversee IT infrastructure, cloud platforms, and hybrid environments.

Skills

IT Operations
Site Reliability Engineering (SRE)
DevOps
Cloud Computing (AWS, Azure, GCP)
Automation
Infrastructure as Code (IaC)
Security Compliance
Leadership

Education

Bachelor's Degree in Computer Science or related field
Master's Degree (preferred)

Tools

Kubernetes
Terraform
Ansible

Job description

Locations: Canary Wharf | Boston

Who We Are

Boston Consulting Group partners with leaders in business and society to tackle their most important challenges and capture their greatest opportunities. Founded in 1963, BCG pioneered business strategy and now helps clients with total transformation—driving complex change, enabling growth, building competitive advantage, and delivering bottom-line impact. Success requires blending digital and human capabilities.

Our diverse, global teams bring deep industry and functional expertise and a range of perspectives to spark change. BCG delivers solutions through management consulting, technology and design, corporate and digital ventures, and business purpose. We work collaboratively across all levels of the client organization to generate results that enable clients to thrive.

What You'll Do

The Senior Director – Operations and Reliability Engineering is responsible for integrating Site Reliability Engineering (SRE), DevOps, and traditional operations to develop a next-generation Reliability Engineering function.

This role ensures end-to-end automation at scale, 24x7 operational excellence, and high availability across all BCG entities worldwide. The leader will drive strategic planning, execution, and optimization of global IT infrastructure, cloud operations, and service management, while ensuring a secure, scalable, and efficient technology environment. The role also involves embedding and ensuring compliance with IT Service Management (ITSM) processes across all teams, aligned with standardized frameworks and operational excellence.

Key Responsibilities

  1. Strategic Leadership & Transformation: Define and execute a modern Reliability Engineering strategy, integrating SRE, DevOps, and automation; drive automation to eliminate toil and improve efficiency; lead the transition to AI-driven, self-healing infrastructure; establish observability and analytics frameworks; align strategies with business goals.
  2. Infrastructure & Cloud Operations: Oversee IT infrastructure, cloud platforms, and hybrid environments; manage network reliability, compute, and cloud services across AWS, Azure, and GCP; scale Infrastructure as Code (IaC), automation, and workload optimization; implement AI-driven monitoring and self-healing automation.
  3. IT Service Management & Operational Excellence: Mandate adoption of ITSM processes; establish operational metrics, including SLOs, SLIs, and error budgets; oversee AI-assisted incident response and root cause analysis; ensure high availability, performance, and security compliance; develop a 24/7 operational support model; optimize incident, change, and capacity management; lead Service Asset and Configuration Management (SACM).
  4. Security, Compliance & Risk Management: Embed security and compliance into workflows; ensure adherence to ISO 27001, NIST, SOC 2, GDPR, and cloud security standards; collaborate on zero-trust security models; drive resiliency, disaster recovery, and business continuity initiatives.
  5. Financial & Vendor Management: Optimize operational budgets with a cloud strategy; negotiate vendor contracts; drive cost efficiency in cloud and infrastructure investments.
  6. Leadership & Talent Development: Build and mentor a high-performing Reliability Engineering team; foster a culture of automation and innovation; promote a collaborative, data-driven, proactive mindset; develop workforce programs for AI-driven operations and modern reliability practices.

What You'll Bring

Required Qualifications

  • 15+ years in IT operations, SRE, DevOps, or platform engineering.
  • 5+ years in senior leadership managing large-scale IT environments.
  • Deep expertise in cloud computing (AWS, Azure, GCP), on-prem, and hybrid environments.
  • Proven experience in automation, IaC, observability, and AI-driven IT operations.
  • Strong understanding of security, compliance, and risk management.
  • Excellent leadership and stakeholder management skills.

Preferred Qualifications

  • Certifications such as ITIL, AWS/Azure/GCP Solutions Architect, SRE Foundation, CISSP, or similar.
  • Experience with Kubernetes, Terraform, Ansible, and AI operations tools.
  • Strong problem-solving skills with a data-driven approach.

Additional Information

This pivotal leadership role involves shaping the future of IT operations by integrating SRE, DevOps, and automation methodologies. If you are a technically skilled, innovation-driven leader passionate about scaling operations through automation and AI-driven resilience, we encourage you to apply.

Work Environment & Additional Details: Hybrid or on-site; occasional travel; fast-paced, high-availability environment focused on automation and reliability.

Boston Consulting Group is an Equal Opportunity Employer. All qualified applicants will receive consideration without regard to various protected characteristics.
