Director, Site Reliability Engineering (SRE) Shared Services Leader

BMA Group Panama

Germany

On-site

EUR 100,000 - 130,000

Full-time

Yesterday

Summary

A global healthcare leader is seeking a Director of Site Reliability Engineering to lead SRE initiatives and guarantee operational excellence. This role requires a strong background in IT operations, specializing in hybrid-cloud environments and SRE methodologies. The ideal candidate will possess extensive experience in technical leadership, driving reliability frameworks, and managing multidisciplinary teams. This position offers the chance to influence the development of critical data analytics services in a dynamic environment.

Benefits

Competitive salary

Diverse workplace

Health benefits

Qualifications

Minimum of 10 years in IT operations with 5 years in SRE roles.
Expertise in reliability engineering principles and frameworks.
Proficient in IAM, RBAC, and ABAC concepts.

Responsibilities

Define strategic vision for SRE Shared Services.
Lead multidisciplinary SRE teams to enhance performance.
Oversee operations ensuring high reliability and security.

Skills

SRE methodologies

Cloud operations (Azure, AWS, GCP)

Linux/Unix systems

Kubernetes orchestration

Leadership and collaboration

Education

Bachelor's or master's degree in Computer Science, Information Systems, or a related field

Tools

Terraform

Helm

Ansible

Grafana

Prometheus

Director, Site Reliability Engineering (SRE) Shared Services Leader

Location: Puerto Rico, USA

Business Unit: Cencora Puerto Rico – Data Analytics Services & Solutions

Reports To: Senior Director – Data Services & Solutions

Role Type: People Manager – Senior Leadership Role

Job Type: Full-Time

About Cencora

Cencora (formerly AmerisourceBergen) is a global healthcare leader committed to improving lives by advancing the development and delivery of pharmaceuticals and healthcare products. Our Data Services & Solutions organization is at the core of Cencora’s digital transformation, building, securing, and operating data‑driven platforms that power intelligent, compliant, and resilient supply‑chain decision‑making worldwide.

Position Summary

The Director, Site Reliability Engineering (SRE) Shared Services Leader provides executive‑level leadership and strategic direction for all infrastructure on which Cencora’s critical supply chain data services run.

This individual is accountable for the reliability, robustness, high availability, observability, and information security operations across all on‑premises and cloud environments used to deliver supply chain data analytics services and products. Reporting directly to the Senior Director of Data Services & Solutions, this leader manages a multidisciplinary team that includes compute and network administrators, IAM/RBAC/ABAC and information security architecture professionals, observability and telemetry automation SMEs, Kubernetes engineers, and disaster recovery specialists.

This leader acts as the primary SRE champion across the entire Data Services & Solutions organization, instilling a culture of reliability, robustness, performance, and proactive incident prevention. This individual will ensure maximum uptime, security, and data service business continuity by applying modern SRE principles and automation strategies across hybrid infrastructure that integrates on‑premises systems with public cloud services.

By embedding reliability practices into engineering workflows, this role guarantees uninterrupted analytics and data platform operations critical to supporting Cencora’s mission of creating healthier futures.

Primary Responsibilities

Strategic Leadership & Reliability Governance

Define and execute the strategic vision for the SRE Shared Services organization, ensuring alignment with enterprise reliability, availability, and performance goals.
Serve as the organizational SRE champion, driving adoption of reliability engineering principles across all teams to achieve resilience, observability, and automation maturity.
Establish and own SLOs, SLIs, and SLAs for all Data Services platforms, integrating them into executive‑level reporting and operational scorecards.
Lead reliability and capacity planning across hybrid (on‑prem + cloud) infrastructure environments, balancing performance, scalability, availability, and cost efficiency.
Partner with DataOps, DevOps, and Product Management teams to embed reliability by design into platform architecture and application releases.

Team Leadership & Operational Excellence

Lead a multidisciplinary SRE organization composed of compute and network engineers, security professionals, observability SMEs, Kubernetes administrators, and DR specialists.
Attract, retain, develop and coach top engineering talent, establishing clear goals, career development plans, and succession pathways.
Foster a proactive culture of accountability, automation, and operational excellence through mentorship and process improvement.
Oversee 24×7 SRE operations and major incident response activities, ensuring rapid, automated detection, resolution, and root‑cause documentation.
Implement a continuous feedback loop between incidents, change management, and service design for ongoing improvement.

Compute, Network & Platform Reliability Accountabilities

Oversee hybrid infrastructure reliability, including compute clusters, virtualization and hardware orchestration platforms, cloud and on‑premises VMs, and containerized workloads.
Ensure optimal network connectivity, routing, and load balancing between on‑premises and cloud environments supporting analytics and data services applications.
Manage capacity forecasting, patching schedules, and automation of routine maintenance using Infrastructure‑as‑Code (Terraform, Ansible, Helm and ArgoCD).
Monitor latency, throughput, and performance metrics to prevent degradation of critical systems.
Continuously monitor the health of data analytics services through highly granular, fully automated observability and telemetry services, preventing end‑user‑detectable service degradations or interruptions so that the SLAs of Cencora’s supply chain customers and partners are consistently met or exceeded.

Information Security Architecture & Operations

Direct a team of IAM, RBAC, and ABAC engineers ensuring secure, role‑based, and auditable access control across all environments.
Oversee the Security Operations Center (SOC), ensuring 24×7 threat detection, vulnerability management, and security incident response.
Collaborate with Cencora’s enterprise CISO and InfoSec teams to align reliability and security objectives with corporate frameworks (NIST, ISO, HIPAA).
Ensure audit readiness and compliance with HIPAA, GxP, and 21 CFR Part 11 requirements, and maintain ISO 27001 certification.

Observability, Telemetry & Automation

Lead the observability and telemetry automation team responsible for implementing tools such as Grafana, Prometheus, Datadog, Elastic, and equivalent monitoring ecosystems.
Enable end‑to‑end system visibility across data pipelines, analytics applications, and infrastructure layers.
Develop predictive alerting and automated root‑cause correlation through AI/ML‑driven observability tools.
Partner with Data Operations, QA, and DevOps leaders to integrate observability into CI/CD and data lifecycle automation pipelines.

Kubernetes, Cloud Operations & Disaster Recovery

Oversee the enterprise Kubernetes and container management platform, ensuring proper configuration, scaling, and cluster health.
Direct DR and business continuity teams to validate replication, cross‑region backups, and failover mechanisms that meet or exceed RPO/RTO targets.
Coordinate hybrid‑cloud disaster recovery exercises to verify production resilience and recovery automation.

Cross‑Functional Collaboration & Strategic Alignment

Represent Site Reliability Engineering on the Core Product Governance Council and senior architecture forums across the Company.
Collaborate with DevOps, Data Operations, QA, and Supply Chain Analytics leaders to align on reliability, observability, and security objectives.
Coordinate with Finance and Procurement on vendor management, licensing, and capacity cost optimization.
Establish transparent communications with executive stakeholders on uptime, service health, and operational risks.

Measurable Outcomes & Success Metrics

≥ 99.95% uptime across all critical data and analytics environments.
Mean Time to Detect (MTTD) < 5 minutes and Mean Time to Resolve (MTTR) < 30 minutes for critical incidents.
100% compliance with enterprise security and audit requirements.
Continuous reduction in unplanned downtime, service degradations, and recurring incidents to ensure optimal service delivery.
Complete implementation of unified observability dashboards covering all shared services.
Validated disaster‑recovery exercises achieving documented RPO/RTO objectives at least twice a year.
Increased SRE adoption across Data Services teams as measured by reliability KPIs.
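
To put the uptime target above in concrete terms, the short Python sketch below converts an availability percentage into an allowed-downtime budget. It is illustrative arithmetic only: the posting does not state the measurement window, so the 30-day and 365-day windows are assumptions, and the function name `downtime_budget_minutes` is hypothetical.

```python
# Illustrative arithmetic only. The measurement windows (30 and 365 days)
# are assumptions; the posting does not say how uptime is measured.
MINUTES_PER_DAY = 24 * 60


def downtime_budget_minutes(slo: float, window_days: int) -> float:
    """Return the downtime (in minutes) allowed by an availability target."""
    return (1.0 - slo) * window_days * MINUTES_PER_DAY


monthly_budget = downtime_budget_minutes(0.9995, 30)
yearly_budget = downtime_budget_minutes(0.9995, 365)

print(f"30-day budget:  {monthly_budget:.1f} minutes")
print(f"365-day budget: {yearly_budget / 60:.2f} hours")
```

Under these assumptions, 99.95% uptime allows roughly 21.6 minutes of downtime per 30 days, or about 4.38 hours per year. If the target were measured monthly, a single full outage lasting as long as the 30-minute MTTR target would already exceed that month's budget.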

Qualifications & Leadership Competencies

Bachelor's or master's degree in Computer Science, Information Systems, or a related technical field.
10+ years of progressive IT operations or infrastructure experience with at least 5 years leading hybrid‑cloud SRE or reliability teams.
Expertise in SRE methodologies, reliability frameworks, and hybrid‑cloud operations (Azure, AWS, GCP).
Advanced knowledge of Linux, Windows Server and/or Unix systems, networking, and Kubernetes orchestration in production environments.
Hands‑on experience implementing Infrastructure‑as‑Code (Terraform, Helm, Ansible, ArgoCD) and CI/CD pipelines (GitHub Actions experience preferred).
Deep understanding of IAM, RBAC, ABAC, and enterprise identity governance.
Proficiency in observability and monitoring tools (e.g. Grafana, Prometheus, Datadog, ELK, Azure Monitor).
Proven track record leading Security Operations, vulnerability management, and incident response programs.
Strong communication, leadership, and cross‑functional collaboration skills.

Preferred Certifications

Certified Kubernetes Administrator (CKA) or Certified Kubernetes Security Specialist (CKS).
AWS Certified DevOps Engineer or Azure Solutions Architect Expert.
Certified Information Systems Security Professional (CISSP) or CISM.
ITIL 4 Foundation for Reliability and Service Management.

Language Requirements

Full fluency in English (writing, reading, listening, and speaking) is required. Bilingual proficiency (English and Spanish) is preferred.

Strategic Impact

The Director, Site Reliability Engineering Shared Services, ensures that every data, analytics, and application platform at Cencora operates with continuous reliability, robust security, and proactive observability. As the organizational SRE champion, this leader drives cultural and technical adoption of reliability engineering principles across all Shared Services. By integrating hybrid‑cloud infrastructure operations with world‑class SRE practices, they enable Cencora to deliver uninterrupted, high‑performing analytics services that support global healthcare partners and patients.

Our Commitment

We are united in our responsibility to create healthier futures and value diversity in all its forms. We believe innovation thrives through collaboration, diverse perspectives, and a shared purpose to create healthier futures worldwide.

Equal Employment Opportunity

Cencora is committed to providing equal employment opportunity without regard to race, color, religion, sex, sexual orientation, gender identity, genetic information, national origin, age, disability, veteran status or membership in any other class protected by federal, state or local law.

The company’s continued success depends on the full and effective utilization of qualified individuals. Therefore, harassment is prohibited, and all matters related to recruiting, training, compensation, benefits, promotions and transfers comply with equal opportunity principles and are non‑discriminatory.

Cencora is committed to providing reasonable accommodation to individuals with disabilities during the employment process, which is consistent with legal requirements. If you wish to request accommodation while seeking employment, please call 888.692.2272 or email hrsc@cencora.com. We will make accommodation determinations on a request‑by‑request basis. Messages and emails regarding anything other than accommodation requests will not be returned.

