Director, Site Reliability Engineering (SRE) Shared Services Leader
Location: Puerto Rico, USA
Business Unit: Cencora Puerto Rico – Data Analytics Services & Solutions
Reports To: Senior Director – Data Services & Solutions
Role Type: People Manager – Senior Leadership Role
Job Type: Full-Time
About Cencora
Cencora (formerly AmerisourceBergen) is a global healthcare leader committed to improving lives by advancing the development and delivery of pharmaceuticals and healthcare products. Our Data Services & Solutions organization is at the core of Cencora’s digital transformation, building, securing, and operating data‑driven platforms that power intelligent, compliant, and resilient supply‑chain decision‑making worldwide.
Position Summary
The Director, Site Reliability Engineering (SRE) Shared Services Leader provides executive‑level leadership and strategic direction for all infrastructure on which Cencora’s critical supply chain data services run.
This individual is accountable for the reliability, robustness, high availability, observability, and information security operations of all on‑premises and cloud environments used to deliver supply chain data analytics products and services. Reporting directly to the Senior Director of Data Services & Solutions, this leader manages a multidisciplinary team that includes compute and network administrators, IAM/RBAC/ABAC and information security architecture professionals, observability and telemetry automation SMEs, Kubernetes engineers, and disaster recovery specialists.
This leader acts as the primary SRE champion across the entire Data Services & Solutions organization, instilling a culture of reliability, robustness, performance, and proactive incident prevention. This individual will ensure maximum uptime, security, and business continuity for data services by applying modern SRE principles and automation strategies across hybrid infrastructure that integrates on‑premises systems with public cloud services.
By embedding reliability practices into engineering workflows, this role guarantees uninterrupted analytics and data platform operations critical to supporting Cencora’s mission of creating healthier futures.
Primary Responsibilities
Strategic Leadership & Reliability Governance
- Define and execute the strategic vision for the SRE Shared Services organization, ensuring alignment with enterprise reliability, availability, and performance goals.
- Serve as the organizational SRE champion, driving adoption of reliability engineering principles across all teams to achieve resilience, observability, and automation maturity.
- Establish and own SLOs, SLIs, and SLAs for all Data Services platforms, integrating them into executive‑level reporting and operational scorecards.
- Lead reliability and capacity planning across hybrid (on‑prem + cloud) infrastructure environments, balancing performance, scalability, availability, and cost efficiency.
- Partner with DataOps, DevOps, and Product Management teams to embed reliability by design into platform architecture and application releases.
Team Leadership & Operational Excellence
- Lead a multidisciplinary SRE organization composed of compute and network engineers, security professionals, observability SMEs, Kubernetes administrators, and DR specialists.
- Attract, retain, develop and coach top engineering talent, establishing clear goals, career development plans, and succession pathways.
- Foster a proactive culture of accountability, automation, and operational excellence through mentorship and process improvement.
- Oversee 24×7 SRE operations and major incident response activities, ensuring rapid, automated detection, resolution, and root‑cause documentation.
- Implement a continuous feedback loop between incidents, change management, and service design for ongoing improvement.
Compute, Network & Platform Reliability Accountabilities
- Oversee hybrid infrastructure reliability, including compute clusters, virtualization and hardware orchestration platforms, cloud and on‑premises VMs, and containerized workloads.
- Ensure optimal network connectivity, routing, and load balancing between on‑premises and cloud environments supporting analytics and data services applications.
- Manage capacity forecasting, patching schedules, and automation of routine maintenance using Infrastructure‑as‑Code (Terraform, Ansible, Helm and ArgoCD).
- Monitor latency, throughput, and performance metrics to prevent degradation of critical systems.
- Continuously monitor the health of data analytics services through highly granular, fully automated observability and telemetry services, with the goal of preventing service degradations or interruptions detectable by end users so that SLAs with Cencora’s supply chain customers and partners are consistently met or exceeded.
Information Security Architecture & Operations
- Direct a team of IAM, RBAC, and ABAC engineers, ensuring secure, role‑ and attribute‑based, auditable access control across all environments.
- Oversee the Security Operations Center (SOC), ensuring 24×7 threat detection, vulnerability management, and security incident response.
- Collaborate with Cencora’s enterprise CISO and InfoSec teams to align reliability and security objectives with corporate frameworks (NIST, ISO, HIPAA).
- Ensure audit readiness and compliance with HIPAA, GxP, and 21 CFR Part 11 requirements, and maintain ISO 27001 certification.
Observability, Telemetry & Automation
- Lead the observability and telemetry automation team responsible for implementing tools such as Grafana, Prometheus, Datadog, Elastic, and equivalent monitoring ecosystems.
- Enable end‑to‑end system visibility across data pipelines, analytics applications, and infrastructure layers.
- Develop predictive alerting and automated root‑cause correlation through AI/ML‑driven observability tools.
- Partner with Data Operations, QA, and DevOps leaders to integrate observability into CI/CD and data lifecycle automation pipelines.
Kubernetes, Cloud Operations & Disaster Recovery
- Oversee the enterprise Kubernetes and container management platform, ensuring proper configuration, scaling, and cluster health.
- Direct DR and business continuity teams to validate replication, cross‑region backups, and failover mechanisms that meet or exceed RPO/RTO targets.
- Coordinate hybrid‑cloud disaster recovery exercises to verify production resilience and recovery automation.
Cross‑Functional Collaboration & Strategic Alignment
- Represent Site Reliability Engineering on the Core Product Governance Council and senior architecture forums across the Company.
- Collaborate with DevOps, Data Operations, QA, and Supply Chain Analytics leaders to align on reliability, observability, and security objectives.
- Coordinate with Finance and Procurement on vendor management, licensing, and capacity cost optimization.
- Establish transparent communications with executive stakeholders on uptime, service health, and operational risks.
Measurable Outcomes & Success Metrics
- ≥ 99.95% uptime across all critical data and analytics environments.
- Mean Time to Detect (MTTD) < 5 minutes and Mean Time to Resolve (MTTR) < 30 minutes for critical incidents.
- 100% compliance with enterprise security and audit requirements.
- Continuous reduction in unplanned downtime, service degradations, and recurring incidents to ensure optimal service delivery.
- Complete implementation of unified observability dashboards covering all shared services.
- Disaster‑recovery exercises conducted at least twice a year, each validated against documented RPO/RTO objectives.
- Increased SRE adoption across Data Services teams as measured by reliability KPIs.
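To put the uptime target above in concrete terms, the following minimal sketch converts an availability percentage into an allowed‑downtime budget. The function name and the 30‑day and 365‑day measurement windows are illustrative assumptions; the posting does not state the window over which uptime is measured.

```python
# Illustrative sketch: downtime permitted by an availability target.
# The measurement windows (30 and 365 days) are assumptions, not part of the role definition.

def allowed_downtime_minutes(target_pct: float, period_days: float) -> float:
    """Return the minutes of downtime permitted over period_days at target_pct availability."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - target_pct / 100)

monthly = allowed_downtime_minutes(99.95, 30)
yearly = allowed_downtime_minutes(99.95, 365)

print(f"30-day budget:  {monthly:.1f} min")   # 21.6 min
print(f"365-day budget: {yearly:.1f} min")    # 262.8 min
```

Under these assumptions, a single full outage that takes close to the 30‑minute MTTR target to resolve would by itself exceed a 30‑day budget of 21.6 minutes, which is why the detection and resolution targets are best read together with the uptime target.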
Qualifications & Leadership Competencies
- Bachelor's or master's degree in Computer Science, Information Systems, or a related technical field.
- 10+ years of progressive IT operations or infrastructure experience, with at least 5 years leading hybrid‑cloud SRE or reliability teams.
- Expertise in SRE methodologies, reliability frameworks, and hybrid‑cloud operations (Azure, AWS, GCP).
- Advanced knowledge of Linux, Windows Server and/or Unix systems, networking, and Kubernetes orchestration in production environments.
- Hands‑on experience implementing Infrastructure‑as‑Code (Terraform, Helm, Ansible, ArgoCD) and CI/CD pipelines (GitHub Actions experience preferred).
- Deep understanding of IAM, RBAC, ABAC, and enterprise identity governance.
- Proficiency in observability and monitoring tools (e.g., Grafana, Prometheus, Datadog, ELK, Azure Monitor).
- Proven track record leading Security Operations, vulnerability management, and incident response programs.
- Strong communication, leadership, and cross‑functional collaboration skills.
Preferred Certifications
- Certified Kubernetes Administrator (CKA) or Certified Kubernetes Security Specialist (CKS).
- AWS Certified DevOps Engineer or Azure Solutions Architect Expert.
- Certified Information Systems Security Professional (CISSP) or Certified Information Security Manager (CISM).
- ITIL 4 Foundation for Reliability and Service Management.
Language Requirements
Full fluency in English (writing, reading, listening, and speaking) is required. Bilingual proficiency (English and Spanish) is preferred.
Strategic Impact
The Director, Site Reliability Engineering (SRE) Shared Services Leader ensures that every data, analytics, and application platform at Cencora operates with continuous reliability, robust security, and proactive observability. As the organizational SRE champion, this leader drives cultural and technical adoption of reliability engineering principles across all Shared Services. By integrating hybrid‑cloud infrastructure operations with world‑class SRE practices, this leader enables Cencora to deliver uninterrupted, high‑performing analytics services that support global healthcare partners and patients.
Our Commitment
We are united in our responsibility to create healthier futures and value diversity in all its forms. We believe innovation thrives through collaboration, diverse perspectives, and a shared purpose to create healthier futures worldwide.
Equal Employment Opportunity
Cencora is committed to providing equal employment opportunity without regard to race, color, religion, sex, sexual orientation, gender identity, genetic information, national origin, age, disability, veteran status or membership in any other class protected by federal, state or local law.
The company’s continued success depends on the full and effective utilization of qualified individuals. Therefore, harassment is prohibited, and all matters related to recruiting, training, compensation, benefits, promotions and transfers comply with equal opportunity principles and are non‑discriminatory.
Cencora is committed to providing reasonable accommodation to individuals with disabilities during the employment process, which is consistent with legal requirements. If you wish to request accommodation while seeking employment, please call 888.692.2272 or email hrsc@cencora.com. We will make accommodation determinations on a request‑by‑request basis. Messages and emails regarding anything other than accommodation requests will not be returned.