Role: Senior Lead Infrastructure Engineer
Type: Remote (working EST hours)
Clearance: Must be eligible for a security clearance up to Top Secret.
Job Overview
We are seeking a highly experienced Infrastructure Lead to spearhead the design, deployment, and operation of our cloud-native infrastructure. The ideal candidate will have deep expertise in container orchestration (Kubernetes), distributed storage (Ceph), and identity and access security (OAuth, Keycloak).
Key Responsibilities
- Lead the infrastructure team in the design, implementation, and maintenance of our core cloud-native platform, including Kubernetes, Ingress / Egress, and related technologies.
- Drive automation and configuration management, using Helm for packaging, deployment, and lifecycle management of applications on Kubernetes in production.
- Develop and maintain operational tooling, custom integrations, and system automation scripts primarily using Python to streamline deployment pipelines and enhance platform observability.
- Oversee and manage large-scale, resilient storage solutions, with hands-on expertise in administering and optimizing Ceph clusters.
- Design and implement robust Identity and Access Management (IAM) and Single Sign-On (SSO) solutions using Keycloak, OAuth, and LDAP to ensure secure authentication and authorization across all services.
- Collaborate with teams on secure and efficient network architecture, including configuring firewalls and VPNs and managing Ingress and Egress traffic flow.
Requirements
Must-Have
- 10+ years of progressive experience in infrastructure design, implementation, and maintenance, with a strong focus on security and cloud-native environments.
- Experience administering and deploying Kubernetes in production environments.
- Experience developing, managing, and maintaining complex application deployments using Helm charts.
- Experience with distributed, software-defined storage solutions, particularly Ceph.
- Experience with Identity and Access Management (IAM), including Keycloak, OAuth, and LDAP.
- Proficiency in Python for automation, system integration, and operational tasks.
- Experience configuring and managing Ingress controllers and network security.
Nice-to-Have
- Experience with Python for data analysis.
- Knowledge of network security protocols, specifically IPsec.
- Deep administrative experience with Linux operating systems.