Overview
This role combines deep technical skill with leadership and service ownership. You’ll act as a subject matter expert (SME), guiding teams through complex issues, leading infrastructure initiatives, and driving service improvements through automation and modernization.
With 7+ years of experience, you’re not just supporting systems; you’re shaping the future of Oracle’s production services.
Qualifications
Career Level - IC4
Responsibilities
- Lead the installation, maintenance, monitoring, and optimization of production server infrastructure across Oracle Cloud.
- Act as a technical escalation point for highly complex issues, including coordinating cross-functional teams and third-party vendors to resolve incidents.
- Represent the infrastructure team as a technical SME on major incidents, service calls, and cross-org initiatives.
- Contribute to the evolution of SLOs, SLAs, and KPIs for the services you support, driving reliability and performance at scale.
- Standardize, automate, and improve operational processes and system efficiency using your Linux and scripting expertise.
- Own or co-own key service improvement projects, from roadmap ideation to post-deployment impact analysis.
- Assist with patching, OS/application upgrades, bug fixes, and hardware/software lifecycle management.
- Provide rotational on-call support to ensure high availability of services across a 12-hour, 7-day coverage model.
What We’re Looking For
- Expert-level experience with Linux system administration in large-scale or cloud-native environments
- Strong scripting skills in Python and Bash/Shell
- Solid understanding of networking fundamentals
- Proficiency with monitoring tools such as Grafana, New Relic, and Prometheus
- Demonstrated success in supporting production environments with high performance and availability expectations
Nice to Have
- Familiarity with Oracle Cloud Infrastructure (OCI) or similar cloud platforms (AWS, Azure, GCP)
- Experience defining or evolving KPIs, SLOs, and SLAs
- Exposure to infrastructure automation tools or Infrastructure-as-Code (e.g., Terraform, Ansible)