As part of the global Oracle Cloud Strategic Solutions Engineering team, you will be continually challenged and have an opportunity to contribute to solution success every day.
The role combines production platform ownership with engineering. You will solve meaningful technical problems, identify improvements, and implement your recommendations. You will also work directly with senior developers on projects and help blur the lines between traditional system operations and development support.
As a Service Reliability Engineer, you will be responsible for defining and deploying key Services with deep focus on architecture, production operations, capacity planning, performance management, deployment, and release engineering. You will work with multiple cross-functional teams helping deliver new and outstanding experiences to our collaborators while ensuring reliability and performance.
In this role, you will need to:
- Take ownership of the architecture, analysis, design, implementation and production operations of a wide array of Core System Framework solutions
- React to production deficiencies by continuously implementing automation, self-healing, and real-time monitoring for production systems
- Be a strong contributor to the development of platform services, including architecture, provisioning, configuration, deployment, and support
- Partner with the distributed team in prototyping new platform services
- Stay informed of new technologies
- Innovate
Responsibilities
The ideal candidate will have the following:
- Experience in running infrastructure/applications built on any two of the following technology groups:
  - Oracle Fusion Middleware stack, especially WebLogic, SOA, or other J2EE application servers, web servers, etc. Requires solid hands-on experience and deep insight into JVM internals, with the ability to monitor them through automation
  - Proven operations experience with the Linux platform (e.g. RHEL, OEL), including administration, management, and troubleshooting
  - A wide array of technologies for scripted and orchestrated automation
  - Open-source technologies: Hadoop, Cassandra, Big Data, Docker, etc.
- Strong communication and analytical skills
- Able to accurately estimate efforts and deliver on time
- Experience with agile processes and general understanding of product development
- Understanding and experience with CI/CD practices
- Familiarity with security practices in web application delivery
- Deep understanding of virtualization solutions and Cloud services
- Strong knowledge of Linux-based OS internals
- Software development experience on any platform (Oracle, AWS, Azure…)
- Experience with configuration management tools
- Solid understanding of internet protocols
- Experience troubleshooting network services
- Proven capability in designing, implementing, or supporting high-performance, large-scale systems
- Scripting experience in Perl, shell, Python, etc.
- Able to communicate difficult problems and complex solutions clearly and effectively
- Excellent analytical and problem-solving ability
- Ability to thrive within a globally distributed engineering team
- Quick learner with the ability to instruct and mentor others
- Self-motivated, with the ability to prioritize and multi-task
Desired Skills and Experience
- Prior experience as a Cloud Operations Engineer or DevOps Engineer
- Development background in any discipline is helpful
- Experience with containerization platforms (Kubernetes, Docker) and microservices
- Experience troubleshooting performance problems across the network, web tier, mid-tier, and database
- Experience with automated service deployment tools
- Experience with automated configuration management tools (such as Chef, Ansible, Puppet)