OpenText is seeking a Sr. Site Reliability Administrator to join our team in Mississauga. This role is key in building solutions to enhance the availability and performance of our services. The ideal candidate will possess strong problem-solving skills and a passion for ensuring system stability. You will work collaboratively with multiple teams to resolve issues efficiently, automate processes, and drive innovation in information management.
Hiring Manager: Brian Lawrence
Talent Acquisition Advisor: Draun Raval
Job Code Level: IZ-CLD-P3
AI-First. Future-Driven. Human-Centered.
At OpenText, AI is at the heart of everything we do—powering innovation, transforming work, and empowering digital knowledge workers. We're hiring talent that AI can't replace to help us shape the future of information management. Join us.
YOUR IMPACT
The Sr. Site Reliability Administrator builds solutions that enhance the availability, performance, and stability of OpenText services, and automates repetitive work as part of a cloud ops organization.
This role would be a great fit for someone with creative and innovative problem-solving skills. You will develop and implement solutions that operate at scale. Our teams are empowered and expected to improve our products to deliver a reliable customer experience.
WHAT THE ROLE OFFERS
• Use technical knowledge, creativity, and company practices to reduce the occurrence of incidents by developing proactive monitoring and alerting.
• Respond to incidents in accordance with Service Level Agreements.
• Provide continuous feedback to development teams on system stability, defect analysis, and system enhancements.
• Develop runbooks and patterns to sustain applications in a production environment
• Participate in technical discussions and drive a transition to sustain activities with the development teams
• Work with IT business and development partners to gather input to develop new capabilities in displaying/monitoring/alerting on key performance indicators (KPIs) by tracking business transactions (BT) in real-time
• Partner with application owners to develop creative and effective solutions to mitigate risk and successfully remediate any audit issues, providing quality and timely responses
• Take ownership and accountability for the incident resolution process, participating in RCA and SWAT investigations.
• Plan for validation and verification of changes deployed by infrastructure and development teams.
• Participate in day-to-day real-time advanced-level technical support and troubleshooting on issues reported by the user/customer base.
• Provide guidance in resolving performance-related issues and in designing solutions for technical issues faced by the application.
• Establish and maintain a good relationship with team members, Product Development, Product Management, Customer Service, Client management, and other cross-functional teams.
• Requires rotating shift work as needed.
• Participation in an on-call rotation is required to provide 24x7x365 support.
WHAT YOU NEED TO SUCCEED
• Hands-on experience in Cloud computing (AWS, Google, Azure)
• Ability to understand and maintain scripting software
• Deep understanding of Linux systems
• Good understanding and operational experience with container technologies.
• Good understanding and working experience with microservices and RESTful architecture.
• Strong working knowledge of PaaS or Application operations best practices.
• Good knowledge of scripting and automation (Jenkins, Python, Ansible, Bash scripting)
• Operational understanding or experience with message brokers such as Apache MQ
• Experience in supporting middleware technologies such as Apache, Tomcat, and Spring.
• Experience with at least one scripting language such as shell, Perl, Python, or JavaScript
• Experience with installing and configuring Apache and Tomcat.
• Experience in supporting Java applications built using frameworks such as Spring, Struts, Spark, etc.
• Experience and knowledge in Oracle and Postgres.
• Deep expertise in monitoring distributed systems and application architectures, and the ability to correlate environment conditions and metrics to application events.
• Experience with monitoring tools such as SiteScope, APM, OpsB, and Prometheus
• Knowledge of and familiarity with centralized logging systems such as ArcSight, Splunk, etc.
• Strong understanding of ITIL principles, certification is a plus.
• Passion for "getting under the hood" of systems and technologies to understand their inner workings and fix what needs fixing, including diagnosing and troubleshooting user-facing service incidents and outages
• Experience diagnosing and resolving problems in high-throughput web applications and network services
OpenText's efforts to build an inclusive work environment go beyond simply complying with applicable laws. Our Employment Equity and Diversity Policy provides direction on maintaining a working environment that is inclusive of everyone, regardless of culture, national origin, race, color, gender, gender identification, sexual orientation, family status, age, veteran status, disability, religion, or other basis protected by applicable laws.
If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please submit a ticket at Ask HR. Our proactive approach fosters collaboration, innovation, and personal growth, enriching OpenText's vibrant workplace.