Position Summary
As a Network Operations Center (NOC) Systems Specialist, you will be responsible for monitoring the production environment, analysing production issues and alerts, identifying their root causes, and resolving L1 issues. You will also collaborate with cross-functional teams to improve NOC standard operating procedures and train team members on resolving alerts and issues related to applications hosted in the cloud and on-prem.
Key Responsibilities
- Review and understand application architecture in a hybrid cloud environment
- Work with the application development teams to onboard the NOC for the production environment
- Monitor the production environment
- Create and implement resolution processes for Level 1 and Level 2 application production issues
- Monitor, troubleshoot, and resolve Level 1 and Level 2 production issues
- Work in shifts (24x7) to monitor batch jobs and applications hosted in the hybrid cloud production environment
- Stay up to date with industry best practices and emerging technologies to continuously improve NOC operations
- Train other team members on cloud technologies and on resolving issues and alerts
Requirements
- 2-3 years of relevant experience with consistently demonstrated capabilities
- Experience working with Java
- Experience in cloud infrastructure management
- Hands-on experience with Kubernetes for container orchestration
- Understanding of CI/CD tools, with a focus on GitLab CI/CD
- Understanding of batch jobs, Linux systems, and networking
- Moderate fluency in at least one scripting language, such as Bash or an equivalent
- Knowledge of networking fundamentals including TCP/IP, traffic analysis, common protocols, and network diagnostics
- Experience troubleshooting with Dynatrace
- Excellent problem-solving and communication skills
- Ability to work effectively in a collaborative, cross-functional team environment
Education/Experience
- Bachelor’s degree in Information Technology, MIS, Computer Science, or a related field required
- Experience in production environment monitoring and resolving Level 1 and Level 2 issues related to applications and services hosted on on-prem and public cloud infrastructure (preferably Google Cloud)
- Experience identifying the root cause of issues using Dynatrace
- Experience in setting and documenting technology standards for a production support organization