Associate Director, Operations - GPU Cloud

Singtel Group

Singapore

On-site

SGD 120,000 - 180,000

Full time

3 days ago

Job summary

A leading company in telecommunications seeks an Associate Director, Operations for its GPU Cloud segment. This pivotal role involves overseeing the GPU Infrastructure-as-a-Service platform, managing operations, and ensuring compliance with service level agreements while leading a high-performing team. The successful candidate will excel in operational management and possess a deep understanding of cloud infrastructure strategies.

Benefits

Full suite of health and wellness benefits

Ongoing training and development programs

Internal mobility opportunities

Qualifications

Proven track record in managing complex cloud and data centre infrastructure.
Experience in liquid cooling operations preferred.
Strong understanding of hardware infrastructure operation and security.

Responsibilities

Manage GPU Infrastructure and operations to optimize performance and cost.
Lead and mentor operations teams to achieve SLA compliance and operational excellence.
Develop operational strategies for GPU Cloud infrastructure reliability.

Skills

Leadership

Communication

Problem-Solving

Operational Management

Linux Administration

Security Management

Technical Expertise

Associate Director, Operations - GPU Cloud

Lead and manage the GPU Infrastructure-as-a-Service (IaaS) platform. This role oversees the GPU infrastructure, storage infrastructure, and associated services, ensuring seamless integration and operation.

Infrastructure and Resource Management:

Manage the maintenance and operations of the liquid-cooled data centre that hosts the GPU cloud.
Optimize the GPU infrastructure and associated hardware.
Optimize resource allocation to meet the performance requirements of both data centre operations and cloud hardware operations, as well as cost-effectiveness goals.
Lead the operations team to ensure compliance with the SLA requirements of customers and the product.
Enhance system scalability and reliability through automation and continuous improvement.
Enforce industry-standard operational processes, with reference to standards such as ISO 27001 or equivalent, in data centre and cloud operations.

Operational Excellence:

Handle general incidents, including operations management and escalation management across the AI cloud product.
Develop and implement operational strategies to ensure the reliability and efficiency of our GPU Cloud infrastructure.
Collaborate with other departments to streamline processes, enhance customer experience, and meet service level agreements.
Support services and improve the lifecycle of GPU cloud hardware and the data centre environment with monitoring, logging, and alerting through deployment, operation, and refinement.
Establish operations systems and processes (SOPs, EOPs, etc.) and manage daily operational issues.
Apply strong operational management skills to organise internal cross-functional teams and external vendors, ensuring an efficient and resilient operations setup.

Team Management:

Build and lead a high-performing operations team to foster a culture of innovation, collaboration, and continuous improvement.
Set clear goals and objectives, mentor team members, and drive professional development initiatives.
Oversee resource management and allocation to optimize team productivity and effectively meet operation goals.

Security and Compliance:

Lead security incident management processes, focusing on identification, containment, and resolution of threats in the data centre environment and GPU cloud hardware.
Enforce best practices for security and compliance.
Stay abreast of industry security trends and implement measures to safeguard customer data and platform integrity.

Skills for Success

Proven track record of managing and escalating complex cloud and data centre infrastructure issues and leading operation teams.
Experience in liquid cooling operations is an advantage.
Strong understanding of hardware infrastructure operation, security, management, and best practices.
Excellent leadership, communication, and interpersonal skills, with the ability to lead cross-functional teams.
Proficiency in managing customer interactions and improving service delivery to enhance customer experience.
Experience in Linux and hypervisor administration for GPU infrastructure and cloud.
Strong skills in complex technical problem-solving, with a proactive approach to system operation and optimization.
Knowledge of storage technologies and experience in capacity planning, troubleshooting, and data protection.
Experience in GPU and GPU infrastructure management, including configuration, monitoring, and performance.

Rewards that Go Beyond

Full suite of health and wellness benefits
Ongoing training and development programs
Internal mobility opportunities

Your Career Growth Starts Here. Apply Now!
