Head of System & Operation (GPU System)
EPS Consultants Pte Ltd
Kuala Lumpur
On-site
MYR 60,000 - 90,000
Full time
Job summary
A leading company in IT services is seeking a seasoned operations leader to oversee GPU-related services and technology systems. The role involves strategic oversight, team management, and collaboration with diverse departments to enhance service integration. Candidates should have strong analytical and communication skills, coupled with extensive experience managing complex IT operations.
Qualifications
- Proven experience (10+ years) in an operations or technology leadership role within the IT or cloud services industry.
- Strong understanding of GPU technologies and cloud computing principles.
- Hands-on expertise in CPU/GPU clusters and platforms.
Responsibilities
- Oversee the design and implementation of IT systems, ensuring the performance of GPU resources.
- Develop operational strategies for GPU-as-a-Service focusing on efficiency.
- Manage vendor relationships and compliance with service level agreements.
Skills
Analytical skills
Troubleshooting skills
Communication
Interpersonal skills
Strategic thinking
Education
Bachelor’s degree in Computer Science
Tools
Kubernetes
CPU/GPU cluster management
Responsibilities
- Oversee the design, implementation, and maintenance of IT systems that support operational activities, ensuring high availability and performance of GPU resources.
- Provide technical guidance across complex infrastructure projects.
- Develop and execute operational strategies that align with the company’s goals for GPU-as-a-Service, focusing on scalability, efficiency, and reliability.
- Lead and mentor a diverse team of technology professionals, fostering a culture of innovation, accountability, and continuous improvement.
- Manage relationships with key vendors and third-party service providers to ensure compliance with service level agreements (SLAs) and industry standards.
- Identify opportunities for process improvement across operations, and implement best practices to enhance productivity, reduce costs, and improve service quality.
- Work closely with product development, sales, and marketing teams to ensure seamless integration of services and alignment with customer needs.
- Ensure all operations comply with relevant laws, regulations, and industry standards related to data protection and service delivery.
Qualifications
- Bachelor’s degree in Computer Science or a related technical field.
- Proven experience (10+ years) in an operations or technology leadership role within the IT or cloud services industry.
- Strong understanding of GPU technologies and cloud computing principles.
- Demonstrated experience in managing complex IT systems and operational processes.
- Exceptional analytical and troubleshooting skills.
- Solid understanding of Kubernetes environments and the ability to debug them.
- Familiarity with energy-efficient computing and sustainable data center operations.
- Proven ability to manage priorities in a dynamic, fast-paced environment.
- Hands-on expertise in, and comprehensive knowledge of, CPU/GPU clusters and platforms.
- Exceptional communication skills, capable of discussing both technical and non-technical topics with diverse audiences.
- Strong interpersonal skills, with a proven ability to develop professional relationships across business and technical teams.
- Ability to manage multiple projects simultaneously while maintaining attention to detail.
- Knowledge of operating and managing processes in CPU/GPU clusters.
- Strategic thinker with the ability to implement innovative solutions that drive business success.
- Excellent documentation skills to effectively articulate technical designs, issues, procedures, and assessments.