Cloud Infrastructure Engineer

Assurity Trusted Solutions

Singapore

On-site

SGD 60,000 - 80,000

Full time

30+ days ago

Job summary

A technology company in Singapore is seeking an infrastructure engineer specialized in Kubernetes and Nvidia GPU technology to design and optimize clusters. The role includes managing infrastructure as code, integrating GPU resources, and collaborating with cross-functional teams to ensure performance. Candidates should have a strong background in computer science and experience with cloud-native tools. A meaningful career with growth opportunities awaits.

Benefits

Learning culture
Career growth opportunities

Qualifications

  • Proven experience in designing and managing on-premises infrastructure solutions.
  • Hands-on experience with cloud-native technologies.
  • Solid knowledge of networking concepts and Kubernetes networking models.

Responsibilities

  • Design, deploy, and optimize Kubernetes clusters using Nvidia software.
  • Collaborate with teams to integrate Nvidia GPU resources effectively.
  • Monitor and troubleshoot issues related to Kubernetes clusters.

Skills

Kubernetes management
Nvidia GPU acceleration
Infrastructure as code
Scripting and automation
Containerization (Docker)
Problem-solving
Collaboration skills

Education

Bachelor's degree in Computer Science or Information Technology

Tools

Terraform
Ansible
Chef
SUSE Rancher
Red Hat OpenShift

Job description

Assurity Trusted Solutions (ATS) is a wholly owned subsidiary of the Government Technology Agency (GovTech). As a Trusted Partner over the last decade, ATS offers a comprehensive suite of products and services spanning infrastructure and operational services, authentication services, governance and assurance services, and managed processes. In a dynamic digital and cyber landscape where trust and collaboration are key, ATS continues to drive mutually beneficial business outcomes by working with GovTech, government agencies and commercial partners to mitigate cyber risks and bolster security postures.

Responsibilities:

  • Design, deploy, and optimize Kubernetes clusters using the Nvidia software stack to support large language model applications.
  • Collaborate with cross-functional teams to integrate Nvidia GPU resources effectively within Kubernetes environments, ensuring optimal performance.
  • Implement and manage infrastructure as code (IaC) for Nvidia GPU configurations, focusing on scalability and high availability.
  • Monitor, troubleshoot, and resolve issues related to both Kubernetes clusters and Nvidia GPU resources to maintain a reliable and performant infrastructure.
  • Stay abreast of industry best practices and emerging technologies related to Kubernetes and the Nvidia GPU ecosystem.
  • Work closely with development teams to automate deployment processes, leveraging Nvidia GPU capabilities, and streamline workflows.
  • Implement security best practices to safeguard Kubernetes environments, Nvidia GPU resources, and sensitive data.
  • Participate in on-call rotation and provide timely response to incidents, minimizing downtime for language model applications.
  • Contribute to capacity planning and performance tuning activities, considering the demands of large-scale language model applications utilizing Nvidia GPU acceleration.
  • Document infrastructure configurations, processes, and procedures, facilitating knowledge sharing and team member onboarding.

Requirements:

  • Bachelor's degree in Computer Science, Information Technology, or a related field.
  • Proven experience in designing, implementing, and managing on-premises infrastructure solutions.
  • Strong knowledge of server virtualisation, storage systems and network infrastructure.
  • Hands-on experience with cloud-native technologies and deployment strategies.
  • Proven experience designing, deploying, and managing Kubernetes clusters on platforms such as SUSE Rancher or Red Hat OpenShift.
  • Strong understanding of containerization (e.g., Docker), orchestration tools such as Kubernetes, and Nvidia GPU acceleration technologies.
  • Proficiency in scripting, automation and configuration management using tools such as Chef, Ansible, Terraform, or similar.
  • Familiarity with infrastructure-as-code principles and tools (e.g., Helm, Kubernetes manifests).
  • Experience with large-scale language model applications, particularly leveraging Nvidia GPU acceleration, is highly desirable.
  • Solid knowledge of networking concepts, Kubernetes networking models, and integration with Nvidia GPU resources.
  • Excellent problem-solving and troubleshooting skills, with a proactive approach to system optimization.
  • Strong communication skills for effective collaboration in a team-oriented, agile environment.

Join us and discover a meaningful and exciting career with Assurity Trusted Solutions!

The remuneration package will be commensurate with your qualifications and experience. Interested applicants, please click "Apply Now".

Thank you for your interest. Please note that only shortlisted candidates will be notified.

By submitting your application, you agree that your personal data may be collected, used and disclosed by Assurity Trusted Solutions Pte. Ltd. (ATS), GovTech and their service providers and agents in accordance with ATS’s privacy statement which can be found at: https://www.assurity.sg/privacy.html or such other successor site.

  • A wholly-owned subsidiary of GovTech.
  • We promote a learning culture and encourage you to grow and learn.