
A leading technology firm in Singapore is looking for an experienced individual to join their team in developing large-model training systems and optimizing AI infrastructure. Applicants should have a Master's degree in Computer Science or a related field and more than 3 years of relevant experience in distributed systems and AI infrastructure. The role carries significant responsibility for system architecture and production-level reliability, along with hands-on work with GPU clusters and emerging training paradigms.
Join our team to scale our next-generation large-model training systems and AI infrastructure. This role sits at the intersection of distributed systems, GPU clusters, networking, and large-scale model training, with end-to-end ownership from system architecture to production-level reliability. It also involves cross-institution collaboration and long-term technical strategy.