An innovative firm is looking for an AI Cloud Platform System Engineer to enhance and optimize their AI training and inference platforms. This role involves designing scalable solutions for distributed AI/ML systems, optimizing workload distribution across GPU clusters, and integrating cutting-edge frameworks. You will collaborate with AI researchers to improve model architectures and ensure a resilient platform for both training and production workloads. Join a dynamic team that values collaboration and problem-solving, where your contributions will directly impact the efficiency and effectiveness of AI systems in a fast-paced environment.
Job Title: AI Cloud Platform System Engineer
Position Type: Full-Time
Job Summary
We seek an AI Cloud Platform System Engineer to build, scale, and optimize our LLM training, inference, and data platforms. This role spans distributed training systems, GPU/CPU compute optimization, inference framework optimization, and the data platform that supports training and inference. You will ensure a resilient, cost-efficient platform for both training and production inference workloads, leveraging Kubernetes-native solutions.
Key Responsibilities
Preferred Qualifications
Mid-Senior level
Full-time
Information Technology and Research
Research Services