
Senior Solution Consultant, Generative AI / Storage Solution

MaiStorage

Puchong

On-site

MYR 150,000 - 200,000

Full time



Job summary

A leading technology solutions provider in Selangor, Malaysia, is seeking a Senior Solution Consultant for Generative AI and storage solutions. The role involves collaborating with clients to design tailored infrastructure, leading technical proposals, and overseeing deployment of high-performance AI systems. Applicants should have a bachelor's degree in Computer Science or a related field, strong knowledge of AI server architecture, and technical skills in Linux and storage solutions. This position offers an opportunity to work at the forefront of AI technology.

Qualifications

  • Solid understanding of the Generative AI landscape and HPC infrastructure.
  • Ability to articulate architectural concepts to C-level executives and IT Directors.
  • Deep familiarity with server components including Enterprise CPUs and Data Center GPUs.

Responsibilities

  • Partner with clients to understand their Gen AI workloads and propose solutions.
  • Act as a bridge between clients and engineering teams to architect AI clusters.
  • Lead technical proposals including BOM, capacity planning, and TCO analysis.

Skills

GPU Architecture
Cluster Management
Advanced Linux command line proficiency
High-Performance Storage
Parallel File Systems
S3-Compatible Object Storage
Docker/Containerization

Education

Bachelor's degree in Computer Science or related field

Tools

Slurm
Kubernetes
RHEL
Ubuntu Server

Job description

Senior Solution Consultant, Generative AI / Storage Solution

Consultative Solution Design: Partner with clients to understand their specific Gen AI workloads, dissecting business goals to propose tailored on‑premise infrastructure solutions (Compute, Storage, and Networking).

Technical Architecture & Collaboration: Act as the bridge between the client and our internal Engineering/R&D teams to architect robust AI clusters. Translate client requirements into technical specifications and feasible system designs.

Infrastructure Sizing & Proposal: Lead the creation of technical proposals, including Bill of Materials (BOM), capacity planning (storage/compute sizing), and total cost of ownership (TCO) analysis.
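The capacity planning and TCO analysis described above can be sketched as back-of-envelope arithmetic. A minimal illustration follows; every drive count, overhead factor, power figure, and price is a hypothetical placeholder for the sketch, not vendor data or a figure from this posting.

```python
# Hypothetical sizing/TCO sketch -- all numbers are illustrative assumptions.

def usable_capacity_tb(raw_tb: float, raid_overhead: float,
                       fs_overhead: float = 0.10) -> float:
    """Usable capacity after RAID parity and filesystem overhead."""
    return raw_tb * (1 - raid_overhead) * (1 - fs_overhead)

def tco_3yr(capex: float, annual_power_kw: float, kwh_price: float,
            annual_support: float) -> float:
    """Simple 3-year TCO: hardware + power + support (cooling/space omitted)."""
    power_cost = annual_power_kw * 24 * 365 * kwh_price * 3
    return capex + power_cost + annual_support * 3

if __name__ == "__main__":
    # e.g. 8 x 15.36 TB NVMe in RAID 6 (2 drives' worth of parity = 2/8 overhead)
    print(round(usable_capacity_tb(8 * 15.36, raid_overhead=2 / 8), 1))
    # e.g. assumed 250k capex, 10 kW average draw, $0.15/kWh, 20k/yr support
    print(round(tco_3yr(250_000, 10, 0.15, 20_000)))
```

A real proposal would of course add cooling, rack space, networking, and staffing costs; the point is only that the BOM and TCO line items reduce to explicit, checkable arithmetic.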

On‑Premise Deployment & Integration: Oversee and assist in the hardware installation, rack configuration, and software stack deployment of high‑performance AI systems and storage servers.

Technical Troubleshooting: Diagnose complex interoperability issues between AI accelerators (GPUs), storage fabrics, and software layers with the assistance of the Engineering team.

Documentation & Knowledge Transfer: Maintain detailed documentation of solution architectures, proof‑of‑concept (PoC) results, and technical resolutions.

Requirements

Education: Bachelor's degree or equivalent in Computer Science, Data Science, Computer Engineering, or a related field.

Gen AI & AI Server Knowledge: Solid understanding of the Generative AI landscape (LLMs and multi-modal models) and the High-Performance Computing (HPC) infrastructure required to train and run them.

Communication: Ability to articulate complex architectural concepts (e.g., cluster networking, storage throughput) to both C-level executives (non-technical) and IT Directors (technical).

Hardware Fluency: Deep familiarity with server components including Server Motherboards, Enterprise CPUs (AMD EPYC/Intel Xeon), Data Center GPUs (NVIDIA H100/A100/L40S), High-speed RAM, and PCIe/NVLink interconnects.

Storage Expertise: Proven understanding of storage requirements for AI, including differences between Block, File, and Object storage, and the importance of IOPS/Throughput in model training.
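The IOPS/throughput point above comes down to one question: can the storage tier deliver data as fast as the GPUs consume it? A minimal sketch of that check, with all workload numbers assumed purely for illustration:

```python
# Back-of-envelope check of whether storage can keep GPUs fed during training.
# All workload figures (samples/s, sample size) are illustrative assumptions.

def required_throughput_gbps(samples_per_sec: float, sample_mb: float) -> float:
    """Sequential read bandwidth (GB/s) needed to stream the dataset."""
    return samples_per_sec * sample_mb / 1024

def io_bound(storage_gbps: float, samples_per_sec: float,
             sample_mb: float) -> bool:
    """True if this storage tier would bottleneck the data loader."""
    return storage_gbps < required_throughput_gbps(samples_per_sec, sample_mb)

if __name__ == "__main__":
    # e.g. a node consuming 2,000 images/s at ~0.5 MB each needs ~0.98 GB/s
    print(round(required_throughput_gbps(2000, 0.5), 2))
    print(io_bound(storage_gbps=0.5, samples_per_sec=2000, sample_mb=0.5))
```

The same arithmetic, run with random-read patterns and small samples, is what turns a throughput requirement into an IOPS requirement.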

Problem Solving: Strong analytical skills to troubleshoot bottlenecks in hardware performance or software compatibility.

Technical Skills
1. AI Server & Compute Infrastructure:
  • GPU Architecture: Knowledge of Multi‑GPU configurations.
  • Cluster Management: Familiarity with HPC scheduling tools (Slurm) or container orchestration (Kubernetes/K8s) for AI workloads.
  • Linux Mastery: Advanced Linux command line proficiency (RHEL, Ubuntu Server), including kernel tuning and driver installation (NVIDIA Drivers, CUDA Toolkit).
2. Storage Server & Data Management:
  • High‑Performance Storage: Understanding of NVMe and NVMe‑oF (NVMe over Fabrics) for low‑latency data access.
  • File Systems: Familiarity with Parallel File Systems used in AI (e.g., Lustre, GPFS/IBM Spectrum Scale, BeeGFS) or high‑performance NAS (ZFS).
  • Object Storage: Knowledge of S3‑compatible object storage for large datasets (e.g., MinIO, Ceph).
  • RAID & Data Protection: Configuration of HW/SW RAID (0, 1, 5, 6, 10) for redundancy and performance optimization.
3. DevOps & MLOps:
  • Docker/Containerization (building and deploying AI containers).
  • Basic understanding of CI/CD pipelines for model deployment.
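The RAID levels listed above trade capacity for redundancy in standard, well-known ways. A small helper encoding the usual usable-capacity formulas (n identical drives of c TB each; real arrays also lose some capacity to filesystem and controller overhead):

```python
# Usable capacity per RAID level, using the standard formulas.
# n = number of drives, c_tb = capacity per drive in TB.

def raid_usable(level: int, n: int, c_tb: float) -> float:
    if level == 0:
        return n * c_tb          # striping only, no redundancy
    if level == 1:
        return c_tb              # n-way mirror: one drive's worth usable
    if level == 5:
        return (n - 1) * c_tb    # one drive's worth of parity
    if level == 6:
        return (n - 2) * c_tb    # two drives' worth of parity
    if level == 10:
        return (n // 2) * c_tb   # striped mirrors: half the drives usable
    raise ValueError(f"unsupported RAID level: {level}")

if __name__ == "__main__":
    # e.g. 8 x 15.36 TB drives: RAID 6 keeps 6 drives' worth usable
    print(raid_usable(6, 8, 15.36))
```

RAID 6 is the common default for large NVMe/HDD data pools (survives two drive failures); RAID 10 trades capacity for write performance.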