Job Title: AI/CIO Engineer
Working Hours: Monday – Friday, 8.30am – 6.00pm
Duration: 24 Months
Job Function: IT (Information Technology)
Job Description
Development and Deployment of AI Products and AI-Driven Backend Services
- Design and develop backend API services for AI-powered functionalities that are reusable and scalable to support business use cases (e.g. OCR, document parsing, embedding model APIs)
- Deploy and manage large-scale LLMs (e.g., LLaMA, Mistral, GPT-based models) in containerized environments using tools like Kubernetes (Red Hat OCP preferred) and vLLM, and integrate them into RESTful or gRPC API endpoints.
- Package, version, document, and deploy AI/ML models and services using MLOps frameworks like MLflow, Kubeflow, or an enterprise AI/ML tool such as Dataiku.
Inference Optimization
- Tune inference performance using model quantization, tensor parallelism, low-level optimization libraries (e.g., TensorRT, ONNX Runtime, DeepSpeed).
- Implement dynamic batching and request multiplexing for low-latency, high-throughput serving.
- Profile and monitor model inference workloads to identify and remove performance bottlenecks.
AI Infrastructure Management
- Design and manage scalable GPU/accelerator infrastructure (on-prem) for AI training and inference workloads.
- Maintain efficient GPU job scheduling with tools like Red Hat OCP.
Security and Compliance
- Embed security throughout the AI deployment lifecycle including model validation, image signing, runtime protection, and API security.
- Ensure infrastructure complies with enterprise security standards (e.g., NIST, ISO 27001).
- Collaborate with security and compliance teams to perform threat modeling and secure deployment assessments.
Specific Requirements
- Bachelor’s or Master’s degree in Computer Science, Mathematics, Statistics, or a related field
- 5+ years in DevOps/MLOps, with 2+ years focused on AI/ML systems.
- Minimum 3 years’ experience in data analytics/data science.
- Strong understanding of AI/MLOps practices and tools
- Hands-on experience deploying and optimizing LLMs or transformer-based models in production.
- Proficiency in container technologies (preferably Red Hat OCP).
- Strong scripting and automation skills (Python, Bash, etc.).
- Familiarity with inference optimization techniques (quantization, batching, parallelism).
- Solid understanding of GPU infrastructure and performance tuning.
- Deep knowledge of secure software development practices and DevSecOps tooling
Apply now via MyCareersFuture
Only shortlisted candidates will be contacted.
By submitting your resume or personal data, you consent to BGC Group Pte Ltd collecting, using, and disclosing your personal data to our clients and partners for the purpose of evaluating your suitability for job opportunities and related recruitment services. You acknowledge that you have read, understood, and agree to our Privacy Policy for Job Applicants, available at https://bgcgroup.com/notice-for-job-applicants.
Int Ref: GP-JO-27568
BGC Group Pte Ltd (Outsourcing Division)
EA 05C3053