Responsibilities
Platform Development and Evangelism
- Build scalable, customer-facing AI platforms.
- Evangelise the platform with customers and internal stakeholders.
- Ensure platform scalability, reliability, and performance to meet business needs.
Machine Learning Pipeline Design
- Design ML pipelines for experiment management, model management, feature management, and model retraining.
- Implement A/B testing of models.
- Design APIs for model inferencing at scale.
- Apply proven expertise with MLflow, SageMaker, Vertex AI, and Azure AI.
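The A/B testing responsibility above can be sketched as a deterministic traffic split between model variants; the variant names and the 10% treatment share are assumptions for illustration, not part of the role's actual stack:

```python
import hashlib

# Hypothetical sketch of deterministic A/B bucketing for model variants.
# Variant names and the 90/10 split are illustrative assumptions.

def assign_variant(user_id: str, treatment_share: float = 0.1) -> str:
    """Hash the user id into [0, 1] and route to 'treatment' or 'control'.

    Hash-based assignment keeps each user on the same model variant
    across requests, which A/B tests of models typically require.
    """
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF
    return "treatment" if bucket < treatment_share else "control"

# The same user always lands in the same bucket:
assert assign_variant("user-42") == assign_variant("user-42")
```

In production this routing would sit in front of the model-inference API, with the assignment logged alongside each prediction so metrics can be compared per variant.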
LLM Serving and GPU Architecture
- Serve as a subject-matter expert (SME) in LLM serving paradigms.
Model Fine-Tuning and Optimisation
- Demonstrate proven expertise in model fine-tuning and optimisation techniques.
- Improve model latency and accuracy.
- Reduce the training and resource requirements for fine-tuning large language models (LLMs) and large vision models (LVMs).
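One common way to cut fine-tuning cost, as the bullets above call for, is a LoRA-style low-rank update. A minimal plain-Python sketch follows; the matrix sizes and rank are illustrative assumptions (real implementations use GPU tensors and libraries such as PEFT):

```python
import random

# Hypothetical LoRA-style sketch: instead of updating a frozen d x d weight
# matrix W, train a low-rank delta B @ A with rank r << d, shrinking the
# number of trainable parameters from d*d to 2*d*r.

def matmul(X, Y):
    """Plain-Python matrix multiply."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def lora_forward(W, A, B, x):
    """Apply (W + B @ A) to input vector x."""
    delta = matmul(B, A)
    Wp = [[w + d for w, d in zip(wr, dr)] for wr, dr in zip(W, delta)]
    return [sum(w * xi for w, xi in zip(row, x)) for row in Wp]

d, r = 8, 2
random.seed(0)
W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]     # frozen base weights
A = [[random.gauss(0, 0.01) for _ in range(d)] for _ in range(r)]  # r x d, trainable
B = [[0.0] * r for _ in range(d)]                                  # d x r, zero-initialised

x = [1.0] * d
# Zero-initialised B means the delta starts at zero, so the adapted model
# initially matches the frozen base model exactly:
assert lora_forward(W, A, B, x) == [sum(row) for row in W]
print(f"trainable params: {2 * d * r} vs full: {d * d}")
```

The resource saving comes directly from the parameter count: only A and B are trained (2·d·r values) while W stays frozen, which is why rank r is kept far smaller than d.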
LLM Models and Use Cases
- Have extensive knowledge of different LLMs.
- Provide insights on the applicability of each model based on use cases.
- Deliver end-to-end solutions, from engineering to production, for specific customer use cases.
DevOps and LLMOps Proficiency
- Proven expertise in DevOps and LLMOps practices.
- Knowledgeable in Kubernetes, Docker, and container orchestration.
- Deep understanding of LLM orchestration frameworks such as Flowise, Langflow, and LangGraph.
Requirements
- LLM: Hugging Face OSS LLMs, GPT, Gemini, Claude, Mixtral, Llama.
- LLM Ops: MLflow, LangChain, LangGraph, Langflow, Flowise, LlamaIndex, SageMaker, Amazon Bedrock, Vertex AI, Azure AI.
- Databases/Data warehouses: DynamoDB, Cosmos DB, MongoDB, RDS, MySQL, PostgreSQL, Aurora, Spanner, Google BigQuery.
- Cloud Knowledge: AWS/Azure/GCP.
- DevOps (Knowledge): Kubernetes, Docker, Fluentd, Kibana, Grafana, Prometheus.
- Cloud Certifications (Bonus): AWS Certified Solutions Architect - Professional, AWS Certified Machine Learning - Specialty, Azure Solutions Architect Expert.
- Proficient in Python, SQL, and JavaScript.
- LLM Serving and GPU Architecture: Deep knowledge of GPU architectures.
- Expertise in distributed training and serving of large language models.
- Proficient in model- and data-parallel training using frameworks like DeepSpeed and serving frameworks like vLLM.
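One serving paradigm this requirement points at, continuous (iteration-level) batching as popularised by engines such as vLLM, can be sketched in stdlib Python. The request IDs, token counts, and batch cap below are illustrative assumptions, not a real serving engine:

```python
from collections import deque

# Hypothetical sketch of continuous batching: finished sequences leave the
# batch after every decode step and waiting requests are admitted
# immediately, instead of waiting for the whole batch to drain.

def serve(requests, max_batch=2):
    """requests: list of (request_id, tokens_to_generate).

    Returns the order in which requests complete and the total number of
    decode steps taken.
    """
    waiting = deque(requests)
    running = {}  # request_id -> tokens still to generate
    completed, steps = [], 0
    while waiting or running:
        while waiting and len(running) < max_batch:  # admit new work each step
            rid, n = waiting.popleft()
            running[rid] = n
        for rid in list(running):                    # one decode step per sequence
            running[rid] -= 1
            if running[rid] == 0:
                completed.append(rid)
                del running[rid]                     # frees a batch slot immediately
        steps += 1
    return completed, steps

done, steps = serve([("a", 1), ("b", 3), ("c", 1)])
# "a" finishes on the first step, so "c" is admitted without waiting for "b".
```

The design point this illustrates is why continuous batching improves GPU utilisation for LLM serving: short requests never sit behind long ones in the same static batch.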