Overview
Full Stack Data Scientist (Azure AI Engineer)
Location: Dubai
Experience: 8+ years (Data Science / AI Engineering / Applied ML)
Job Type: Full-time
Job Summary: We are looking for a highly capable Full Stack Data Scientist / Azure AI Engineer who can build end-to-end AI products: data + ML/DL/CV models + Agentic workflows + APIs + UI + scalable deployment on Kubernetes (AKS). The role requires deep expertise in the Azure AI ecosystem (Azure Machine Learning, Azure AI Foundry, Azure AI Search) and strong hands-on experience building AI agents using LangChain, LangGraph, and/or Microsoft Agent Framework, with Langfuse for tracing, evaluation, and observability. The ideal candidate has shipped production systems with measurable business impact and can operate them reliably through strong MLOps/LLMOps practices.
Responsibilities
- End-to-End AI Product Delivery
  - Own delivery from problem definition → architecture → development → deployment → monitoring → iterative improvements.
  - Translate business needs into robust AI solutions with clear KPIs, timelines, and measurable outcomes.
  - Build AI applications that are secure, scalable, maintainable, and production-ready.
- AI Agents & Agentic Workflows (Must-Have)
  - Design, implement, and orchestrate AI agents capable of planning, tool use, function calling, retrieval, and multi-step execution.
  - Build agent systems using:
    - LangChain for tool/function orchestration, retrieval, and integrations
    - LangGraph for stateful, multi-step, resilient agent workflows
    - Microsoft Agent Framework for enterprise-grade agent patterns and integrations
  - Implement agent patterns: routing, task decomposition, multi-agent collaboration, memory, verification, retries/fallbacks, and human-in-the-loop approvals.
  - Apply security & safety: prompt-injection defenses, tool permissioning, grounding/citations, policy checks, and audit logs.
- LLMOps / Observability / Evaluation (Langfuse)
  - Implement Langfuse (or equivalent) for:
    - prompt and trace logging, latency/cost monitoring
    - dataset-based evaluation, regression testing, and quality gates
    - feedback loops and continuous improvement of prompts/agents
  - Establish evaluation frameworks for RAG/agents: retrieval metrics, answer quality, hallucination checks, and guardrail effectiveness.
- Azure Machine Learning & MLOps (Must-Have)
  - Build/operate ML workflows using Azure Machine Learning:
    - training jobs, compute, environments, pipelines, MLflow tracking
    - model registry and promotion, managed online endpoints
  - Implement CI/CD for model + application releases and MLOps practices: versioning, reproducibility, automated testing, and retraining triggers.
- Azure AI Foundry & Azure AI Search (Must-Have)
  - Build GenAI solutions using Azure AI Foundry (prompt flows/orchestration, deployment integration, evaluation workflows).
  - Implement RAG pipelines using Azure AI Search:
    - ingestion/indexing of structured & unstructured data
    - vector + hybrid search, semantic ranking (where applicable), filtering, and relevance tuning
    - citations, metadata-based access control, and indexing automation
- ML/DL & Computer Vision (Strong Requirement)
  - Develop and deploy strong ML/DL solutions including Computer Vision: classification, detection, segmentation, OCR/document understanding, anomaly/defect detection.
  - Conduct experimentation, tuning, and optimization (performance, robustness, cost).
  - Productionize CV pipelines with monitoring and continuous improvement.
- Backend/API Engineering (FastAPI + Node.js)
  - Build production APIs for models and agents using FastAPI (Python): async, OpenAPI/Swagger, auth, middleware, validation.
  - Build service orchestration and integrations using Node.js where appropriate.
  - Implement secure API patterns: authentication/authorization (Azure AD/RBAC patterns), rate-limiting, caching, and error handling.
- Frontend Engineering (React)
  - Build modern UIs in React for AI applications (agent chat UI, dashboards, workflow screens).
  - Support streaming responses, citations, session memory, feedback capture, and user analytics.
- Kubernetes/AKS Deployment & Operations
  - Containerize services using Docker and deploy on Kubernetes (AKS preferred).
  - Implement scaling, rollouts, secrets/config management, ingress, and reliability patterns.
  - Set up monitoring/telemetry using Azure Monitor/App Insights (or equivalent), alerts, and runbooks.
Qualifications
- Mandatory Certifications (Must)
  - AI-102: Microsoft Certified – Azure AI Engineer Associate
  - DP-100: Microsoft Certified – Azure Data Scientist Associate
- Core Technical Skills
  - Agents/Frameworks: Strong hands-on experience with LangChain, LangGraph, and Microsoft Agent Framework.
  - LLMOps: Strong experience with Langfuse for tracing/evaluation/monitoring (or equivalent tooling, with Langfuse preferred).
  - Azure: Azure ML, Azure AI Foundry, Azure AI Search; plus Key Vault, Storage, App Insights/Monitor as needed.
  - Programming: Strong Python; API development with FastAPI; Node.js for services/integrations.
  - Frontend: React for production UI development.
  - ML/DL/CV: Proven hands-on depth in ML/DL and Computer Vision.
  - Deployment: Docker + Kubernetes/AKS.
  - Data: Strong SQL; experience with structured + unstructured data.
Preferred Qualifications
- Experience in real estate / construction domain AI use cases (valuation, forecasting, risk, customer support automation).
- Exposure to graph databases (e.g., Neo4j) and vector search/vector databases for AI applications.
- Extra certifications (nice-to-have): Azure Fundamentals (AZ-900), Azure Developer (AZ-204), Kubernetes (CKA/CKAD), Databricks ML.
What Success Looks Like
- Delivered production-grade AI solutions end-to-end: data → model → agentic workflow → API → UI → AKS deployment → monitoring.
- Established strong LLMOps with Langfuse: traceability, evaluation, cost controls, and reliability improvements.
- Built reliable, secure, observable systems with measurable business impact (time saved, accuracy gains, automation rate, cost reduction).
- Demonstrated strong ownership from POC to production and post-launch iteration.