Job Search and Career Advice Platform


Senior AI Engineer (Agentic Systems & Inference)

COGNNA

Al Khobar

On-site

SAR 200,000 - 300,000

Full time

Yesterday


Job summary

A leading technology firm in Saudi Arabia is looking for a Senior AI Engineer to architect autonomous systems and high-scale inference infrastructures. You will design multi-agent systems, lead the fine-tuning of specialized models, and ensure robust production inference services. The ideal candidate has over 5 years of experience in AI/ML engineering, a degree in Computer Science, and is proficient in technologies like Google Cloud and Kubernetes. This role includes competitive compensation and a chance to make a global impact.

Benefits

Competitive package - Salary + equity options
Onsite experience in Riyadh
Growth-focused work environment

Qualifications

  • 5+ years in AI/ML Engineering or Backend Systems.
  • Expertise in building autonomous systems.
  • Proficient in using Google Cloud services.

Responsibilities

  • Design multi-agent systems using frameworks like Google ADK.
  • Lead architectural strategy for fine-tuning models.
  • Architect high-concurrency inference services.

Skills

Autonomous orchestration design
Cognitive optimization
Distributed inference systems
Programming in Python
KV-cache management
Cloud AI Proficiency

Education

B.S./M.S. in Computer Science or related fields

Tools

Google Cloud (Vertex AI)
Kubernetes

Job description

As a Senior AI Engineer, you will be the primary architect of Cognna’s autonomous agent reasoning engine and the high-scale inference infrastructure that powers it. You will be responsible for building production-grade reasoning systems that proactively plan, use tools, and collaborate. You will own the full lifecycle of our specialized security models, from domain-specific fine-tuning to architecting the distributed, high-throughput inference services that serve as the core security intelligence of our platform.

Key Responsibilities

Agentic Architecture & Multi-Agent Coordination
  • Autonomous Orchestration: Design stateful, multi-agent systems using frameworks like Google ADK.
  • Protocol-First Integration: Architect and scale MCP servers and A2A interfaces, ensuring a decoupled and extensible agent ecosystem.
  • Cognitive Optimization: Develop lean, high-reasoning microservices for deep reasoning, optimizing context token usage to maintain high planning accuracy with minimal latency.

Model Adaptation & Performance Engineering
  • Specialized Fine-Tuning: Lead the architectural strategy for fine-tuning open-source and proprietary models on massive cybersecurity-specific telemetry.
  • Advanced Training Regimes: Implement Quantization-Aware Training (QAT) and manage Adapter-based architectures to enable the dynamic loading of task-specific specialists without the overhead of full-model swaps.
  • Evaluation Frameworks: Engineer rigorous, automated evaluation harnesses (including human annotations and AI-judge patterns) to measure agent groundedness and resilience against the Security Engineer’s adversarial attack trees.

Production Inference & MLOps at Scale
  • Distributed Inference Systems: Architect and maintain high-concurrency inference services using engines like vLLM, TGI, or TensorRT-LLM.
  • Infrastructure Orchestration: Own the GPU/TPU resource management strategy.
  • Observability & Debugging: Implement deep-trace observability for non-deterministic agentic workflows, providing the visibility needed to debug complex multi-step reasoning failures in production.
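
To give a flavor of the "high-concurrency inference services" above: the core idea behind engines like vLLM or TGI is batching many in-flight requests into shared forward passes. The following is a toy sketch only, assuming a hypothetical `BatchingServer` and a stand-in `fake_model`; it is not any real engine's API:

```python
import asyncio

class BatchingServer:
    """Toy dynamic batcher: collects concurrent requests into one model call."""

    def __init__(self, model_fn, max_batch=8, window_ms=5):
        self.model_fn = model_fn      # batched call: list[str] -> list[str]
        self.max_batch = max_batch
        self.window_ms = window_ms
        self.queue = asyncio.Queue()

    async def infer(self, prompt):
        # Each caller enqueues its prompt and awaits a per-request future.
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((prompt, fut))
        return await fut

    async def run_batches(self):
        while True:
            # Block for the first request, then gather more for a short window.
            batch = [await self.queue.get()]
            while len(batch) < self.max_batch:
                try:
                    item = await asyncio.wait_for(
                        self.queue.get(), timeout=self.window_ms / 1000)
                except asyncio.TimeoutError:
                    break
                batch.append(item)
            # One "forward pass" serves the whole batch; resolve each future.
            outputs = self.model_fn([p for p, _ in batch])
            for (_, fut), out in zip(batch, outputs):
                fut.set_result(out)

def fake_model(prompts):
    # Stand-in for a batched model forward pass.
    return [p.upper() for p in prompts]

async def main():
    server = BatchingServer(fake_model)
    worker = asyncio.create_task(server.run_batches())
    results = await asyncio.gather(*(server.infer(p) for p in ["a", "b", "c"]))
    worker.cancel()
    return results
```

Running `asyncio.run(main())` returns the three uppercased prompts in order; real engines add continuous (token-level) batching, paged KV-caches, and GPU scheduling on top of this basic pattern.
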

Advanced RAG & Semantic Intelligence
  • Hybrid Retrieval Architectures: Design and optimize RAG pipelines involving graph-like data structures, agent-based knowledge retrieval and semantic searches.
  • Memory Management: Architect episodic and persistent memory systems for agents, allowing for long-running security investigations that persist context across sessions.
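
The persistent-memory idea above can be sketched minimally: an agent writes episodes to durable storage so a later session reloads the same investigation context. `EpisodicMemory` and its methods are hypothetical, and the keyword `recall` is a naive stand-in for semantic search:

```python
import json
import os
import tempfile

class EpisodicMemory:
    """Toy persistent episodic memory backed by a JSON file."""

    def __init__(self, path):
        self.path = path
        self.episodes = []
        if os.path.exists(path):          # resume a prior session, if any
            with open(path) as f:
                self.episodes = json.load(f)

    def record(self, role, content):
        # Append an episode and persist immediately.
        self.episodes.append({"role": role, "content": content})
        with open(self.path, "w") as f:
            json.dump(self.episodes, f)

    def recall(self, keyword):
        # Naive keyword match; a real system would use embeddings/semantic search.
        return [e for e in self.episodes if keyword in e["content"]]

# Session 1: the agent records a finding, then the process "ends".
path = os.path.join(tempfile.mkdtemp(), "memory.json")
m1 = EpisodicMemory(path)
m1.record("agent", "observed lateral movement from host-17")

# Session 2: a fresh instance reloads the same investigation context.
m2 = EpisodicMemory(path)
hits = m2.recall("host-17")
```

A production version would layer episodic (per-investigation) and persistent (cross-investigation) stores and retrieve by semantic similarity rather than substring match.
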

Requirements
  • Experience: 5+ years in AI/ML Engineering or Backend Systems. Must have contributed to a large-scale AI/ML inference service in production.
  • Education: B.S./M.S. in Computer Science, Engineering, AI, or related fields.
  • Inference Orchestration: Hands-on experience with KV-cache management, quantization formats such as AWQ/FP8, and distributed serving across multi-node GPU clusters.
  • Agentic Development: Expert in building autonomous systems using Google ADK/LangGraph/LangChain and experienced with AI observability frameworks like LangSmith or Langfuse. Hands-on experience building AI applications with MCP and A2A protocols.
  • Cloud AI Native: Proficiency in Google Cloud (Vertex AI), including custom training pipelines, high-performance prediction endpoints, and the broader MLOps suite.
  • Programming: Strong Python, plus experience with high-performance backends (Go/C++) for inference optimization.
  • CI/CD: You are comfortable working in a Kubernetes-native environment with automated build and deployment pipelines.
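
As one small illustration of what "KV-cache management" involves, here is a toy LRU eviction policy over per-sequence cache entries. `KVCachePool` is hypothetical; real engines (e.g. vLLM's PagedAttention) instead manage fixed-size blocks of GPU memory:

```python
from collections import OrderedDict

class KVCachePool:
    """Toy LRU pool: evicts the least-recently-used sequence's KV entries."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # seq_id -> KV tensor stand-in

    def touch(self, seq_id, kv):
        # Mark a sequence as recently used (or admit it), then evict as needed.
        if seq_id in self.entries:
            self.entries.move_to_end(seq_id)
        self.entries[seq_id] = kv
        evicted = []
        while len(self.entries) > self.capacity:
            old_id, _ = self.entries.popitem(last=False)
            evicted.append(old_id)  # evicted sequences must recompute their KV
        return evicted
```

With capacity 2, touching sequences `s1`, `s2`, `s1` again, then `s3` evicts `s2`, since `s1` was refreshed more recently.
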

Benefits
  • Competitive Package – Salary + equity options + performance incentives
  • Onsite Experience – Work from our office in Riyadh, KSA
  • Team of Experts – Work with designers, engineers, and security pros solving real-world problems
  • Growth-Focused – Your ideas ship, your voice counts, your growth matters
  • Global Impact – Build products that protect critical systems and data