Senior AI Agent Engineer - Voice AI

Zendesk

London

Hybrid

GBP 80,000 - 100,000

Full time

Yesterday

Job summary

A leading technology company in London is seeking a Senior Voice AI Agent Engineer to innovate in the field of conversational AI. You will design and develop voice-first AI agents, integrating real-time STT and TTS services. The ideal candidate brings expertise in voice AI applications and can communicate complex concepts to diverse stakeholders. This position offers a hybrid work model, requiring some onsite presence.

Benefits

Flexible work schedule
Diversity and inclusion programs
Community engagement opportunities

Qualifications

  • Strong experience building multi-step, tool-using agents.
  • Experience building low-latency, streaming voice applications.
  • Expertise in integrating and managing real-time STT/TTS models.

Responsibilities

  • Design and develop scalable voice-first AI agents using Python.
  • Integrate Speech-to-Text (STT) and Text-to-Speech (TTS) services.
  • Work cross-functionally with product managers and engineers.

Skills

Voice AI Expertise
Tool Integration & APIs
Performance Optimization
Programming & Deployment

Education

M.S. / Ph.D. in Computer Science, NLP, Machine Learning

Tools

Python
FastAPI
AWS

Job description

Overview

The Agentic Tribe is revolutionizing the chatbot and voice assistance landscape with Gen3, a cutting-edge AI Agent system that's pushing the boundaries of conversational AI. Gen3 is not your typical chatbot; it’s a goal-oriented, dynamic, and truly conversational system capable of reasoning, planning, and adapting to user needs in real time. Leveraging a multi-agent architecture and advanced language models, Gen3 delivers personalized and engaging user experiences, going far beyond scripted interactions to handle complex tasks and “off-script” inquiries with ease.

We are seeking a passionate and experienced Senior AI Agent Engineer with a strong focus on Voice AI to join our team. In this role, you will be dedicated to innovating at the forefront of conversational AI, engineering intelligent, autonomous agents that can listen, understand, and speak with human-like fluidity.

You will build the cognitive architecture for our voice applications, creating systems that can reason, plan, and execute complex tasks through seamless, low-latency spoken dialogue. A key part of your role will be to effectively communicate complex technical concepts to both technical and non-technical stakeholders.

Responsibilities
  • Design and develop robust, stateful, and scalable voice-first AI agents using Python, specifically optimized for real-time voice interactions, managing turn-taking, interruptions, and low-latency responses.
  • Integrate best-in-class real-time Speech-to-Text (STT), Text-to-Speech (TTS), and Voice Activity Detection (VAD) services to create a seamless conversational flow.
  • Connect voice agents with existing enterprise systems, databases, and third-party APIs to create powerful, end-to-end automated workflows initiated and managed through voice.
  • Establish and own evaluations of voice agent performance and behavior, and iterate over time to systematically improve performance, reliability, and the overall user experience.
  • Build end-to-end conversational flows with reasoning, planning, and dynamic tool use — beyond pre-scripted voice experiences.
  • Work cross-functionally with product managers, ML scientists, and engineers to deeply understand user needs and voice interaction goals.
  • Implement fallback, recovery, and error-handling strategies to deal with noisy audio input or speech recognition inaccuracies.
  • Define and track voice-specific evaluation metrics (e.g., word error rate, latency, conversational naturalness).
  • Develop observability tools and guardrails to monitor performance, ensure safety, and handle edge cases in spoken interactions.
  • Document development, architecture decisions, and research findings to share knowledge across the team.
Qualifications
  • LLM-Oriented System Design: Strong experience building multi-step, tool-using agents (e.g., LangChain, AutoGen). Familiar with prompt engineering, context management, and reasoning strategies like Chain-of-Thought and ReAct.
  • Voice AI Expertise:
    • Experience building low-latency, streaming voice applications. Expertise in integrating and managing real-time STT/TTS models and APIs. Proficient with techniques for Voice Activity Detection (VAD), noise suppression, and implementing robust barge-in/interruption logic.
    • Experience integrating third-party voice AI APIs, including Speech-to-Text (STT) and Text-to-Speech (TTS) services from providers such as OpenAI, Deepgram, and ElevenLabs.
    • Understanding of latency, timing, and streaming audio constraints.
  • Tool Integration & APIs: Comfortable connecting agents to external APIs, tools, and databases in secure environments.
  • RAG (Retrieval-Augmented Generation): Building pipelines with vector stores, chunking strategies, and hybrid retrieval.
  • Evaluation & Observability: Implementing and using monitoring tools and evaluation frameworks to score AI Agents.
  • Safety & Reliability: Familiarity with techniques for prompt injection defense, guardrails, and failover logic.
  • Performance Optimization: Token budget and latency management using caching, model routing, etc.
  • Programming & Deployment: Expert in Python, FastAPI, and LLM SDKs. Experience deploying AI apps to cloud platforms (AWS, GCP, Azure) using CI/CD best practices.
Nice-to-have
  • M.S. / Ph.D. in Computer Science, NLP, Machine Learning, or a related field.
  • Background in spoken dialogue systems or conversational UX design.
  • Familiarity with real-time streaming architecture (e.g., WebRTC, gRPC, socket.io).
  • Multilingual ASR/TTS pipeline experience.
About Zendesk

Zendesk builds software for better customer relationships. It empowers organizations to improve customer engagement and better understand their customers. Zendesk products are easy to use and implement. They give organizations the flexibility to move quickly, focus on innovation, and scale with their growth.

More than 100,000 paid customer accounts in over 150 countries and territories use Zendesk products. Based in San Francisco, Zendesk has operations in the United States, Europe, Asia, Australia, and South America.

Interested in knowing what we do in the community? Check out the Zendesk Neighbor Foundation to learn more about how we engage with, and provide support to, our local communities.

Zendesk is an equal opportunity employer, and we’re proud of our ongoing efforts to foster diversity & inclusion in the workplace. Individuals seeking employment at Zendesk are considered without regard to race, color, religion, national origin, age, sex, gender, gender identity, gender expression, sexual orientation, marital status, medical condition, ancestry, physical or mental disability, military or veteran status, or any other characteristic protected by applicable law.

By submitting your application, you agree that Zendesk may collect your personal data for recruiting, global organization planning, and related purposes. Zendesk's Candidate Privacy Notice explains what personal information Zendesk may process, where Zendesk may process your personal information, its purposes for processing your personal information, and the rights you can exercise over Zendesk’s use of your personal information.

Hybrid: In this role, our hybrid experience is designed at the team level to give you a rich onsite experience packed with connection, collaboration, learning, and celebration - while also giving you flexibility to work remotely for part of the week. The person in this role must attend our local office for part of the week. The specific in-office schedule is to be determined by the hiring manager.
