Voice Recognition Engineer – Browser-Based Speech Interfaces

New York Technology Partners

Remote

USD 90,000 - 120,000

Full time

2 days ago

Job summary

A technology recruitment firm is seeking a Voice Recognition Engineer to develop and enhance voice recognition functionality across multiple browsers. The ideal candidate will have hands-on experience with the Web Speech API and at least one major commercial speech framework. This remote contract position suits candidates with at least 3 years of experience in speech recognition, voice UI, or audio processing, with a focus on optimizing performance and user experience across diverse environments.

Qualifications

  • Hands-on experience with Web Speech API and commercial speech frameworks.
  • 3+ years in speech recognition, voice UI, or audio processing.
  • Understanding of latency, privacy, and security in voice processing.

Responsibilities

  • Develop and optimize voice recognition functionality across browsers.
  • Ensure a consistent user experience across desktop, mobile, and tablet environments.
  • Collaborate to build intuitive voice interactions.

Skills

Familiarity with Web Speech API
Experience in speech recognition
Effective communication with cross-functional teams
Prototyping and problem-solving
Multilingual performance optimization

Tools

ElevenLabs
Deepgram
OpenAI Whisper
Amazon Transcribe

Job description

Position Type: Contract

Location: Remote

Key Responsibilities
  • Develop and optimize voice recognition functionality across Chrome, Edge, Safari, Firefox, and Brave.
  • Ensure consistent performance, compatibility, and user experience across desktop, laptop, mobile, and tablet environments.
  • Customize and extend the Web Speech API and integrate third‑party speech frameworks, including (but not limited to):
      • ElevenLabs (Scribe)
      • Deepgram
      • OpenAI Whisper API
      • Amazon Transcribe / Polly
Performance, Accuracy & Resilience
  • Optimize recognition speed, accuracy, and robustness, especially in noisy or low‑bandwidth environments.
  • Conduct benchmarking and tuning for real‑world usage scenarios across diverse accents, languages, and acoustic conditions.
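
For candidates wondering what accuracy benchmarking might involve in practice, one common metric is word error rate (WER): the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. The posting does not name a metric, so the following is only a minimal sketch under that assumption. The function names `tokenize` and `wordErrorRate` are hypothetical, and the tokenizer simply lowercases and splits on whitespace without stripping punctuation.

```typescript
// Split a transcript into lowercase words (simplifying assumption: whitespace-separated,
// punctuation is not removed).
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
}

// Word error rate = (substitutions + deletions + insertions) / reference word count,
// computed with a word-level Levenshtein distance.
function wordErrorRate(reference: string, hypothesis: string): number {
  const ref = tokenize(reference);
  const hyp = tokenize(hypothesis);
  if (ref.length === 0) {
    return hyp.length === 0 ? 0 : Infinity;
  }
  // prev[j] holds the distance between the reference words seen so far
  // and the first j hypothesis words.
  let prev: number[] = Array.from({ length: hyp.length + 1 }, (_, j) => j);
  for (let i = 1; i <= ref.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[hyp.length] / ref.length;
}
```

Running the same reference recordings through each candidate engine and comparing WER per accent, language, or noise condition is one way to make the tuning work above measurable.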
User Experience & Accessibility
  • Collaborate with product and design teams to build intuitive, inclusive voice interactions.
  • Support configurable speech duration thresholds and accessibility best practices for users with varying abilities.
  • Partner with technical leads and product managers to align voice capabilities with product roadmap.
  • Support client‑facing pilots, demos, and proof‑of‑concept initiatives.
Ideal Candidate Profile
  • API Tailor: Deep familiarity with Web Speech API and at least one major commercial speech‑to‑text platform.
  • Accuracy‑Focused: Passionate about refining speech models for real‑world reliability, speed, and multilingual performance.
  • Collaborative Partner: Communicates effectively with cross‑functional teams (engineering, product, UX).
  • Innovative Builder: Enjoys prototyping, problem‑solving, and elevating voice interaction beyond basic transcription.
Required Qualifications
  • Must have hands‑on experience with Web Speech API + at least one other commercial speech framework.
  • Ability to implement custom logic for error handling, timeout management, speech completion detection, and multilingual support.
  • Minimum of 3 years of experience in speech recognition, voice UI, or audio processing.
  • Demonstrated work with Web Speech API and at least one of the following: ElevenLabs, AssemblyAI, Deepgram, OpenAI Whisper, Google Cloud STT, Azure Speech, or Amazon Transcribe.
  • Understanding of latency, privacy, and security considerations in client‑side voice processing.
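
As an illustration of the timeout management and speech completion detection mentioned in the qualifications above, here is a minimal sketch of a silence-based detector with configurable thresholds. It is not taken from the employer's codebase: the class `CompletionDetector`, its options `silenceMs` and `maxWaitMs`, and the state names are all hypothetical. Timestamps are passed in explicitly so the logic stays independent of any particular browser API.

```typescript
type DetectorState = "waiting" | "speaking" | "complete" | "timed_out";

interface DetectorOptions {
  silenceMs: number; // pause after the last speech event that ends the utterance
  maxWaitMs: number; // give up if no speech starts within this long
}

class CompletionDetector {
  private state: DetectorState = "waiting";
  private lastSpeechAt: number | null = null;
  private readonly startedAt: number;
  private readonly opts: DetectorOptions;

  constructor(startedAt: number, opts: DetectorOptions) {
    this.startedAt = startedAt;
    this.opts = opts;
  }

  // Call whenever the recognizer reports interim or final speech.
  onSpeech(nowMs: number): void {
    if (this.state === "complete" || this.state === "timed_out") return;
    this.state = "speaking";
    this.lastSpeechAt = nowMs;
  }

  // Call periodically (for example from a timer) to advance the state.
  tick(nowMs: number): DetectorState {
    if (this.state === "waiting" && nowMs - this.startedAt >= this.opts.maxWaitMs) {
      this.state = "timed_out";
    } else if (
      this.state === "speaking" &&
      this.lastSpeechAt !== null &&
      nowMs - this.lastSpeechAt >= this.opts.silenceMs
    ) {
      this.state = "complete";
    }
    return this.state;
  }
}
```

In a browser, `onSpeech` could be driven from a `SpeechRecognition` result handler and `tick` from a timer. Exposing `silenceMs` as a setting is one way to give users who speak slowly or pause often more time before an utterance is treated as finished.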
Preferred Qualifications
  • Experience with WebRTC, MediaRecorder API, or AudioContext.
  • Background in natural language understanding (NLU) or voice assistant development.
  • Contributions to open‑source speech or accessibility projects.
Seniority Level

Mid‑Senior level

Employment Type

Contract

Job Function

Information Technology and Engineering

Industries

Research Services
