Overview
This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior AI Inference Engineer in Latin America.
In this role, you will lead the design and deployment of advanced AI inference systems for high-profile clients in Media, Entertainment, Gaming, and Sports. You will be responsible for translating complex, ambiguous business problems into robust, real-time AI architectures capable of interpreting and reasoning about video and multi-modal content. Working across the full project lifecycle—from early discovery and pre-sales to architecture, implementation, and optimization—you will partner with technical teams and clients to deliver scalable, high-performance solutions on modern GPU and cloud infrastructure. This position requires hands-on expertise, innovation, and the ability to communicate complex technical concepts clearly to diverse stakeholders.
Accountabilities
- Architect, implement, and optimize end-to-end AI inference services and agentic pipelines using Python.
- Design autonomous AI agents that can interpret, reason about, and act on video and multi-modal inputs.
- Integrate Vision Language Models (e.g., GPT-4o, Gemini Pro Vision, LLaVA) into production-grade workflows.
- Utilize LLM/agent orchestration frameworks (LangGraph, AutoGen, Semantic Kernel, etc.) to manage complex visual AI tasks.
- Deploy and operate AI services on Kubernetes or similar platforms, ensuring reliability and scalability under heavy workloads.
- Architect distributed systems on AWS, balancing performance, cost, and resilience.
- Optimize workloads for modern NVIDIA GPU architectures (Ampere, Hopper, Blackwell), focusing on real-time, high-throughput media applications.
- Produce clear architecture diagrams and technical documentation for both technical and non-technical audiences.
- Provide technical leadership and guidance to project teams to ensure fidelity to architectural designs and solution goals.
- (Optional) Work with video tooling such as FFmpeg, GStreamer, NVENC/NVDEC, and modern codecs, or deploy AI to edge/hybrid environments.
Requirements
- Extensive professional experience designing and shipping AI/ML systems in production, with strong Python expertise.
- Proven track record of taking AI/ML models from prototype to robust, low-latency production inference services.
- Hands‑on experience building agentic systems, especially with computer vision or multi‑modal inputs.
- Familiarity with Vision Language Model integration and orchestration frameworks for multi‑modal tasks.
- Strong practical experience with Kubernetes and cloud‑native distributed architectures (AWS preferred).
- Knowledge of modern NVIDIA GPU architectures and optimization techniques.
- Product‑oriented mindset: able to align technical solutions with business outcomes and ROI.
- Excellent communication skills for collaborating with technical teams, clients, and C‑level stakeholders.
- Self‑starter, able to work independently in ambiguous or rapidly evolving environments.
- Nice‑to‑have: experience with FFmpeg, GStreamer, NVENC/NVDEC, OpenShift, NVIDIA Holoscan, Mojo, or AI deployment on edge/hybrid/on‑prem environments.
Benefits
- Competitive compensation package.
- Fully remote work within North or South America.
- Exposure to high‑impact projects with leading global clients in Media, Entertainment, Gaming, and Sports.
- Opportunity to work with cutting‑edge AI technologies and modern GPU/cloud infrastructure.
- Professional growth through complex, real‑world problem solving.
- Inclusive and diverse work environment.