Senior AI Inference Engineer (llama.cpp specialist) - 100% Remote

Tether Operations Limited

Remote

MXN 1,068,000 - 1,425,000

Full-time

Posted yesterday

Job description

A leading technology firm in Ciudad de México seeks an experienced C++ engineer to enhance AI inference engines for edge devices. Responsibilities include deploying machine learning models and collaborating with researchers to move models into production environments. An excellent understanding of deep learning and strong C++ skills are essential. Ideal candidates will hold a related degree and have experience with llama.cpp and similar technologies. This is an opportunity to contribute to AI research and development.

Background

Experience with Watch's and LLMs.

Responsibilities

Deploy machine learning models to edge devices using frameworks like llama.cpp.
Collaborate ... with researchers to assist in transitioning models.
Integrate AI features into existing products with the latest advancements.

Skills

Excellent programming skills in C++

Experience in JavaScript

Strong experience with llama.cpp and ggml

Good understanding of deep learning concepts

Demonstrated ability to rapidly assimilate new technologies

Education

Degree in Computer Science, AI, Machine Learning, or a related field

You’ll work on the C++ layer that powers local AI, porting and enhancing inference engines like llama.cpp, ONNX and similar, to run efficiently on edge devices. Your focus is on the runtime: making models load faster, run leaner, and perform well across different hardware. You’ll ensure that the inference layer is stable, optimized, and ready for integration with the rest of the stack.

This role is for engineers who want to work close to the metal, enabling private and fast on-device AI without relying on cloud infrastructure.

Responsibilities

Deploy machine learning models to edge devices using the frameworks llama.cpp, ggml, and ONNX
Collaborate closely with researchers to assist in coding, training and transitioning models from research to production environments
Integrate AI features into existing products, enriching them with the latest advancements in machine learning

Qualifications

Excellent programming skills in C++; experience in JavaScript is a bonus
Strong experience with the llama.cpp and ggml inference engines, which facilitate the deployment of models to specific GPU architectures
Good understanding of deep learning concepts and model architectures
Experience with Watch's and LLMs
Demonstrated ability to rapidly assimilate new technologies and techniques
A degree in Computer Science, AI, Machine Learning, or a related field, complemented by a solid track record in AI R&D

Important information for candidates

Apply only through our official channels. We do not use third-party platforms or agencies for recruitment unless clearly stated. All open roles are listed on our official careers page: https://tether.recruitee.com/
Verify the recruiter’s identity. All our recruiters have verified LinkedIn profiles. If you’re unsure, you can confirm their identity by checking their profile or contacting us through our website.
Be cautious of unusual communication methods. We do not conduct interviews over WhatsApp, Telegram, or SMS. All communication is done through official company emails and platforms.
Double-check email addresses. All communication from us will come from addresses ending in @tether.to or @tether.io.
We will never request payment or financial details. If someone asks for personal financial information or payment at any point during the hiring process, it is a scam. Please report it immediately.
