Research Engineer - Post Training

Kog AI

Paris

Hybrid

EUR 100 000 - 140 000

Full-time

10 days ago

Job summary

A cutting-edge AI startup in Paris is seeking a talented Research Engineer (Post-Training) to enhance its real-time AI capabilities. You will apply your expertise in fine-tuning large language models (LLMs) in a dynamic, hybrid environment. Ideal candidates have strong proficiency in Python, PyTorch, and data visualization, alongside excellent communication skills. This role offers top-tier compensation, equity, and the chance to work with a highly skilled engineering team aiming to revolutionize the AI landscape.

Benefits

Top-tier compensation
Real ownership (BSPCE)
Unrivaled technical playground
A world-class environment
Impact & autonomy
Prime location & flexibility

Qualifications

  • 2+ years of experience in fine-tuning LLMs to product needs.
  • Strong communication skills to convert use cases into benchmark ideas.
  • Technical coding skills in Python and PyTorch.

Responsibilities

  • Translate business needs into quality standards.
  • Design, implement, and validate post-training recipes.
  • Highlight capabilities and limitations of custom architecture.

Skills

Fine-tuning LLMs
Data visualization
Technical coding in Python
Communication skills

Tools

PyTorch
MLflow

Job description

KOG:

Kog is a European VC-funded startup and real-time AI frontier lab building the world’s fastest AI execution layer, and part of the French Tech 2030 cohort.

We are not just optimizing existing libraries; we are bypassing inefficient abstraction layers to rewrite the rules of AI inference. By coding at the Assembly level on high-end GPUs (starting with the AMD MI300X), we unlock raw performance that standard stacks leave on the table.

Our Mission: To enable true real-time AI. We are targeting 10x performance gains through a combination of low-level GPU mastery and novel model architecture. Our goal is to build the sovereign infrastructure that will power the next generation of collaborative AI agents.

Why join now? We have already achieved a 3x to 10x speedup compared to state-of-the-art alternatives (vLLM, TensorRT-LLM) by making breakthroughs in:

  • Inter-GPU communication & grid synchronization
  • Aggressive kernel fusion
  • Low-level memory access optimization

About the Model Architecture team:

The team strives to deliver extremely low-latency inference models. Next-generation applications keep raising the bar on target performance, and we build and deliver the pipelines to meet these new challenges.

What you’ll do:
  • You will translate business needs into quality standards. You will select among existing benchmarks and develop custom ones.
  • You’ll design, implement, and validate the post‑training recipe.
  • You’ll highlight the capabilities and limitations of our custom architecture from a Research and Product perspective.
  • You’ll help prioritize the next research topics, contributing to the continuous improvement of our models and product.

About you:
Must-have:
  • You have 2+ years of experience in fine‑tuning LLMs to product needs.
  • You have experience in visualizing and understanding data.
  • You have strong communication skills and can convert real‑world use cases into clear benchmark ideas and implementations.
  • You have technical coding skills in Python, PyTorch, and one fine‑tuning framework such as MLflow.

Nice-to-have:
  • You have a deep understanding of compression and fine‑tuning algorithms and can pick the right one in any given scenario.
  • You have worked in an HPC environment (SLURM).

What we offer:
  • Top‑Tier Compensation: We offer a highly competitive salary package (top of the market) tailored to match your expertise and leadership level.
  • Real Ownership (BSPCE): You aren’t just an employee; you are a partner. We offer significant equity to ensure you share in the startup’s success.
  • Unrivaled Technical Playground: Work on the bleeding edge of AI hardware. You will have access to the compute power you need (high‑end clusters) to perform your magic.
  • A World-Class Environment: Join a high-density talent team of 12 engineers (including 5 PhDs). We value peer-to-peer learning, high autonomy, and zero bureaucracy.
  • Impact & Autonomy: As a Lead, you will have a direct seat at the table to shape our engineering culture and roadmap alongside the CEO.
  • Prime Location & Flexibility: WeWork offices in the 13th district (near Station F), the heart of Paris’ tech scene. We operate with a hybrid model, punctuated by our “Paris Weeks” for deep work and team bonding (and great afterworks!).

Feel free to apply if you think you’re up to the task!
