Full Stack Engineer

Highbrow Technology Inc

Remote

BRL 120,000 - 160,000

Full-time

Yesterday

Job summary

A technology-focused organization in Brazil is seeking a skilled developer to join their coding team. The ideal candidate will be expected to write and debug production-quality code while collaborating with engineers and researchers. Responsibilities include designing evaluations, developing data generation tools, and investigating output issues with LLMs. Strong experience with coding tools, Linux, and Docker is essential. This position offers a dynamic environment focusing on precision and collaborative problem-solving.

Qualifications

Experience coding in one or more languages in a professional or research environment.
Proven ability to write and debug production-quality code.
Expertise in designing rigorous evaluations.

Responsibilities

Write, review, and debug code across multiple languages.
Design tasks and evaluation scenarios for coding.
Investigate LLM outputs and identify problems.
Develop scripts, pipelines, and tools for data generation.
Run systematic evaluations to improve model reliability.

Skills

Experience using LLM coding tools (Cursor, Copilot, CodeWhisperer)

Solid experience with Linux + Bash

Strong with Docker

Advanced Git skills

Solid understanding of testing and QA

Ability to reliably overlap with 8am–12pm PT

Availability: 40 hours per week, including 4 hours of overlap with PST.

Role Overview

We’re a coding-focused team at Turing that serves as a research partner for a Frontier AI Lab. Our role is to build coding tasks, evaluations, datasets, and tooling that help train and improve large language models (LLMs).

You’ll write and debug production-quality code, design rigorous evaluations, and build reproducible workflows that generate clean, high-signal data for model training. Attention to detail matters deeply here—small mistakes can cascade into misleading results, so precision and thoroughness are essential. You’ll also collaborate closely with engineers, researchers, and quality owners to align on standards, review work, and continuously raise the quality bar.

If you enjoy solving unusual technical problems, investigating subtle model failures, and working in developer-like environments where correctness, reproducibility, and collaboration matter, this role will keep you very entertained.

What does your day-to-day look like?

Write, review, and debug code across multiple languages
Design tasks and evaluation scenarios for coding, reasoning, and debugging
Investigate LLM outputs and identify hallucinations, regressions, and failure modes
Develop scripts, pipelines, and tools for data generation, scoring, and validation
Produce structured annotations, judgments, and high-quality datasets
Run systematic evaluations that help improve model reliability and reasoning

Required Skills

Experience using LLM coding tools (Cursor, Copilot, CodeWhisperer) and strong hands‑on coding experience (professional or research-based) in one or more of:
Solid experience with Linux + Bash, scripting, and automation
Strong with Docker, reproducible environments, and dev containers
Advanced Git skills (branching, diffs, patches, conflict resolution)
Solid understanding of testing and QA (unit, integration, negative, edge‑case focused)
Ability to reliably overlap with 8am–12pm PT
