Full Stack Engineer

Highbrow Technology Inc

Remote

BRL 120,000 - 160,000

Full-time

Posted yesterday

Job summary

A cutting-edge technology firm in Brazil is hiring for a coding-focused role in AI research. In this position, you will write and debug production-quality code, design evaluation tasks, and assess the outputs of large language models (LLMs). Close collaboration with engineers and researchers is essential, and a strong coding background is required. If you enjoy solving technical problems and have experience with LLM coding tools, we encourage you to apply.

Responsibilities

  • Write, review, and debug code across multiple languages.
  • Design tasks and evaluation scenarios for coding, reasoning, and debugging.
  • Investigate LLM outputs and identify hallucinations, regressions, and failure modes.
  • Develop scripts, pipelines, and tools for data generation, scoring, and validation.
  • Produce structured annotations, judgments, and high-quality datasets.
  • Run systematic evaluations that help improve model reliability and reasoning.

Skills

  • Experience using LLM coding tools
  • Solid experience with Linux + Bash
  • Strong with Docker
  • Advanced Git skills
  • Solid understanding of testing and QA
  • Ability to reliably overlap with 8am–12pm PT

Job description

Availability: 40 hours per week, including 4 hours of overlap with PST.

Role Overview

We’re a coding-focused team at Turing that serves as a research partner for a Frontier AI Lab. Our role is to build coding tasks, evaluations, datasets, and tooling that help train and improve large language models (LLMs).

You’ll write and debug production-quality code, design rigorous evaluations, and build reproducible workflows that generate clean, high-signal data for model training. Attention to detail matters deeply here—small mistakes can cascade into misleading results, so precision and thoroughness are essential. You’ll also collaborate closely with engineers, researchers, and quality owners to align on standards, review work, and continuously raise the quality bar.

If you enjoy solving unusual technical problems, investigating subtle model failures, and working in developer-like environments where correctness, reproducibility, and collaboration matter, this role will keep you very entertained.

What does your day-to-day look like

  • Write, review, and debug code across multiple languages
  • Design tasks and evaluation scenarios for coding, reasoning, and debugging
  • Investigate LLM outputs and identify hallucinations, regressions, and failure modes
  • Develop scripts, pipelines, and tools for data generation, scoring, and validation
  • Produce structured annotations, judgments, and high-quality datasets
  • Run systematic evaluations that help improve model reliability and reasoning

Required Skills

  • Experience using LLM coding tools (Cursor, Copilot, CodeWhisperer) and strong hands‑on coding experience (professional or research-based) in one or more of:
  • Solid experience with Linux + Bash, scripting, and automation
  • Strong with Docker, reproducible environments, and dev containers
  • Advanced Git skills (branching, diffs, patches, conflict resolution)
  • Solid understanding of testing and QA (unit, integration, negative, edge‑case focused)
  • Ability to reliably overlap with 8am–12pm PT