Context is turning large language models into autonomous co‑workers. Our productivity suite lets AI plan, reason, and execute across 300+ business tools. We've raised $11M (Seed, Lux Capital) at a $70M valuation and ship GA this June. We're hiring engineers who live and breathe LLMs—building the SDKs, runtimes, and eval harnesses that make LLM agents reliable at scale.
If you sketch agent flows in Excalidraw or obsess over token‑level traces, let's talk.
Impact: Frontier playground
50M‑token memory → agents with persistent cognition far beyond frontier‑model context windows.
Full‑stack autonomy: Agents open docs, write slides, hit APIs, and re‑plan—no brittle RPA hacks.
Tight feedback loops: Beta fleet drives >5K agent runs/day; your PRs hit prod within hours.
Elite peers: colleagues from Microsoft Research, Stanford, and Tesla—zero bureaucracy.
Hardware ally: Qualcomm Snapdragon X Elite partnership: 10× lower latency, on‑device privacy, and a gateway into enterprise & government.
Agent SDK (TypeScript) — a composable toolkit (task graph, tools, memory adapters) powering both server‑side and edge runtimes.
Swarm Runtime — orchestrator in Node.js/Workers that schedules, supervises, and retries agent sub‑tasks at millisecond granularity.
Graph Traversal APIs — ergonomic wrappers around our 50M‑token knowledge graph with streaming, partial updates, and vector fallbacks.
Adaptive Prompt Compiler — real‑time TS library that compiles structured inputs → model‑specific prompts and parses responses.
Evaluation Harness — Jest/Playwright‑style test runner that auto‑red‑teams agents for hallucinations, tool‑use failure, and latency regressions.
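To give a flavor of what "composable toolkit (task graph, tools, memory adapters)" can mean in practice, here is a minimal TypeScript sketch of a task graph that wires tools together by dependency. All names (`TaskNode`, `runGraph`, the mock tools) are illustrative assumptions, not the actual SDK API:

```typescript
// Hypothetical sketch of a composable task-graph SDK (illustrative only).
type Tool = (input: string) => Promise<string>;

interface TaskNode {
  id: string;
  tool: Tool;
  deps: string[]; // ids of tasks whose output this task consumes
}

// Run tasks in dependency order, feeding each task the joined
// outputs of its dependencies (or the seed input for root tasks).
async function runGraph(
  nodes: TaskNode[],
  seed: string,
): Promise<Map<string, string>> {
  const results = new Map<string, string>();
  const pending = [...nodes];
  while (pending.length > 0) {
    const ready = pending.findIndex(n => n.deps.every(d => results.has(d)));
    if (ready === -1) throw new Error("cycle or missing dependency");
    const [node] = pending.splice(ready, 1);
    const input = node.deps.length
      ? node.deps.map(d => results.get(d)!).join("\n")
      : seed;
    results.set(node.id, await node.tool(input));
  }
  return results;
}

// Usage: two mock tools wired into a tiny plan → draft graph.
const plan: Tool = async q => `plan for: ${q}`;
const draft: Tool = async p => `draft based on (${p})`;

runGraph(
  [
    { id: "plan", tool: plan, deps: [] },
    { id: "draft", tool: draft, deps: ["plan"] },
  ],
  "quarterly report",
).then(r => console.log(r.get("draft")));
// → "draft based on (plan for: quarterly report)"
```

A real runtime would add supervision, retries, and streaming, but the core design choice—tools as typed async functions composed by an explicit graph rather than hidden control flow—is what makes agent behavior testable.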
TypeScript • Node.js (18+) • Bun • Cloudflare Workers / V8 isolate
LangChainJS, LlamaIndex TS, custom lightweight frameworks
OpenAI, Anthropic, llama.cpp bindings
Rust (via NAPI) for hot paths • WebGPU/WebAssembly for client‑side reasoning
Postgres • Redis • Apache Arrow • bespoke graph store
Kubernetes • GCP • Edge functions
You've shipped production TS/Node services at scale and love type‑level safety.
You've built or contributed to agent / retrieval frameworks (LangChainJS, AutoGen, custom) and can dissect their failure modes.
You treat prompts as code: unit‑test them, version them, and lint them.
Comfortable jumping into Rust/Wasm when the profiler says so.
You think debuggers, trace viewers, and live‑reload matter as much as model choice.
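"Prompts as code" can be made concrete with a small sketch: a versioned prompt template rendered by a plain function and checked with ordinary unit-test assertions. `renderPrompt` and the `{{var}}` template format are invented for illustration, not an actual library:

```typescript
// Illustrative sketch: a versioned prompt template, unit-tested like code.
const SUMMARIZE_V2 = {
  version: "2.1.0",
  template: "Summarize the following in {{maxWords}} words:\n{{text}}",
};

// Render a template, failing loudly on any missing variable.
function renderPrompt(
  tpl: { version: string; template: string },
  vars: Record<string, string>,
): string {
  return tpl.template.replace(/\{\{(\w+)\}\}/g, (_, key) => {
    if (!(key in vars)) throw new Error(`missing variable: ${key}`);
    return vars[key];
  });
}

// "Unit test" the prompt: every placeholder filled, output within a
// rough size budget (characters as a cheap proxy for tokens).
const rendered = renderPrompt(SUMMARIZE_V2, { maxWords: "50", text: "..." });
console.log(rendered.includes("{{")); // false: no unfilled slots
console.log(rendered.length < 4000);  // true: within budget
```

Versioning the template object means a prompt change shows up in diffs and can be pinned, reverted, and regression-tested like any other dependency.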
Authored OSS in the LLM tooling ecosystem (LangChainJS plugin, OpenAI helper lib, etc.).
Experience with CRDT/OT for collaborative agent state or live‑share UIs.
Prior work on distributed schedulers or graph databases.
Top‑of‑band salary + founder‑level equity
Medical, dental, vision • 401(k) match
Custom rig & conference budget
Twice‑yearly team off‑sites (Tahoe powder, NYC AI Week, etc.)
Email team@context.inc with a short intro and links (repos, demos, papers). We reply within 48 hours.