Software Engineer, Memory & Observability (Mid-Level)

JR United Kingdom

Slough

Remote

GBP 70,000 - 95,000

Full time

4 days ago

Job summary

A leading company in Slough is looking for a Mid-Level Software Engineer specializing in memory architecture and observability. You will be involved in enhancing concurrency paths and implementing Python micro-services. The role offers a competitive salary, a structured growth path, and the opportunity to work in a remote-first environment focused on innovative AI solutions.

Benefits

Annual learning budget (£1,000)
GPU credits for side experiments
Meaningful equity

Qualifications

  • 3-6 years professional software experience; comfortable owning production services.
  • Solid Python skills including asyncio and FastAPI.
  • Experience with machine-learning inference flows and observability tools like Prometheus.

Responsibilities

  • Design and implement Python micro-services using FastAPI.
  • Optimize async pipelines; profile and fix race conditions.
  • Extend React/Express dashboard features.

Skills

Python 3.10+
Concurrency literacy
Observability & scale
API routing & gateway patterns

Job description

Software Engineer, Memory & Observability (Mid-Level), Slough

Client: Open Code Mission

Location: Slough, United Kingdom

Job Category: Other

EU work permit required: Yes

Posted:

31.05.2025

Expiry Date:

15.07.2025

Why Open Code Mission?

Open Code Mission builds ETERNALLY, a learning-augmented memory architecture that couples a durable JSON + FAISS Memory Core with surprise-aware Neural Memory and a Context Cascade Engine to let agents learn at test time. Our B2B dashboard exposes explainable diagnostics so security and product teams can trust what their AI is doing.

We’re a small, execution-driven team; you’ll ship code that lands in production within days, not quarters.

The Impact You’ll Have

In your first 6–12 months you will:

  • Harden concurrency paths inside the Memory Core—e.g., finishing assembly_transaction locking and vector-index repair loops—so we can scale from single-tenant pilots to multi-capsule production clusters.
  • Instrument end-to-end metrics (Prometheus + custom JSONL traces) across MC → NM → CCE so variant decisions and QuickRecal boosts surface in the dashboard with < 2 s latency.
  • Extend our React/Express dashboard with new health, explainability, and live-log views, wiring them to the triple-nested API contract.
  • Add test-time-learning features (e.g., MAG gate experiments) behind feature flags and run A/B evaluations with the research team.

What You’ll Do Day-to-Day

  • Design and implement Python micro-services (FastAPI / asyncio) that talk to FAISS, Redis, and TensorFlow.
  • Write clear, observable code—structured logging, Prometheus counters, Grafana alerts.
  • Optimize async pipelines, back-pressure, and retry queues; profile and fix race conditions.
  • Ship TypeScript/React features (tables, charts, WebSocket log streams) that consume our selectData() hooks.
  • Review PRs with empathy; propose small RFCs for larger refactors.

Must-Haves

  • 3-6 years professional software experience; comfortable owning production services.
  • Solid Python 3.10+: asyncio, typing, FastAPI (or Flask, or a Fastify-style equivalent in JS).
  • Working knowledge of machine-learning inference flows: embeddings, vector search, or LLM APIs.
  • Concurrency literacy: async/await, task pools, locks; can explain when to pick threads vs processes vs async.
  • Observability & scale: you’ve plumbed Prometheus/Grafana (or OpenTelemetry) into high-QPS APIs and know what RED and USE mean.
  • API routing & gateway patterns (reverse proxies, rate limiting, error handling).
  • Comfortable in *nix, Docker(-Compose); can add a health check and iterate locally.

Nice-to-Haves

  • TensorFlow 2.x or PyTorch; have traced a gradient or two.
  • FAISS, Milvus or other ANN libraries.
  • Experience with React + TanStack Query + Zustand or similar state stacks.
  • Basic familiarity with Kubernetes and GitHub Actions CI.
  • Interest in explainable AI, both AI-focused and traditional cyber security, and LLM governance.

Working Style

  • Remote-first (core hours 10:00-17:00 UTC).
  • Weekly engineering demo; lightweight RFC process; “you build it, you own it” on-call rota (one week every ~6).
  • Small, friendly code reviews focused on clarity and test coverage, not nit-picking variable names.

Compensation & Growth

  • Salary band £70,000 – £95,000 + meaningful equity (DOE & location).
  • Annual learning budget (£1,000).
  • GPU credits for side experiments.
  • Clear growth track to Senior Engineer: own a capsule-scale roll-out, mentor junior devs, and architect a new service.

Hiring Process (≈ 4 weeks)

  • 90-minute informal chat to assess culture & role fit (for pre-qualified and vetted applicants).
  • Technical discussion with a walk-through of an async/metrics design you’re proud of (no LeetCode).
  • Take-home task (build or instrument a tiny async API; ~3 hours, paid).
  • Offer & reference call.

Ready to build memory systems that can actually learn in production? Then apply and include your GitHub or a project you’re proud of.