Software Engineer, Memory & Observability (Mid-Level)

JR United Kingdom

Slough

Remote

GBP 70,000 - 95,000

Full time

4 days ago

Job summary

A leading company in Slough is looking for a Mid-Level Software Engineer specializing in memory architecture and observability. You will be involved in enhancing concurrency paths and implementing Python micro-services. The role offers a competitive salary, a structured growth path, and the opportunity to work in a remote-first environment focused on innovative AI solutions.

Benefits

Annual learning budget (£1,000)

GPU credits for side experiments

Meaningful equity

Qualifications

3-6 years professional software experience; comfortable owning production services.
Solid Python skills including asyncio and FastAPI.
Experience with machine-learning inference flows and observability tools like Prometheus.

Responsibilities

Design and implement Python micro-services using FastAPI.
Optimize async pipelines; profile and fix race conditions.
Extend React/Express dashboard features.

Skills

Python 3.10+

Concurrency literacy

Observability & scale

API routing & gateway patterns

Client: Open Code Mission

Location: Slough, United Kingdom

Job Category: Other

EU work permit required: Yes

Posted: 31.05.2025

Expiry Date: 15.07.2025

Job Description:

Why Open Code Mission?

Open Code Mission builds ETERNALLY, a learning-augmented memory architecture that couples a durable JSON + FAISS Memory Core with surprise-aware Neural Memory and a Context Cascade Engine to let agents learn at test time. Our B2B dashboard exposes explainable diagnostics so security and product teams can trust what their AI is doing.

We’re a small, execution-driven team; you’ll ship code that lands in production within days, not quarters.

The Impact You’ll Have

In your first 6–12 months you will:

Harden concurrency paths inside the Memory Core—e.g., finishing assembly_transaction locking and vector-index repair loops—so we can scale from single-tenant pilots to multi-capsule production clusters.
Instrument end-to-end metrics (Prometheus + custom JSONL traces) across MC → NM → CCE so variant decisions and QuickRecal boosts surface in the dashboard with < 2 s latency.
Extend our React/Express dashboard with new health, explainability, and live-log views, wiring them to the triple-nested API contract.
Add test-time-learning features (e.g., MAG gate experiments) behind feature flags and run A/B evaluations with the research team.

What You’ll Do Day-to-Day

Design and implement Python micro-services (FastAPI / asyncio) that talk to FAISS, Redis, and TensorFlow.
Write clear, observable code—structured logging, Prometheus counters, Grafana alerts.
Optimize async pipelines, back-pressure, and retry queues; profile and fix race conditions.
Ship TypeScript/React features (tables, charts, WebSocket log streams) that consume our selectData() hooks.
Review PRs with empathy; propose small RFCs for larger refactors.
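
For candidates gauging what the async pipeline work looks like in practice, here is a minimal sketch using only the Python standard library. It is not Open Code Mission's code: the names flaky_store, run_pipeline, MAX_ATTEMPTS, and BASE_DELAY are hypothetical, and in-worker retries with exponential backoff stand in for a dedicated retry queue. A bounded asyncio.Queue provides the back-pressure, since the producer has to wait whenever the workers fall behind.

```python
import asyncio

MAX_ATTEMPTS = 3    # hypothetical retry budget per item
BASE_DELAY = 0.01   # hypothetical backoff base, in seconds


async def flaky_store(item: int, attempt: int) -> int:
    """Stand-in for a downstream call; odd items fail on their first attempt."""
    await asyncio.sleep(0)
    if item % 2 == 1 and attempt == 1:
        raise ConnectionError(f"transient failure for item {item}")
    return item * 10


async def worker(queue: asyncio.Queue, results: list, failed: list) -> None:
    """Pull items from the queue, retrying transient failures with backoff."""
    while True:
        item = await queue.get()
        try:
            for attempt in range(1, MAX_ATTEMPTS + 1):
                try:
                    results.append(await flaky_store(item, attempt))
                    break
                except ConnectionError:
                    if attempt == MAX_ATTEMPTS:
                        failed.append(item)  # retry budget exhausted
                    else:
                        await asyncio.sleep(BASE_DELAY * 2 ** (attempt - 1))
        finally:
            queue.task_done()


async def run_pipeline(items: list, workers: int = 2, max_queue: int = 4):
    """Feed items through a bounded queue; return (sorted results, failures)."""
    queue: asyncio.Queue = asyncio.Queue(maxsize=max_queue)
    results: list = []
    failed: list = []
    tasks = [
        asyncio.create_task(worker(queue, results, failed))
        for _ in range(workers)
    ]
    for item in items:
        # put() suspends while the queue is full: this is the back-pressure.
        await queue.put(item)
    await queue.join()  # wait until every queued item has been processed
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return sorted(results), failed


if __name__ == "__main__":
    print(asyncio.run(run_pipeline(list(range(6)))))
```

Bounding the queue keeps memory flat under load, and retrying inside the worker avoids requeueing into a queue that may already be full.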

Must-Haves

3-6 years professional software experience; comfortable owning production services.
Solid Python 3.10+: asyncio, typing, FastAPI (or an equivalent framework such as Flask, or Fastify on the JS side).
Working knowledge of machine-learning inference flows: embeddings, vector search, or LLM APIs.
Concurrency literacy: async/await, task pools, locks; can explain when to pick threads vs processes vs async.
Observability & scale: you’ve plumbed Prometheus/Grafana (or OpenTelemetry) into high-QPS APIs and know what RED/USE means.
API routing & gateway patterns (reverse proxies, rate limiting, error handling).
Comfortable in *nix, Docker(-Compose); can add a health check and iterate locally.

Nice-to-Haves

TensorFlow 2.x or PyTorch; have traced a gradient or two.
FAISS, Milvus or other ANN libraries.
Experience with React + TanStack Query + Zustand or similar state stacks.
Basic familiarity with Kubernetes and GitHub Actions CI.
Interest in Explainable-AI, AI and traditional cyber security, and LLM governance.

Working Style

Remote-first (core hours 10:00-17:00 UTC).
Weekly engineering demo; lightweight RFC process; “you build it, you own it” on-call rota (one week every ~6).
Small, friendly code reviews focused on clarity and test coverage, not nit-picking variable names.

Compensation & Growth

Salary band £70,000 – £95,000 + meaningful equity (depending on experience and location).
Annual learning budget (£1,000).
GPU credits for side experiments.
Clear growth track to Senior Engineer: own a capsule-scale roll-out, mentor junior devs, and architect a new service.

Hiring Process (≈ 4 weeks)

For pre-qualified and vetted applicants there will be a 90-minute informal chat to assess culture & role fit.
Technical discussion with a walk-through of an async/metrics design you’re proud of (no LeetCode).
Take-home task (build or instrument a tiny async API; ~3 hours, paid).
Offer & reference call.

Ready to build memory systems that can actually learn in production? Then apply and include your GitHub or a project you’re proud of.
