Software Engineer, Memory & Observability (Mid-Level), London
Client:
Open Code Mission
Location:
London, United Kingdom
Job Category:
Other
EU work permit required:
Yes
Posted:
31.05.2025
Expiry Date:
15.07.2025
Job Description:
Why Open Code Mission?
Open Code Mission builds ETERNALLY, a learning-augmented memory architecture that couples a durable JSON + FAISS Memory Core with surprise-aware Neural Memory and a Context Cascade Engine to let agents learn at test time. Our B2B dashboard exposes explainable diagnostics so security and product teams can trust what their AI is doing.
We’re a small, execution-driven team; you’ll ship code that lands in production within days, not quarters.
The Impact You’ll Have
In your first 6–12 months you will:
- Harden concurrency paths inside the Memory Core (e.g., finishing assembly_transaction locking and vector-index repair loops) so we can scale from single-tenant pilots to multi-capsule production clusters; see the lock-and-trace sketch after this list.
- Instrument end-to-end metrics (Prometheus + custom JSONL traces) across MC → NM → CCE so variant decisions and QuickRecal boosts surface in the dashboard with < 2 s latency.
- Extend our React/Express dashboard with new health, explainability, and live-log views, wiring them to the triple-nested API contract.
- Add test-time-learning features (e.g., MAG gate experiments) behind feature flags and run A/B evaluations with the research team.
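The concurrency and tracing work above is easiest to picture in code. The snippet below is a minimal, hypothetical sketch: assembly_transaction is named in this posting, but the wrapper, its arguments, and the trace fields are invented here purely to illustrate guarding a write path with an asyncio lock and emitting a JSONL trace line.

    import asyncio
    import json
    import time

    _assembly_lock = asyncio.Lock()  # hypothetical: serialises writes to one capsule

    async def assembly_transaction_safe(capsule_id: str, payload: dict) -> None:
        """Illustrative only: hold the lock for the write, then emit a JSONL trace."""
        start = time.perf_counter()
        async with _assembly_lock:
            await asyncio.sleep(0)  # placeholder for the real JSON + FAISS write
        trace = {
            "event": "assembly_transaction",
            "capsule_id": capsule_id,
            "lock_held_ms": round((time.perf_counter() - start) * 1000, 2),
        }
        print(json.dumps(trace))  # stand-in for a structured JSONL trace writer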
What You’ll Do Day-to-Day
- Design and implement Python micro-services (FastAPI / asyncio) that talk to FAISS, Redis, and TensorFlow (see the sketch after this list).
- Write clear, observable code—structured logging, Prometheus counters, Grafana alerts.
- Optimize async pipelines, back-pressure, and retry queues; profile and fix race conditions.
- Ship TypeScript/React features (tables, charts, WebSocket log streams) that consume our selectData() hooks.
- Review PRs with empathy; propose small RFCs for larger refactors.
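To give a feel for what an observable micro-service means here, the sketch below is a minimal FastAPI app with a Prometheus counter and a /metrics route. It is an assumption about the shape of the work (the route names, counter, and handler are invented), not Open Code Mission’s actual code.

    from fastapi import FastAPI, Response
    from prometheus_client import CONTENT_TYPE_LATEST, Counter, generate_latest

    app = FastAPI()
    RECALL_REQUESTS = Counter("recall_requests_total", "Memory recall requests served")

    @app.get("/recall/{capsule_id}")
    async def recall(capsule_id: str) -> dict:
        # A real handler would query FAISS/Redis; here we only bump the counter.
        RECALL_REQUESTS.inc()
        return {"capsule_id": capsule_id, "results": []}

    @app.get("/metrics")
    async def metrics() -> Response:
        # Expose counters in the text format Prometheus scrapes.
        return Response(content=generate_latest(), media_type=CONTENT_TYPE_LATEST)

Running it under uvicorn and pointing a Prometheus scrape job at /metrics would surface the counter in Grafana.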
Must-Haves
- 3–6 years professional software experience; comfortable owning production services.
- Solid Python 3.10+: asyncio, typing, FastAPI (or Flask/Fastify-equivalent for JS).
- Working knowledge of machine-learning inference flows: embeddings, vector search, or LLM APIs.
- Concurrency literacy: async/await, task pools, locks; can explain when to pick threads vs processes vs async.
- Observability & scale: you’ve plumbed Prometheus/Grafana (or OpenTelemetry) into high-QPS APIs and know what RED/USE means.
- API routing & gateway patterns (reverse proxies, rate limiting, shrink-wrap error envelopes).
- Comfortable in *nix and Docker (and Compose); can add a health check and iterate locally.
Nice-to-Haves
- TensorFlow 2.x or PyTorch; have traced a gradient or two.
- FAISS, Milvus or other ANN libraries.
- Experience with React + TanStack Query + Zustand or similar state stacks.
- Basic familiarity with Kubernetes and GitHub Actions CI.
- Interest in Explainable AI, the overlap of AI with traditional cyber security, and LLM governance.
Working Style
- Remote-first (core hours 10:00–17:00 UTC).
- Weekly engineering demo; lightweight RFC process; “you build it, you own it” on-call rota (one week every ~6).
- Small, friendly code reviews focused on clarity and test coverage, not nit-picking variable names.
Compensation & Growth
- Salary band £70,000–£95,000 + meaningful equity (DOE & location).
- Annual learning budget (£1,000).
- GPU credits for side experiments.
- Clear growth track to Senior Engineer: own a capsule-scale roll-out, mentor junior devs, and architect a new service.
Hiring Process (≈ 4 weeks)
- For pre-qualified and vetted applicants, a 90-minute informal chat to assess culture & role fit.
- Technical discussion: walk through an async/metrics design you’re proud of (no LeetCode).
- Take-home task (build or instrument a tiny async API; ~3 hours, paid).
- Offer & reference call.
Ready to build memory systems that can actually learn in production? Then apply and include your GitHub or a project you’re proud of.