Join us at DNL and help shape the backend of a truly developer-focused platform. We’re building Germany’s leading AI-powered tools for analysing financial and non-financial reports. Using state-of-the-art machine learning, we extract key figures from annual financial statements fully automatically and transparently. We are hiring a builder who can design, ship, and run production LLM systems under real constraints.
TASKS - Short Term: Deliver Reliable LLM Systems (0–3 months)
- Stabilize and optimize our core LLM-based features for performance, reliability, and cost efficiency.
- Take ownership of RAG pipelines (chunking, retrieval, reranking, and attribution), ensuring that every answer is traceable, verifiable, and hallucination-resistant.
- Improve serving infrastructure for low-latency and scalable inference (vLLM/TGI, batching, caching, observability).
- Collaborate with Backend, Product, and QA to ship robust features into production and collect early feedback from real auditors.
TASKS - Mid-term: Elevate Quality, Evaluation & Orchestration (3–6 months)
- Build a systematic evaluation layer for offline and online quality tracking: golden sets, regression testing, human-in-the-loop red-teaming.
- Introduce clear metrics for groundedness, coverage, and faithfulness — and make them visible through dashboards and reports.
- Design context and prompt management systems with versioning, deterministic testing, and safety fallbacks.
- Collaborate with leadership to define LLMOps best practices: CI/CD for prompts and models, automated deployment pipelines, and clear SLOs on latency, accuracy, and cost.
TASKS - Long-term: Define the Future of LLM Infrastructure (6–12 months)
- Architect the next generation of retrieval and reasoning systems for complex financial and ESG documents.
- Drive the vision for LLM orchestration — structured multi-turn flows, memory, and tool use that scale across product lines.
- Mentor other engineers and data scientists in applied LLM engineering and evaluation methodology.
- Contribute to open standards and tooling that make enterprise AI explainable and auditable.
- Work closely with company leadership to align long-term AI strategy with product and market goals.
REQUIREMENTS - Must Haves
- Shipped LLM systems to production — with real users, uptime, and feedback loops.
- Deep RAG experience — vector stores, hybrid lexical + dense retrieval, reranking, and source attribution.
- LLMOps at scale — Kubernetes, GPUs, vLLM or TGI, batching & caching, CI/CD for models and prompts, with metrics and tracing you actually look at.
- Evaluation mindset — dataset design, golden queries, offline & online metrics, and human-in-the-loop QA where it truly matters.
- Orchestration mastery — multi-turn flows, memory, tool use, and the judgment to go custom when frameworks get in the way.
- Strong engineering fundamentals — Python, FastAPI, clean APIs, large text pipelines, Postgres, Redis, vector DBs.
- Clear communication in English; German is a plus.
REQUIREMENTS - Nice to Haves
- Finance / audit exposure — annual reports, notes, XBRL, ESRS.
- Retrieval depth — Vespa or Elastic kNN, ColBERT or SPLADE, BM25 + dense hybrid retrieval, reranking at scale.
- Performance optimization — quantization, tensor parallelism, Triton kernels, flash attention, Ray Serve.
- Tooling familiarity — MLflow or W&B, Kafka, pgvector, Milvus, Weaviate, Qdrant.
OUR SETUP - How We Build
- Product over paperwork — We ship fast, test in production, and learn by doing.
- Pilots, not passengers — Everyone codes, reviews, and deploys.
- Small, senior, autonomous team — You’ll have real scope, accountability, and impact.
OUR SETUP - Our Stack Today
- Infrastructure — Kubernetes, GPUs, Postgres, Redis, object storage, Grafana + Prometheus, GitHub Actions.
- Model & Serving — PyTorch, Hugging Face, vLLM / TGI, SKLearn, FastText.
- Application Layer — Python, FastAPI, vector DBs, Phoenix.
- Ops & Monitoring — MLflow / W&B, full tracing and dashboards.
- Model policy — We use open weights or APIs based on reliability, cost, and data sensitivity.
BENEFITS
- Above-Average Compensation: We offer a salary above market average, reflecting the impact and expertise we value, plus meaningful equity.
- Monthly Perks (Germany-based): If you’re employed in Germany, you’ll receive a €50 monthly voucher usable at over 50 popular stores, covering everything from groceries to lifestyle.
- Learning budget: To foster your professional development, we provide financial support for conferences and continuing education courses.
- Innovative Work Culture: A collaborative startup environment with flat hierarchies, fast decisions, and space for your ideas.
- Great People: Work alongside an international team of passionate and driven professionals.
- Time Off: 30 vacation days per year to recharge and explore.
- Remote Flexibility: Work from anywhere within Europe and participate in optional Berlin meetups.
- Flexible Hours: Adapt your schedule to your personal rhythm and lifestyle.
- Top Equipment: We’ll provide you with the latest hardware to do your best work.
We are looking forward to your application and getting to know you!