Applied LLM / Data Scientist

Synagen GmbH

Berlin

On-site

Confidential

Full-time

Posted 2 days ago

Summary

A leading healthcare company in Berlin is seeking an Applied LLM / Data Scientist to turn high-volume patient data into scientific insights. You will be responsible for building and operating scalable data pipelines, working with pharmaceutical partners. Strong skills in data science and ML engineering, particularly Python and SQL, are required. The role offers the chance to have a real impact on clinical workflows and to develop innovative solutions.

Benefits

Flexible working hours
Opportunities for professional development
Innovative working environment

Qualifications

  • Strong experience in applied data science and ML engineering, preferably in pharma or healthcare.
  • Proven ability to build production-grade pipelines for real-world data.
  • Experience working with LLM/agent systems in production.

Responsibilities

  • Lead applied research and analytics projects.
  • Build and operate scalable pipelines.
  • Develop and maintain ontologies and consistent internal data models.

Skills

Data science
ML engineering
MLOps
Python
SQL
AI-assisted programming
Fluent English

Tools

Azure
Claude Code
Codex

Job description

Synagen builds specialized AI agents for healthcare and oncology, designed to support complex clinical decisions and biomedical workflows with actionable, high-precision outputs. We combine modern AI with clinical expertise to create software that integrates into real provider environments and delivers value in practice.

Responsibilities

Synagen builds AI agents for healthcare and oncology that integrate into real clinical workflows. As our Applied LLM / Data Scientist, you will help turn high-volume patient and clinical data into scientific, research, and clinical insights by building the data and model operations layer that makes this reliable, scalable, and compliant. A core part of this role is applied research and analytics delivery with pharma partners: you will work closely with customers on real projects, translate scientific questions into data products and agent workflows, and ship outcomes that can be used in practice.

This role bridges two modes (the split may vary over time):
  • Customer project work: deliver concrete analyses, data products, and insight pipelines for partner hospitals and projects.
  • Internal platform work: build the reusable foundations (datalake/lakehouse, ontology/terminology layer, evaluation/monitoring) that make those projects fast, reproducible, and production-grade.

What you will do

  • Lead applied research / analytics projects with pharma and clinical partners: independently scope questions, define datasets and success criteria, and deliver end-to-end outputs with medical stakeholders.
  • Build and operate scalable pipelines that transform raw clinical/patient data into structured, queryable, analysis-ready datasets.
  • Design and evolve a datalake / lakehouse approach on Azure (storage, compute patterns, governance, access controls).
  • Develop and maintain ontologies / terminology mappings and a consistent internal data model to enable reliable downstream analytics and agent reasoning.
  • Build “SynInsight”-style data products for partners (e.g., cohorts, endpoints, phenotypes, evidence-ready exports and reports) that are robust, reproducible, and measurable.
  • Implement LLM/agent operations: prompt/workflow versioning, evaluation harnesses, monitoring, regression testing, and cost/performance controls—using AI-assisted development tools (e.g., Claude Code, Codex) where helpful.
  • Build agents that automate R&D workflows (e.g., data-to-cohort pipelines, evidence synthesis, structured insight generation), and operationalize them with proper evaluation and monitoring.
  • Drive privacy-preserving data capabilities, including synthetic data generation for development, evaluation, and safer sharing/testing in projects (including Azure-based implementations).
  • Ensure security, privacy, and compliance expectations are met when processing sensitive healthcare data in Germany/EU and the US (e.g., GDPR, ISO 27001, SOC 2, BSI C5; US healthcare compliance alignment).

Qualifications

  • Strong experience in applied data science / ML engineering / MLOps, ideally in pharma, R&D, or healthcare-adjacent environments.
  • Proven ability to build production-grade pipelines for messy real-world data (ETL/ELT, data quality, lineage, reproducibility).
  • Experience building and operating LLM/agent systems in production (workflows, evaluation, monitoring, reliability).
  • Strong coding skills (Python + SQL) and comfort with engineering best practices (tests, CI/CD, documentation).
  • Practical experience structuring data with ontologies/terminologies and making it usable for analytics and downstream systems.
  • Experience with AI-assisted programming (e.g., Claude Code, Codex).
  • Fluent in English (written and spoken).

Good to have

  • Experience with clinical terminologies and standards (e.g., ICD-10, SNOMED CT, LOINC, RxNorm/ATC).
  • Experience with modern data stack components (lakehouse patterns, columnar formats, distributed compute) on Azure.
  • Familiarity with privacy-preserving data processing (pseudonymization/de-identification, access partitioning, audit trails).
  • Experience delivering customer-facing data/ML projects end-to-end.
  • Experience with modern DataOps tooling for reproducible data and agent workflows (e.g., dbt, Dagster, or similar asset-based orchestration and transformation frameworks).

Real-world impact in oncology: build integrations that bring AI into clinical workflows where accuracy and trust matter.
High ownership: you will shape our interoperability layer end-to-end and define how we integrate at scale.