Applied LLM / Data Scientist

Synagen AI

Berlin

On-site

EUR 70,000 - 95,000

Full-time

Posted 2 days ago

Summary

A leading AI healthcare solutions provider is seeking a talented professional to develop specialized AI agents that support clinical decisions in oncology. In this role, you will analyze data, build scalable pipelines, and ensure compliance in handling sensitive healthcare data. The ideal candidate has extensive experience in data science, ML engineering, and coding with Python and SQL. Strong English communication skills are essential for collaboration with clinical partners and stakeholders.

Qualifications

  • Strong experience in applied data science or ML engineering.
  • Proven ability to build production-grade pipelines.
  • Experience with LLM/agent systems in production.
  • Strong coding skills in Python and SQL.
  • Fluent in English, both written and spoken.

Responsibilities

  • Deliver analyses and data products for partner hospitals.
  • Build scalable pipelines to transform raw clinical data.
  • Lead research projects with pharma partners.
  • Design a datalake approach on Azure.
  • Ensure security and compliance with healthcare data.

Skills

Data science
ML engineering
MLOps
Python
SQL
ETL/ELT
Clinical terminologies
AI-assisted programming
Data visualization

Tools

Azure
dbt
Dagster

Job description
Overview

Synagen builds specialized AI agents for healthcare and oncology, designed to support complex clinical decisions and biomedical workflows with actionable, high-precision outputs. We combine modern AI with clinical expertise to create software that integrates into real provider environments and delivers value in practice.

Responsibilities
  • This role bridges two modes (split may vary over time):
      • Customer project work: deliver concrete analyses, data products, and insight pipelines for partner hospitals and projects.
      • Internal platform work: build the reusable foundations (datalake/lakehouse, ontology/terminology layer, evaluation/monitoring) that make those projects fast, reproducible, and production-grade.
  • Lead applied research / analytics projects with pharma and clinical partners: independently scope questions, define datasets and success criteria, and deliver end-to-end outputs with medical stakeholders.
  • Build and operate scalable pipelines that transform raw clinical/patient data into structured, queryable, analysis-ready datasets.
  • Design and evolve a datalake / lakehouse approach on Azure (storage, compute patterns, governance, access controls).
  • Develop and maintain ontologies / terminology mappings and a consistent internal data model to enable reliable downstream analytics and agent reasoning.
  • Build “SynInsight”-style data products for partners (e.g., cohorts, endpoints, phenotypes, evidence-ready exports and reports) that are robust, reproducible, and measurable.
  • Implement LLM/agent operations: prompt/workflow versioning, evaluation harnesses, monitoring, regression testing, and cost/performance controls, using AI-assisted development tools where helpful.
  • Build agents that automate R&D workflows (e.g., data-to-cohort pipelines, evidence synthesis, structured insight generation), and operationalize them with proper evaluation and monitoring.
  • Drive privacy-preserving data capabilities, including synthetic data generation for development, evaluation, and safer sharing/testing in projects (including Azure-based implementations).
  • Ensure security, privacy, and compliance expectations are met when processing sensitive healthcare data in Germany/EU and the US (e.g., GDPR, ISO 27001, SOC 2, BSI C5; US healthcare compliance alignment).
Qualifications
  • Strong experience in applied data science / ML engineering / MLOps, ideally in pharma, R&D, or healthcare-adjacent environments.
  • Proven ability to build production-grade pipelines for messy real-world data (ETL/ELT, data quality, lineage, reproducibility).
  • Experience building and operating LLM/agent systems in production (workflows, evaluation, monitoring, reliability).
  • Strong coding skills (Python + SQL) and comfort with engineering best practices (tests, CI/CD, documentation).
  • Practical experience structuring data with ontologies/terminologies and making it usable for analytics and downstream systems.
  • Experience with AI-assisted programming (e.g., Claude Code, Codex).
  • Fluent in English (written and spoken).
Good to have
  • Experience with clinical terminologies and standards (e.g., ICD-10, SNOMED CT, LOINC, RxNorm/ATC).
  • Experience with modern data stack components (lakehouse patterns, columnar formats, distributed compute) on Azure.
  • Familiarity with privacy-preserving data processing (pseudonymization/de-identification, access partitioning, audit trails).
  • Experience delivering customer-facing data/ML projects end-to-end.
  • Experience with modern DataOps tooling for reproducible data and agent workflows (e.g. dbt, Dagster, or similar asset-based orchestration and transformation frameworks).

  • Real-world impact in oncology: build integrations that bring AI into clinical workflows where accuracy and trust matter.
  • High ownership: you will shape our interoperability layer end-to-end and define how we integrate at scale.
