Data Engineer - Applied ML & Distributed Compute (m/f/d)

ECDB

Hamburg

Hybrid

EUR 60,000 - 80,000

Full-time

Posted today

Summary

A leading eCommerce data platform provider in Hamburg is seeking an experienced data engineer to develop and optimize data processing pipelines. The role requires strong experience in Python and distributed computing frameworks like Spark and Dask. The successful candidate will work on training and deploying machine learning models at scale within an ambitious team. Attractive benefits include flexible working hours, a modern office, and a strong focus on personal development.

Benefits

Attractive career opportunities

Flexible working hours

Focus on continuous learning

Qualifications

4+ years of relevant professional experience in data processing.
Proven experience with Python and distributed compute frameworks.
Practical experience in training, evaluation, and deployment of ML models.

Tasks

Own large-scale data processing pipelines, including batch processing.
Design and optimize distributed compute workloads.
Train, deploy and monitor ML models at scale.

Skills

Data processing

Python

Machine Learning

Distributed computing

Tools

Spark

Dask

Ray

About ECDB

ECDB – Shaping the Future of eCommerce with Data!

At ECDB, we firmly believe that data determines success in eCommerce. That’s why we provide leading companies like Amazon, Google, and PayPal with the most precise analyses and market insights. With billions of transactions as our foundation, we are developing one of the most comprehensive eCommerce data platforms worldwide. Our team of over 50 experts combines cutting-edge technology with deep industry knowledge – and this is where you come in! If you're eager to shape the future of eCommerce through data-driven insights, ECDB is the perfect place for you.

Tasks

Own large-scale data processing pipelines, including batch processing of raw, unstructured data
Design and optimize distributed compute workloads to transform large-scale web and natural language data into structured, production-ready datasets
Train, deploy and monitor ML models at scale (e.g., NLP models, classifiers and enrichment use cases)
Productionize models through batch inference and retraining pipelines
Implement AI-assisted pipelines (e.g., LLM-based classification or extraction)

Requirements

Several years of relevant professional experience (4+ years)
Proven track record in Python-heavy data processing
Prior experience with distributed compute frameworks (Spark / Dask / Ray) on datasets held in object storage (e.g., Parquet on S3-compatible storage)
Practical ML experience (training, evaluation, deployment, retraining)
Ability to work with messy, large-scale data and turn it into reliable outputs

Benefits

Attractive career opportunities in a rapidly growing company
Short decision-making processes and plenty of room for personal responsibility
An ambitious, open-minded team with a passion for smart solutions
A strong focus on continuous learning and development
Flexible working hours, the option to work from home, and a healthy work–life balance
A modern office in Hamburg’s historic Speicherstadt, offering a unique atmosphere
