About the role
We’re looking for a data scientist who’s equally passionate about machine learning research and software engineering to join our client’s privacy and synthetic data team.
Their mission is to build privacy-preserving data solutions that empower organisations to safely use and share data. One of their flagship products enables synthetic data generation (SDG) — a cutting‑edge privacy technology that allows users to generate realistic, privacy‑safe datasets for analytics and AI applications.
The product is in a rapid growth phase, with exciting new features in the pipeline, including API integrations and scalable model enhancements. You’ll play a key role in advancing its machine learning engine and integrating it into production‑ready systems used nationwide.
What You’ll Do
1. Model Development
- Design and run experiments to evaluate emerging SDG models (e.g. DDPM, ARF, Gaussian Copula).
- Investigate model performance across data types, sizes, and distributions.
- Tune hyperparameters, refine architectures, and propose new modelling approaches.
2. Feature & Product Development
- Collaborate with engineers to develop product features that rely on ML/DS capabilities (e.g. imputation, data constraints, preprocessing).
- Prototype and implement strategies for constraint handling, imputation, and privacy metrics.
3. Diagnostics & Debugging
- Work with users and engineers to diagnose and resolve model training issues, data integration challenges, and performance bottlenecks.
- Translate technical insights into clear, actionable solutions.
4. Documentation & Knowledge Sharing
- Create clear, user‑friendly documentation explaining model choices, metrics, and tuning parameters.
- Communicate complex ML and privacy concepts to non‑technical stakeholders.
5. Collaboration
- Partner closely with backend engineers (FastAPI, AWS) and frontend teams (Next.js/React) to integrate ML components into scalable systems.
- Participate in Agile ceremonies, code reviews, and technical design discussions.
Qualifications
- A Bachelor’s degree or higher in Computer Science, Data Science, Business Analytics, or a related field.
- A minimum of 3 years of hands‑on experience in data science, machine learning, or applied research.
- A strong foundation in ML concepts and model experimentation, along with a high level of proficiency in Python and common ML frameworks (PyTorch, TensorFlow, scikit‑learn).
- Experience diagnosing training issues, improving model performance, and building and deploying ML systems via REST APIs (FastAPI, Flask, etc.) will be highly regarded.
- The ability to read, synthesise, and apply emerging research to real‑world prototypes.
- An understanding of Git, CI/CD, and collaborative software practices, as well as AWS or other cloud environments.
- Strong communication skills that bridge technical and non‑technical worlds, plus a collaborative approach and comfort in a fast‑paced Agile environment.
What’s in it for you?
- Work on cutting‑edge privacy and data technologies with real‑world impact.
- Collaborate with a team of engineers and scientists building innovative, large‑scale data systems.
- Be part of a mission‑driven organisation committed to ethical AI and responsible data use.
Interested?
Apply now and help shape the future of privacy‑preserving data science.