Job Scope:
This role will be at the intersection of data science, applied machine learning, and software engineering.
You will be involved in:
1. Model Development
- Design and conduct experiments to evaluate emerging synthetic data generation (SDG) models (e.g., DDPM, ARF, Gaussian Copula).
- Investigate failure cases (e.g., when models fail on certain data types, sizes, or cardinalities).
- Tune hyperparameters, refine architectures, and propose new modeling strategies.
2. Feature & Product Development
- Collaborate with software engineers to build product features that require ML/DS input (e.g., imputation methods, handling of constraints, preprocessing pipelines).
- Recommend and develop suitable approaches for features like
single-/multi-column constraints, imputation strategies, and privacy metrics.
3. Diagnostics & Debugging
- Work directly with users and the engineering team to diagnose user issues such as training failures, poor outputs, or integration challenges.
- Provide actionable fixes and communicate technical insights in a user-friendly way.
4. Documentation & Knowledge Sharing
- Write user-facing documentation pages. This could include explaining model choice, hyperparameters, and utility/privacy metrics in a user-friendly manner.
- Translate complex technical Data Science concepts into clear, approachable explanations.
5. Collaboration
- Work closely with the SWE team (Next.js, FastAPI, AWS) to integrate the generation engine into production-ready systems.
- Participate in Agile rituals, code reviews, and design discussions.
Requirements:
1. Bachelor's degree or higher in Computer Science, Data Science, Business Analytics, or a related field, with at least 2-3 years of relevant professional experience.
2. Core Data Science & ML skillset
- Strong foundation in machine learning, with hands-on experience in model development and experimentation.
- Strong programming proficiency in Python and experience with ML frameworks (e.g., PyTorch, TensorFlow, scikit-learn).
- Ability to analyze model behavior, diagnose training issues, and design experiments to improve performance.
3. Applied Research & Experimentation
- Ability to read, synthesize, and translate emerging research into practical prototypes.
4. Software Engineering
- Working knowledge of backend development (REST APIs, FastAPI, Flask, or similar).
- Comfortable working with cloud environments (AWS preferred).
- Ability to debug and fix software-level issues when they affect ML workflows.
- Familiarity with Git, CI/CD, and collaborative coding best practices.
5. Nice-to-Haves
- Experience with privacy-enhancing technologies, anonymisation, synthetic data generation, or differential privacy.
- Familiarity with frontend integration workflows (Next.js/React).
- Prior experience working in multi-disciplinary product teams.
6. Mindset & Collaboration
- Curiosity and willingness to learn new domains (esp. data privacy).
- Strong communication skills to explain technical concepts to both engineers and non-technical stakeholders.
- Willingness to work in a collaborative, fast-moving Agile environment.