Lead Data Scientist – Synthetic Systems

Populix

Jakarta Utara

On-site

IDR 200.000.000 - 300.000.000

Full time

Today

Job summary

A consumer insights platform is seeking a Lead Data Scientist to spearhead the development of simulation systems and automation pipelines. The role involves designing behavioral simulation responses and collaborating with research and marketing teams on simulation-driven research outputs. The ideal candidate has more than 5 years of experience in data science, strong skills in generative modeling, and advanced knowledge of Python. Join us in building the future of AI-powered market research within a dynamic team in Jakarta Utara, Indonesia.

Qualifications

  • Master’s degree required, preferably in a quantitative field; PhD is a strong plus.
  • 5+ years of experience in data science or applied machine learning.
  • Deep experience in generative modeling with strong grounding in statistics.

Responsibilities

  • Lead the design of behavioral simulation responses using generative models.
  • Collaborate with research and marketing teams to produce simulation-driven whitepapers.
  • Drive automation of research workflows for open-ended responses and audio data.

Skills

Generative modeling
Statistical modeling
Python programming
Machine learning
Behavioral data modeling

Education

Master’s degree in Computer Science, Statistics, or Data Science

Tools

Python
PyTorch
scikit-learn

Job description

Populix is a consumer insights platform that helps businesses connect with its database of respondents and provides insights into Indonesian consumer preferences. Populix has a pool of over 1,000,000 respondents across Indonesia. Its products range from intensive research studies to simple surveys and can be arranged on a project or subscription basis. Because Indonesian consumers are highly engaged with their mobile devices, Populix supports a diverse range of data collection methods through its mobile app.

About the Role

Populix is building the future of AI-powered market research, combining structured data, unstructured insights, and generative AI into a seamless research intelligence platform. We're looking for a Lead Data Scientist to help drive that vision forward, spearheading the development of simulation systems and automation pipelines, while supporting the Head of Data Science in shaping our AI research strategy.

This role will be at the forefront of building simulation models, scaling automation for text and audio-based survey data, and translating research into whitepapers that position Populix as a thought leader in the region. You'll also advance our use of retrieval-augmented generation (RAG) and modular AI architectures to deliver fast, accurate, contextualized insights.

Key Responsibilities

  • Lead the design and implementation of behavioral simulation responses and demographic patterns using generative models, statistical modeling, and controlled simulations.
  • Collaborate with the research and marketing teams to create simulation-driven whitepapers and internal studies, communicating the value of synthetic insight across use cases like campaign testing, segmentation, and hypothetical trends.
  • Drive automation of research workflows involving open-ended responses and audio data, including pipelines for transcription, classification, summarization, and sentiment analysis.
  • Work with the Head of Data Science to translate high-level product and research strategy into technical roadmaps, experiment plans, and model architecture decisions.
  • Help scale our AI insight engine by contributing to Retrieval-Augmented Generation (RAG) workflows and collaborating with LLM engineers on modular pipelines for context-rich output generation.
  • Collaborate with engineers, designers, and product teams to ship robust ML-powered tools into production across the Populix platform.
  • Provide mentorship to other data scientists, sharing knowledge, reviewing modeling work, and promoting a culture of experimentation, reproducibility, and ethical AI.

Required Qualifications

  • Master’s degree required, preferably in Computer Science, Statistics, Data Science, or related quantitative field; PhD is a strong plus.
  • 5+ years of experience in data science or applied machine learning, including at least 1 year in a technical leadership role.
  • Deep experience in generative modeling (e.g., GANs, VAEs), simulation, or behavioral data modeling, with strong grounding in statistics and hypothesis testing.
  • Hands-on experience with Retrieval-Augmented Generation (RAG) architectures and knowledge integration with LLMs.
  • Strong programming skills in Python and experience with tools like LangGraph, LangSmith, scikit-learn, PyTorch, Hugging Face, or equivalent frameworks.
  • Familiarity with both structured (e.g., survey data) and unstructured (e.g., audio, text) data workflows, including preprocessing, feature extraction, and integration into insight pipelines.
  • Experience in translating ideas into effective AI-driven solutions to real-world problems, with strong communication skills to convey modeling approaches and value.

Preferred Qualifications

  • Prior experience in market research, behavioral analytics, or social data modeling.
  • Exposure to speech processing, voice-to-text systems, and sentiment detection from audio or conversational data.
  • Knowledge of synthetic data generation ethics, validation strategies, and mixed-method evaluation.
  • Experience with cloud-based analytics environments and orchestration tools (e.g., BigQuery, Airflow, Kubeflow, MLflow).
  • Experience as an individual contributor.