
Research Assistant (Cancer Science Institute)

National University of Singapore

Singapore

On-site

SGD 70,000 - 90,000

Full time

Yesterday

Company description

The National University of Singapore is the national research university of Singapore. Founded in 1905 as the Straits Settlements and Federated Malay States Government Medical School, NUS is the oldest higher education institution in Singapore.

Job Description

The Cancer Science Institute of Singapore, part of the National University of Singapore, is seeking a skilled Big Data Engineer to join the Genomics and Data Analytics Core (GeDaC). We operate a petabyte-scale "Data Nexus" that serves as the foundation for a production AI Factory in cancer and human disease research.

You do not need a background in biology. We are looking for a pure engineer who understands data logistics, infrastructure, and scale.

The Team & Leadership

You will join a highly specialized technical team comprising an experienced Cloud/HPC Architect, an agile Full-Stack Developer, and a senior IT Manager. Crucially, you will report to a Facility Head with deep, hands‑on expertise in petabyte-scale data‑intensive computing and DataOps. This ensures you will work in an environment where technical complexity is understood, architectural decisions are respected, and job scope is managed with engineering reality in mind.

Key Responsibilities
  • Data Ingestion & Logistics: Architect and maintain robust automation for ingesting raw data from sequencing instruments to our hybrid storage systems. You will own the "handshakes" that ensure data moves reliably from edge to cloud.
  • Infrastructure as Code (IaC): Manage and deploy AWS resources (S3, Lambda, DynamoDB, RDS) using AWS CloudFormation, ensuring our infrastructure is reproducible, version‑controlled, and follows DevSecOps best practices.
  • Technical Compliance & Provenance: Implement the technical controls for data governance. This includes designing immutable audit logs, automated access control policies, and lineage tracking systems to satisfy regulatory requirements (no manual report writing required).
  • Hybrid Cloud Synchronization: Manage the lifecycle of data moving between on‑premise HPC and AWS S3 Intelligent‑Tiering/Glacier to balance high‑performance availability with long‑term cost optimization.
  • Pipeline Integration: Work closely with the Senior HPC Engineer to ensure data is correctly staged for Nextflow/Kubernetes processing pipelines, and capture the outputs back into the data lake/warehouse.
  • Database Management: Maintain the SQL and NoSQL databases that serve as the "source of truth" for file metadata, ensuring the Full‑Stack team has low‑latency API access to query file status.
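The "handshake" idea in the ingestion bullet above can be sketched in Python. This is a minimal, illustrative example, not GeDaC's actual tooling: a transfer from instrument to staging only counts as complete when source and destination checksums agree. Function names and the use of SHA-256 are assumptions for illustration.

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 in 1 MiB chunks, so large
    sequencing outputs are never loaded fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_transfer(source: Path, staged: Path) -> bool:
    """The 'handshake': the staged copy is accepted only if its
    checksum matches the source exactly."""
    return sha256_of(source) == sha256_of(staged)
```

In a real ingestion pipeline the source checksum would typically be computed once at the edge and recorded alongside the file, rather than re-read on every verification.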
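The tiering described in the hybrid-cloud bullet is usually expressed as an S3 lifecycle configuration. The sketch below builds the rule document in the shape boto3's `put_bucket_lifecycle_configuration` expects; the prefix, day thresholds, and bucket name are illustrative assumptions, not values from this role.

```python
def lifecycle_rules(prefix: str, days_to_it: int = 30,
                    days_to_glacier: int = 180) -> dict:
    """Build an S3 lifecycle configuration: objects under `prefix`
    transition to Intelligent-Tiering, then to Glacier Deep Archive."""
    return {
        "Rules": [
            {
                "ID": f"tier-{prefix.strip('/')}",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": days_to_it, "StorageClass": "INTELLIGENT_TIERING"},
                    {"Days": days_to_glacier, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }

# Applied with boto3 (bucket name hypothetical):
# s3 = boto3.client("s3")
# s3.put_bucket_lifecycle_configuration(
#     Bucket="example-data-nexus",
#     LifecycleConfiguration=lifecycle_rules("raw/"),
# )
```

In practice a policy like this is what balances high-performance availability against long-term cost, and it would live in version-controlled IaC rather than be applied ad hoc.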
Requirements

Education: Bachelor's degree in Computer Science, Information Systems, Engineering, or a related field.

Experience:

  • 2+ years of experience in Data Engineering, Backend Development, or DevOps.
  • Demonstrable experience working with commercial cloud infrastructure (AWS preferred).

Technical Skills:

  • Core Logic: Strong proficiency in Python (data tooling, automation scripts).
  • Infrastructure: Experience with Infrastructure as Code (IaC) tools such as AWS CloudFormation and/or Terraform is essential.
  • Data Management: Proficiency with SQL (PostgreSQL/Aurora) and object storage (S3).
  • Environment: Thoroughly comfortable working in Linux/Unix environments.
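The "source of truth" metadata database mentioned under Database Management can be sketched as a small schema plus a couple of queries. This example uses Python's built-in sqlite3 purely so it is self-contained; the production system described above would use PostgreSQL/Aurora, and the table and column names here are hypothetical.

```python
import sqlite3

def init_metadata_db(conn: sqlite3.Connection) -> None:
    """Minimal file-metadata schema: one row per ingested file,
    tracking its checksum, size, and pipeline status."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS files (
               path   TEXT PRIMARY KEY,
               sha256 TEXT NOT NULL,
               size   INTEGER NOT NULL,
               status TEXT NOT NULL DEFAULT 'staged'
           )"""
    )

def record_file(conn: sqlite3.Connection, path: str,
                sha256: str, size: int) -> None:
    """Register a newly staged file."""
    conn.execute(
        "INSERT INTO files (path, sha256, size) VALUES (?, ?, ?)",
        (path, sha256, size),
    )

def file_status(conn: sqlite3.Connection, path: str):
    """Low-latency status lookup of the kind a front-end API would issue."""
    row = conn.execute(
        "SELECT status FROM files WHERE path = ?", (path,)
    ).fetchone()
    return row[0] if row else None
```

Keyed lookups like `file_status` are what let the Full-Stack team query file state without scanning object storage directly.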

Attributes:

  • Meticulous: You care deeply about data integrity. A missing file or a broken checksum bothers you.
  • Ownership‑driven: You take responsibility for systems you build and operate.
  • Collaborative: You can work effectively within an established technical team, integrating your work with existing APIs and processing pipelines.
Preferred Experience
  • Experience with workflow managers like Nextflow or container orchestration via Kubernetes.
  • Experience with hybrid‑cloud data transfer tools (e.g., AWS DataSync, Storage Gateway).
  • Knowledge of searching/indexing tools like Elasticsearch or OpenSearch.