
Senior / Lead Data Engineer (Python, Kafka, Iceberg, ClickHouse)

VANGUARD SOFTWARE PTE. LTD.

Singapore

On-site

SGD 80,000 - 120,000

Full time

13 days ago


Job summary

A leading company in Singapore is seeking a Senior/Lead Data Engineer to enhance their data infrastructure and analytical capabilities. This role involves designing scalable data pipelines, leading a team of engineers, and ensuring robust compliance and performance tuning of complex data workflows. Candidates should possess a strong technical foundation in data engineering with extensive experience in leading teams and optimizing data systems.

Qualifications

  • 6+ years of experience in data engineering, including 2+ years in leadership.
  • Experience with lakehouse architectures (e.g., Iceberg, Delta Lake) and data warehouses (e.g., Hive, ClickHouse).
  • Self-motivated, adaptable, with strong ownership mindset.

Responsibilities

  • Design and implement scalable data pipelines using various tools.
  • Mentor junior engineers and promote best practices.
  • Work cross-functionally to translate data needs into solutions.

Skills

Python
Java
SQL
Performance tuning
Data architecture
Data engineering
Mentoring

Education

Bachelor’s degree in Computer Science or Information Systems

Tools

Flink
Spark
Debezium
SeaTunnel
Iceberg
ClickHouse
Kafka
Airflow
DolphinScheduler

Job description

JOB SUMMARY

We are looking for a talented and experienced Senior/Lead Data Engineer to join our innovative team. The Senior/Lead Data Engineer will play a critical role in leading the design, development, and maintenance of our data infrastructure, pipelines, and analytical tools. The ideal candidate will have a strong technical background in data engineering, expertise in analytical tools, and proven leadership skills to mentor and guide a team of data engineers.

JOB DUTIES

  • Design and implement scalable batch and streaming data pipelines using tools such as Flink, Spark, Debezium, and SeaTunnel (an open-source data integration tool).
  • Ingest and process high-volume data from APIs, operational databases, and semi-structured formats (e.g., JSON, CSV, XML, logs) to support diverse analytical use cases.
  • Build reusable transformation pipelines to consolidate cross-domain data (e.g., user behavior, transactions) into analytics-ready marts.
  • Architect and optimize data storage and modeling layers using MinIO/S3, Iceberg, ClickHouse, and other OLAP or object storage platforms to improve query performance and data reliability.
  • Maintain multi-layered data warehouse architecture (staging, core, mart) aligned with business needs.
  • Ensure robust CI/CD, lineage, observability, and compliance through tools like OpenMetadata and DataHub.
  • Mentor junior engineers, conduct rigorous code reviews, and promote engineering best practices across the data team.
  • Work cross-functionally with product managers, analysts, and business stakeholders to translate data needs into scalable pipelines and business insights.
  • Stay current with data engineering trends and technologies, and continuously drive platform improvements.

JOB REQUIREMENTS

  • Bachelor’s degree in Computer Science, Information Systems, or equivalent qualification.
  • 6+ years of experience in data engineering, including 2+ years in a technical lead or senior IC capacity.
  • Proficient in Python, Java, and SQL, with strong expertise in schema design, performance tuning, and warehouse modeling.
  • Hands-on experience with lakehouse architectures (e.g., Iceberg, Delta Lake), data warehouses (e.g., Hive, ClickHouse), and object storage (e.g., AWS S3, MinIO).
  • Strong knowledge of orchestration tools (Airflow, DolphinScheduler), ETL/ELT design, and streaming frameworks (Kafka, Flink, Spark).
  • Proven experience independently setting up and managing end-to-end data architecture in on-premise environments.
  • Demonstrated success mentoring engineers in high-performance, cross-functional teams.
  • Familiarity with Git-based workflows, CI/CD pipelines, and observability tools for production-grade data systems.
  • Self-motivated with a strong ownership mindset, adaptability, and willingness to travel when needed.