Singapore
On-site
SGD 70,000 - 100,000
Full time
15 days ago
Job summary
A technology company in Singapore is seeking a Data Engineer to design and maintain ETL pipelines using Spark and Hadoop, enabling business insights from large-scale data. The ideal candidate has 3-6 years of experience and strong skills in data engineering tools and languages. The role offers a collaborative team environment with a focus on data availability, performance optimization, and security.
Qualifications
- 3-6 years of experience as a Data Engineer or in a similar role.
- Strong hands-on knowledge of Apache Spark, Hadoop, Hive, HDFS, and SQL.
- Proficiency in Python / Scala / Java for data engineering tasks.
- Solid understanding of data warehousing concepts, ETL design patterns, and data modeling.
- Experience with workflow orchestration tools (Airflow, Oozie, etc.) is a plus (illustrated in the sketch after this list).
- Familiarity with cloud platforms (Azure, AWS, or GCP) is advantageous.
- Strong problem-solving skills and ability to work in a collaborative team environment.
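To give a flavor of the orchestration experience mentioned above, here is a minimal Apache Airflow DAG sketch, assuming Airflow 2.4+. The DAG id, task names, and schedule are hypothetical placeholders, not details taken from this posting.

```python
# Hypothetical daily pipeline: ingest raw data, transform with Spark,
# then load into the warehouse. Task bodies are stubbed for brevity.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest():
    """Pull the day's raw files into HDFS (stub)."""
    ...


def transform():
    """Run the Spark transformation job (stub)."""
    ...


def load():
    """Load curated output into the warehouse (stub)."""
    ...


with DAG(
    dag_id="daily_sales_etl",      # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",             # requires Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    t_ingest = PythonOperator(task_id="ingest", python_callable=ingest)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Declare ordering: ingest -> transform -> load
    t_ingest >> t_transform >> t_load
```

In practice each stub would call out to spark-submit or a provider operator; the linear ingest >> transform >> load ordering mirrors the ETL flow described in the summary.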
Responsibilities
- Design, develop, and maintain ETL pipelines and data workflows using Spark, Hadoop, Hive, and related frameworks (a sketch of such a pipeline follows this list).
- Work with large-scale structured and unstructured data to enable business insights.
- Collaborate with Data Scientists, Analysts, and other stakeholders to ensure data availability and quality.
- Implement data ingestion and transformation processes using distributed computing frameworks.
- Optimize performance of big data applications and troubleshoot issues in the data pipelines.
- Ensure data security, compliance, and governance across all platforms.
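As a rough illustration of the kind of pipeline described above, here is a minimal PySpark batch ETL sketch. The input/output paths, column names, and aggregation logic are illustrative assumptions, not details from this posting.

```python
# Hypothetical batch job: read raw JSON events, clean and aggregate them,
# and write partitioned Parquet for downstream Hive/analyst queries.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("daily_sales_etl")   # hypothetical job name
    .enableHiveSupport()          # so results can be queried from Hive
    .getOrCreate()
)

# Extract: raw, semi-structured events landed in HDFS (hypothetical path)
raw = spark.read.json("hdfs:///data/raw/events/")

# Transform: drop malformed rows, derive a date column, aggregate per day/product
daily = (
    raw.filter(F.col("amount").isNotNull())
       .withColumn("event_date", F.to_date("event_ts"))
       .groupBy("event_date", "product_id")
       .agg(
           F.sum("amount").alias("revenue"),
           F.countDistinct("user_id").alias("buyers"),
       )
)

# Load: date-partitioned Parquet for efficient downstream queries (hypothetical path)
(
    daily.write
         .mode("overwrite")
         .partitionBy("event_date")
         .parquet("hdfs:///data/curated/daily_sales/")
)

spark.stop()
```

Writing the curated output as date-partitioned Parquet keeps downstream Hive scans cheap, which speaks directly to the performance-optimization responsibility above.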
Skills
Apache Spark
Hadoop
Hive
SQL
Python
Scala
Java
Airflow
Cloud platforms (Azure, AWS, GCP)