Data Engineer (Control-M, Hadoop, Spark, Hive, Big Data)
NEPTUNEZ SINGAPORE PTE. LTD.
Singapore
On-site
SGD 60,000 - 90,000
Full time
8 days ago
Job summary
A leading data solutions provider is seeking a skilled data engineer to design and optimize data pipelines for efficient data migration to Hadoop-based platforms. The role involves collaborating with cross-functional teams and ensuring data quality and operational stability. Candidates should be proficient in Python, SQL, and Hadoop ecosystem tools, ideally with experience in the financial services sector.
Qualifications
- Proficiency in Python and SQL required.
- Strong experience with Hadoop, Hive, and Spark is essential.
- Background in banking or financial services preferred.
Skills
Python
SQL
Hadoop
Spark
Informatica
Data Management
Control-M
Responsibilities:
- Design, implement, and maintain data pipelines for the migration of data from on-premises systems to Hadoop-based platforms.
- Develop and optimize data processing jobs using Spark, Hive, and Python (a minimal example is sketched after this list).
- Manage job orchestration and scheduling using Control-M, ensuring timely and accurate data delivery.
- Collaborate with cross-functional teams to understand data requirements and deliver efficient solutions.
- Perform code quality checks and peer reviews to ensure best practices, maintainability, and adherence to coding standards.
- Ensure end-to-end operational stability of data pipelines by proactively identifying and resolving bottlenecks, failures, and data quality issues.
- Ensure data quality through rigorous cleaning and validation processes.
- Document data flow processes, transformation logic, and framework usage to support onboarding and troubleshooting.
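To give candidates a concrete sense of the Spark/Hive work described above, here is a minimal sketch of a migration-style job: extract from a relational source over JDBC, apply basic cleaning and validation, and load into a Hive table. All connection details, table names, columns, and thresholds are illustrative placeholders, not specifics of this role.

```python
# Minimal PySpark sketch of a migration-style job: read from an on-premises
# relational source over JDBC, clean and validate, then persist to Hive.
# All connection details, names, and thresholds below are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("customer_migration")   # hypothetical job name
    .enableHiveSupport()             # needed to write managed Hive tables
    .getOrCreate()
)

# Extract: pull the source table from an on-premises database (placeholder URL).
src = (
    spark.read.format("jdbc")
    .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL")  # placeholder
    .option("dbtable", "SRC.CUSTOMERS")                     # placeholder
    .option("user", "etl_user")                             # placeholder
    .option("password", "***")  # inject via a secrets mechanism in practice
    .load()
)

# Transform: trim and normalise strings, drop rows failing basic checks.
clean = (
    src.withColumn("email", F.lower(F.trim(F.col("email"))))
       .filter(F.col("customer_id").isNotNull())
       .dropDuplicates(["customer_id"])
)

# Validate: fail fast if cleaning lost an unexpected share of rows.
in_count, out_count = src.count(), clean.count()
if in_count and out_count / in_count < 0.99:  # threshold is an assumption
    raise ValueError(
        f"Row-count check failed: {out_count}/{in_count} rows survived cleaning"
    )

# Load: write to the target Hive table, partitioned for downstream queries.
(clean.write.mode("overwrite")
      .format("hive")
      .partitionBy("country")  # placeholder partition column
      .saveAsTable("dw.customers_clean"))

spark.stop()
```

In a real pipeline, a scheduler such as Control-M would invoke this job and alert on the row-count failure above.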
Requirements:
- Proficiency in Python and SQL.
- Strong experience with Hadoop ecosystem tools (Hive, Spark).
- Extensive experience with transformation components, mapping development, and workflow orchestration in Informatica/DataStage.
- Experience with job scheduling and monitoring using Control-M.
- Familiarity with pipeline-as-code concepts and Jenkinsfiles for automating build and deployment processes.
- Solid understanding of database systems including Teradata, Oracle, and SQL Server.
- Ability to analyze and troubleshoot large-scale data processing systems.
- Experience in the banking or financial services industry.
- Knowledge of data warehousing concepts and star/snowflake schema design (a star-schema query is sketched below).
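To illustrate the last requirement, a minimal Spark SQL sketch of a star-schema query: a hypothetical fact table joined to two dimension tables and aggregated. All table and column names are assumptions for illustration only.

```python
# Minimal Spark SQL sketch of a star-schema query: one fact table joined
# to two dimension tables, then aggregated. All names are hypothetical.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("star_schema_demo")
    .enableHiveSupport()
    .getOrCreate()
)

monthly_sales = spark.sql("""
    SELECT d.calendar_month,
           p.product_category,
           SUM(f.sales_amount) AS total_sales
    FROM   dw.fact_sales  f                                      -- fact: one row per transaction
    JOIN   dw.dim_date    d ON f.date_key    = d.date_key        -- date dimension
    JOIN   dw.dim_product p ON f.product_key = p.product_key     -- product dimension
    GROUP  BY d.calendar_month, p.product_category
""")

monthly_sales.show()
```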