Data Engineer (Control M, Hadoop, Spark, Hive, Big Data)

NEPTUNEZ SINGAPORE PTE. LTD.

Singapore

On-site

SGD 60,000 - 90,000

Full time

8 days ago

Job summary

A leading data solutions provider is seeking a skilled data engineer to design and optimize data pipelines for efficient data migration to Hadoop-based platforms. The role involves collaborating with cross-functional teams and ensuring data quality and operational stability. Candidates should be proficient in Python, SQL, and Hadoop ecosystem tools, ideally with experience in the financial services sector.

Qualifications

  • Proficiency in Python and SQL required.
  • Strong experience with Hadoop, Hive, and Spark is essential.
  • Background in banking or financial services preferred.

Responsibilities

  • Design, implement, and maintain data pipelines for data migration.
  • Develop and optimize data processing jobs using Spark, Hive, and Python.
  • Ensure end-to-end operational stability of data pipelines.

Skills

Python
SQL
Hadoop
Spark
Informatica
Data Management
Control-M

Job description

Responsibilities:

  • Design, implement, and maintain data pipelines for the migration of data from on-premises systems to Hadoop-based platforms.
  • Develop and optimize data processing jobs using Spark, Hive, and Python (an illustrative sketch follows this list).
  • Manage job orchestration and scheduling using Control-M, ensuring timely and accurate data delivery.
  • Collaborate with cross-functional teams to understand data requirements and deliver efficient solutions.
  • Perform code quality checks and peer reviews to ensure best practices, maintainability, and adherence to coding standards.
  • Ensure end-to-end operational stability of data pipelines by proactively identifying and resolving bottlenecks, failures, and data quality issues.
  • Ensure data quality through rigorous cleaning and validation processes.
  • Document data flow processes, transformation logic, and framework usage to support onboarding and troubleshooting.
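
As a purely illustrative sketch of the Spark/Hive processing and data-quality work described above (the paths, database, table, and column names are hypothetical and not taken from this posting), a minimal PySpark job might look like this:

```python
# Illustrative only: a minimal PySpark job that loads an on-premises extract,
# applies basic cleaning/validation, and writes the result to a Hive table.
# All names (paths, database, table, columns) are hypothetical examples.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("customer_migration_example")  # hypothetical job name
    .enableHiveSupport()                    # needed to write managed Hive tables
    .getOrCreate()
)

# Read a raw extract landed in HDFS (path is an assumption for the example).
raw = spark.read.option("header", "true").csv("hdfs:///landing/customers/")

# Simple cleaning and data-quality rules: trim the business key, drop rows
# missing it, and remove duplicates before loading.
cleaned = (
    raw.withColumn("customer_id", F.trim(F.col("customer_id")))
       .filter(F.col("customer_id").isNotNull())
       .dropDuplicates(["customer_id"])
)

# Write to a Hive table, partitioned by load date (hypothetical schema).
(
    cleaned.withColumn("load_date", F.current_date())
           .write.mode("overwrite")
           .partitionBy("load_date")
           .saveAsTable("staging.customers")  # hypothetical database.table
)

spark.stop()
```

In practice a job like this would be parameterized (source path, target table, run date) and triggered from a Control-M schedule; the sketch only shows the general read, clean, and write pattern.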

Requirements:

  • Proficiency in Python and SQL.
  • Strong experience with Hadoop ecosystem tools (Hive, Spark).
  • Extensive experience with transformation components, mapping development, and workflow orchestration in Informatica/DataStage.
  • Experience with job scheduling and monitoring using Control-M.
  • Familiarity with pipeline-as-code concepts and Jenkinsfiles for automating build and deployment processes.
  • Solid understanding of database systems including Teradata, Oracle, and SQL Server.
  • Ability to analyze and troubleshoot large-scale data processing systems.
  • Experience in the banking or financial services industry.
  • Knowledge of data warehousing concepts and star/snowflake schema design (see the sketch after this list).
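
To illustrate the star-schema point above, a hedged sketch of a star-schema query issued from Spark SQL (the fact and dimension tables are invented for this example):

```python
# Illustrative star-schema query via Spark SQL; table and column names are
# hypothetical, not from this posting. A fact table (transactions) is joined
# to two dimension tables (customer, date) and aggregated.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("star_schema_example")
    .enableHiveSupport()
    .getOrCreate()
)

monthly_totals = spark.sql("""
    SELECT d.year_month,
           c.customer_segment,
           SUM(f.amount) AS total_amount
    FROM   dw.fact_transactions f            -- hypothetical fact table
    JOIN   dw.dim_customer      c ON f.customer_key = c.customer_key
    JOIN   dw.dim_date          d ON f.date_key     = d.date_key
    GROUP BY d.year_month, c.customer_segment
""")

monthly_totals.show()
```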