Job Summary
Responsible for the execution and management of the Yayasan Peneraju Data Platform. This encompasses platform operations and development, including data intake, data processing, data quality assessment, and data curation and enrichment.
Job Description
- Design, build and maintain scalable and cost-efficient data pipelines to capture Yayasan Peneraju's data across various sources, formats and systems, including structured and unstructured data.
- Assemble large, complex data sets that meet functional/non‑functional business requirements.
- Monitor the performance of data platforms and data pipelines, and carry out maintenance, tuning and optimization.
- Resolve complex data pipeline failures, data quality issues and performance bottlenecks.
- Collaborate with stakeholders, business units and data analysts to support downstream data usage and translate business requirements into functioning data solutions.
- Identify, design and implement internal process improvements, e.g. automating manual processes, optimizing data delivery and redesigning infrastructure for greater scalability.
- Assist the DataOps Lead in the design and development of the data architecture, and contribute to technical discussions.
- Build processes supporting data transformation, data structures, metadata, dependency and workload management.
- Build the infrastructure required for optimal extraction, transformation and loading of data from a wide variety of data sources.
- Responsible for disaster recovery and business continuity planning, testing and activation in the event of a data service interruption or platform outage.
- Adhere to and comply with the Information Security Policy implemented by Yayasan Peneraju and/or applicable statutory or legislative regulations adopted by Yayasan Peneraju.
- Perform other related duties as assigned.
Job Requirements
- Minimum of a Bachelor's degree in data engineering, big data analytics, computer engineering or a related qualification.
- 2-4 years of working experience in a related field.
Specialized Skills
- Advanced SQL knowledge and experience working with relational databases, including query authoring, as well as working familiarity with a variety of database systems.
- Experience with data pipeline and workflow management tools: Airflow, Azkaban, Luigi, etc.
- Experience in developing ETL processes with big data technologies: Hadoop, Spark, HBase, Cassandra, MongoDB, Kafka, Redis, etc.
- Fluent in various operating systems (Windows Server, Linux-based OSes) and their subsystems (networking, firewall and storage).
- Experience with container technologies (Kubernetes, Docker).
- Familiar with data operations involving data in motion, data at rest, data sets, and the interactions between data-dependent processes and applications that fall within the scope of data architecture.
- Experience with object-oriented/functional scripting languages: Python, Java, C++, Scala, etc.
- Understanding of machine learning algorithms, data science concepts, statistical analysis and data modeling.
- Knowledge of and experience in delivering CI/CD and DevOps capabilities in a data-intensive environment.
- Familiarity with DataOps, ITSM, SRE and DevOps methodologies, and the ability to adapt them to Yayasan Peneraju's data delivery and operations.