Additional Information
- Must be willing to work in a hybrid setup, reporting onsite to UP Ayala Technohub, Quezon City.
- Engagement is project-based for 6 months, with a possibility of extension.
- Work schedule rotates between mid-shift and graveyard shifts.
- The role observes U.S. holidays instead of Philippine holidays.
Responsibilities
- Design, build, and maintain cloud-based data pipelines and workflows that support analytics and operational systems.
- Integrate data from various sources using APIs and cloud services.
- Develop clean, efficient, and test-driven code in Python for data ingestion and processing.
- Optimize data storage and retrieval using big data formats like Apache Parquet and ORC.
- Implement robust data models, including relational, dimensional, and NoSQL models.
- Collaborate with cross-functional teams to gather and refine requirements and deliver high-quality solutions.
- Deploy infrastructure using Infrastructure as Code (IaC) tools such as AWS CloudFormation or CDK.
- Monitor and orchestrate workflows using Apache Airflow or Dagster.
- Follow best practices in data governance, quality, and security.
Core Expertise
- Experience: At least 3 years in a data engineering role working on data integration, processing, and transformation use cases with open-source languages (e.g., Python) and cloud technologies.
- Strong Python programming skills, particularly for API integration and data libraries, with an emphasis on quality and test-driven development.
- Demonstrated proficiency with big data storage formats (Apache Parquet, ORC) and practical knowledge of pitfalls and optimization strategies.
- Demonstrated proficiency with SQL.
Data Modeling & AWS Knowledge
- Experience with data modeling: relational, dimensional, and NoSQL.
- Working knowledge of IaC on AWS (CloudFormation or CDK)
- Working knowledge of AWS Services:
  - Required: Glue, IAM, Lambda, DynamoDB, Step Functions, S3, CloudFormation or CDK
  - Nice-to-have: Athena, Kinesis, MSK, MWAA, SQS
Orchestration & Data Streaming
- Experience with orchestration of data flows/pipelines: Apache Airflow or Dagster
- Nice-to-have: Data streaming (Kinesis, Kafka)
- Experience with Apache Spark
- Experience in client-facing roles and multicultural teams, with a track record of technical and team leadership.
Key Competencies & Abilities
- Ability to work independently and collaboratively in a team environment.
- High level of attention to detail and commitment to delivering quality work.
- Strong analytical and critical thinking skills.
- Effective time management and organizational skills.
- Strong customer focus and the ability to communicate technical concepts clearly to stakeholders.
- Excellent written and verbal communication skills.