Data Engineer
Location: Singapore
Team: Data & Analytics
Reports to: Head of Data / Data Engineering Lead
Role Overview
We are looking for a Data Engineer with 2+ years of relevant experience to design, build, and operate scalable, reliable, production-grade data platforms that support analytics, machine learning, and business decision-making.
This role covers both real-time data streaming and batch processing, with a strong focus on engineering quality, system reliability, and data freshness, primarily on Google Cloud Platform (GCP).
Key Responsibilities
1. Data Pipelines & Platform Engineering (Batch & Streaming)
- Design, build, and maintain batch and real-time data pipelines
- Work with Pub/Sub, Cloud Run, and BigQuery
- Develop data processing logic using Python (pandas, PySpark) and SQL
- Build real-time ingestion services supporting:
  - Low-latency ingestion
  - Idempotency and de-duplication
  - Data validation and schema evolution
- Implement layered data architectures: Raw → Curated → Analytics-ready datasets
- Handle late-arriving data, replays, and historical backfills
2. Real-Time Data Streaming & Processing
- Participate in designing event-driven architectures
- Implement streaming logic for:
  - Real-time / near-real-time aggregations
  - Operational and monitoring datasets
- Understand and apply exactly-once or effectively-once processing semantics
- Monitor streaming pipelines for latency, throughput, and failures
3. Data Modeling & Data Warehousing
- Design and maintain analytics-optimized BigQuery data models
- Apply appropriate partitioning and clustering
- Support high-ingestion-rate tables and high-performance analytical queries
- Ensure schema consistency across development and production environments
4. Analytics & Machine Learning Enablement
- Build high-quality datasets for:
  - Reporting and dashboards
  - Time-series analysis
  - Machine learning feature generation
- Collaborate with analysts and data scientists to:
  - Understand data requirements
  - Validate data accuracy and freshness
5. Cloud Infrastructure & Engineering Practices
- Containerize data services using Docker
- Build and deploy via Cloud Build and Artifact Registry
- Operate Cloud Run services and scheduled jobs
- Assist with:
  - Service accounts and IAM roles
  - Secrets and environment configuration
- Contribute to CI/CD automation and deployment workflows
6. Data Quality, Governance & Reliability
- Implement data quality checks for both streaming and batch pipelines
- Help identify and resolve:
  - Data delays
  - Missing or duplicate data
  - Breaking schema changes
- Maintain documentation, including:
  - Data dictionaries
  - Streaming architecture diagrams
  - Operational runbooks
- Ensure pipelines are auditable, reproducible, and reliable
Required Qualifications
Minimum Requirements
- Bachelor’s degree in Computer Science, Data Engineering, Information Systems, or a related technical field
- 2+ years of experience in data engineering, backend engineering, or data platform roles
- Strong Python skills (pandas and/or PySpark)
- Solid SQL skills (BigQuery experience preferred)
- Hands-on experience building or maintaining production data pipelines
- Understanding of batch vs real-time streaming data processing concepts
Technical Competencies
- Familiarity with event-driven architectures
- Understanding of data modeling and data warehouse design
- Experience handling schema evolution and historical backfills
- Basic performance, scalability, and cost-optimization awareness
Engineering & DevOps Skills
- Experience with Docker and containerized applications
- Familiarity with Git-based development workflows
- Exposure to CI/CD pipelines
- Ability to troubleshoot and debug production issues
Nice to Have
- Experience with real-time streaming systems (Pub/Sub, Kafka, Dataflow)
- Exposure to time-series or near-real-time analytics
- Familiarity with:
  - Dataflow / Apache Beam
  - Vertex AI
  - BI tools such as Tableau or Looker
- Experience working with multi-region or multi-currency datasets
What Success Looks Like
- Data pipelines run reliably and with low latency
- Streaming and batch datasets are consistent and trustworthy
- Data freshness SLAs are met
- Downstream analytics and ML teams confidently rely on the data platform
Why Join Us
- Work on modern real-time data platforms
- Clear growth path toward Senior Data Engineer
- Strong engineering ownership and technical depth
- Cloud-native environment focused on long-term maintainability
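Data Engineer
Location: Singapore
Team: Data & Analytics
Reports to: Head of Data / Data
Role Overview
We are hiring a Data Engineer with 2+ years of relevant experience to design, build, and maintain scalable, stable, production-grade data platforms that provide reliable data for analytics, machine learning, and business decision-making. The role covers both real-time streaming and offline batch processing, is centered on Google Cloud Platform (GCP), and emphasizes engineering quality, system stability, and data timeliness.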
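Key Responsibilities
1. Data Pipelines & Platform Engineering (Batch + Real-Time Streaming)
- Design and maintain real-time and offline data pipelines
- Work with Pub/Sub, Cloud Run, and BigQuery
- Process data using Python (pandas, PySpark) and SQL
- Build real-time ingestion services supporting:
  - Low-latency writes
  - Idempotent processing and de-duplication
  - Data validation and schema evolution
- Implement a layered data architecture: Raw → Curated → Analytics-ready
- Handle late-arriving data, out-of-order data, replays, and historical backfills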
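2. Real-Time Data Streaming & Stream Processing
- Participate in designing and implementing event-driven architectures
- Implement stream processing logic, including:
  - Real-time / near-real-time metric computation
  - Real-time monitoring and operational datasets
- Understand and apply exactly-once or effectively-once processing semantics
- Monitor real-time data pipelines for latency, throughput, and anomalies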
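3. Data Modeling & Data Warehousing
- Design and maintain analytical BigQuery data models
- Apply partitioning and clustering appropriately
- Support high-frequency writes and high-performance analytical queries
- Ensure schema consistency between development and production environments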
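4. Analytics & Machine Learning Support
- Build reusable datasets for:
  - Reporting and analysis
  - Time-series analysis
  - Machine learning feature engineering
- Collaborate with data analysts and data scientists to:
  - Understand data requirements
  - Validate data accuracy and timeliness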
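5. Cloud Infrastructure & Engineering Practices
- Build and deploy data services using Docker
- Manage versions through Cloud Build / Artifact Registry
- Run Cloud Run services and scheduled jobs
- Assist with:
  - Service account and IAM permission configuration
  - Environment variables and secrets management
- Contribute to building CI/CD automation workflows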
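6. Data Quality, Governance & Reliability
- Implement data quality checks and monitoring
- Help identify and resolve:
  - Data delays
  - Missing data
  - Schema change risks
- Maintain data documentation, data flow diagrams, and operational runbooks
- Ensure data is traceable and reproducible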
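Required Qualifications
Minimum Requirements (Required)
- Bachelor's degree or above in Computer Science, Data Engineering, Information Systems, or a related technical field
- 2+ years of experience in data engineering, backend, or data platform roles
- Proficiency in Python (at least one of pandas or PySpark)
- Proficiency in writing SQL (BigQuery experience preferred)
- Experience building or maintaining data pipelines in a real production environment
- Understanding of the fundamentals of batch and real-time stream processing
- Familiarity with event-driven architectures and stream processing approaches
- Understanding of data modeling and data warehouse design
- Ability to handle schema evolution and historical backfills
- Basic awareness of performance and cost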
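Engineering Skills
- Experience using Docker
- Familiarity with basic Git workflows
- Ability to complete deployments within CI/CD workflows
- Basic ability to locate and troubleshoot problems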
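Nice to Have
- Experience with real-time streaming systems (Pub/Sub / Kafka / Dataflow)
- Experience with time-series or near-real-time analytics
- Familiarity with:
  - Dataflow / Apache Beam
  - Vertex AI
  - BI tools (Tableau / Looker)
- Experience processing multi-region or multi-currency data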
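What Success Looks Like
- Reliably deliver runnable, maintainable data pipelines
- Real-time and offline data pipelines are stable and can be monitored
- Data quality meets the requirements of analytics and downstream consumers
- Independently own data engineering tasks of medium complexity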
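Why Join Us
- Take a hands-on role in building real-time data platforms and core data systems
- Clear technical growth path and solid engineering practices
- Opportunity to grow into a Senior Data Engineer or technical expert role
- Cloud-native tech stack with an emphasis on engineering quality and long-term maintainability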