In this role, you will design and build data governance tooling, metadata workflows, and automation that ensure data quality, lineage, compliance, and consistency across complex data environments.
Responsibilities
- Architect, build, and maintain data governance services and workflow automation (APIs, metadata ingestion pipelines, lineage services, classification/tagging systems).
- Develop and optimize ETL/ELT data pipelines with embedded data quality and governance controls.
- Integrate and enhance data catalog/metadata tools (e.g., Collibra, Alation, DataHub, or internal equivalents).
- Support the development of data dictionaries, data definitions, lineage documentation, and business-to-technical mappings.
- Implement monitoring, alerting, and governance dashboards to track data quality KPIs and metadata coverage.
- Ensure compliance with data governance policies, privacy regulations (e.g., GDPR, CCPA), and data access controls.
- Partner with data engineering, analytics, product, and privacy teams to translate governance requirements into scalable technical solutions.
- Create and maintain documentation, system designs, and operational processes.
Required Qualifications
- Bachelor’s degree in Computer Science, Engineering, Information Systems, or equivalent work experience.
- 3+ years of experience in software development (Java, Python, Scala, or similar languages).
- Experience working with distributed data platforms (e.g., Spark, Hive, Snowflake, BigQuery, Presto, Redshift).
- Hands‑on experience with data governance, metadata management, or data lineage systems.
- Strong SQL proficiency and familiarity with data pipeline/orchestration tools.
- Experience with cloud environments (AWS, GCP, or Azure).
- Excellent communication skills and ability to collaborate across technical and non-technical teams.
- Self‑driven, organized, and comfortable operating in a fast‑paced environment.
Preferred Qualifications
- Experience working in streaming media, consumer data, or other large‑scale data environments.
- Familiarity with data catalog platforms (Collibra, Alation, DataHub).
- Experience with Airflow or other orchestration tools.