Data Catalog Implementation Specialist (Pilot Lead) (m/w/d)

Michael Page

Remote

EUR 60,000 - 80,000

Full-time

Posted today

Summary

A leading recruitment agency is seeking a Data Catalog Implementation Specialist (Pilot Lead) to lead a pilot project focused on democratizing data access. The role involves deploying a data catalog, integrating it with Databricks, and ensuring secure connectivity. Ideal candidates have strong proficiency with Databricks, experience configuring data governance platforms such as Atlan or DataHub, and an understanding of metadata management. This full-time position is mostly remote, and its responsibilities focus on improving data accessibility for the Data Science team.

Qualifications

  • Proven experience configuring modern data governance platforms, specifically Atlan and/or DataHub.
  • Hands-on knowledge of the DataHub open-source catalog framework.
  • Strong proficiency with Databricks (Lakehouse architecture, Delta Lake, Unity Catalog) and Spark SQL.
  • Expertise in metadata ingestion frameworks and API-based integration (REST/GraphQL).
  • Proficiency in Python or SQL for custom connector configuration and metadata manipulation.
  • Familiarity with data stewardship principles.

Responsibilities

  • Deploy and configure the initial instance of the data catalog (e.g., Atlan or DataHub) for a Proof of Concept.
  • Consult on how effectively the data catalog meets organizational requirements.
  • Establish secure connectivity between the data catalog and Databricks Unity Catalog/Delta Lake.
  • Execute automated metadata extraction and implement tagging strategies.
  • Design a streamlined workflow enabling Data Scientists to search, query, and request access to datasets.

Skills

Data Cataloging
DataHub Open Source Framework
Data Platforms
Metadata Management
Scripting
Governance

Job description
Data Catalog Implementation Specialist (Pilot Lead) (m/w/d)

We are seeking a hands‑on Data Catalog Expert to lead a critical pilot project focused on democratizing data access for our Data Science team. The engagement involves setting up an initial data catalog instance and integrating it with our Databricks environment to register key source datasets. The objective is to demonstrate the value of a centralized metadata repository by transforming raw table lists into a searchable, context‑rich asset library that accelerates model development and analytics.

Start: 5 January
Project Duration: 3+ months
Workload: 5 days per week
Location: Remote (95%), Düsseldorf
Industry: Sales
Project language: English/German (nice to have)

Key Responsibilities
  • Deploy and configure the initial instance of the data catalog (e.g., Atlan or DataHub) to support a Proof of Concept (PoC) scope, working with the DataHub open‑source framework within our ecosystem.
  • Consult on how effectively the data catalog meets organizational requirements and identify gaps.
  • Establish secure connectivity between the data catalog and Databricks Unity Catalog/Delta Lake to automate ingestion of schemas, tables, and views.
  • Execute automated metadata extraction and implement tagging strategies to classify sensitive data (PII) and add business context such as descriptions and owners.
  • Design a streamlined workflow enabling Data Scientists to search, query, and request access to datasets directly through the catalog interface.

Required Technical Skills
  • Data Cataloging: Proven experience configuring modern data governance platforms, specifically Atlan and/or DataHub.
  • DataHub Open Source Framework: Hands‑on knowledge of the DataHub open‑source catalog framework.
  • Data Platforms: Strong proficiency with Databricks (Lakehouse architecture, Delta Lake, Unity Catalog) and Spark SQL.
  • Metadata Management: Expertise in metadata ingestion frameworks, API‑based integration (REST/GraphQL), and automated classification policies.
  • Scripting: Proficiency in Python or SQL for custom connector configuration and metadata manipulation.
  • Governance: Familiarity with data stewardship principles, including ownership assignments, glossary creation, and certification workflows.

Pilot Deliverables
  • Functional connection between Databricks and the data catalog (Atlan/DataHub).
  • Registration of high‑priority source datasets with complete metadata (descriptions, tags).
  • Demonstrable “Search‑to‑Query” workflow for the Data Science team.
  • Final recommendation report on catalog scalability and long‑term architecture.

Contact: Duygu Sahin

Quote job ref: JN-122025-6901838

Seniority level: Associate
Employment type: Full‑time
Job function: Information Technology
Industries: Retail
