
Open Data & Search Solutions Engineer

Blackfluo.ai

Paris

On-site

EUR 50 000 - 70 000

Full-time

Posted 30+ days ago


Job summary

An innovative tech company in France seeks a Data Discovery Engineer to enhance search and metadata capabilities across its platforms. The role involves deploying open-source search engines and managing metadata catalogs. Ideal candidates will have 6+ years of experience with relevant technologies and a strong understanding of data standards and interoperability. Join us to improve data discoverability and support data stewardship efforts.

Qualifications

  • 6+ years of experience deploying and operating open-source search engines.
  • Experience with metadata catalog platforms.
  • Strong understanding of metadata standards and vocabularies.

Responsibilities

  • Design and deploy open-source search engine solutions.
  • Deploy and maintain metadata catalog systems.
  • Implement Linked Open Data techniques.

Skills

Open-source search engines experience
Metadata catalog platforms experience
Understanding of metadata standards
Proficiency in SPARQL and JSON-LD
Knowledge of containerization tools
Scripting in Python, Java, or Scala

Education

Bachelor's or Master's in Computer Science

Tools

Apache Solr
Elasticsearch
Apache Atlas

Job description

About the job: Open Data & Search Solutions Engineer

We are seeking a Data Discovery Engineer to lead the development and integration of search, metadata cataloguing, and Linked Open Data (LOD) capabilities across our data platform ecosystem. The role focuses on implementing and managing open-source search engines (Solr, OpenSearch, Elasticsearch), data cataloging tools (Apache Atlas, NADA), and semantic web technologies (DCAT, RDF, schema.org, Croissant) to improve data discoverability, interoperability, and reuse.

Key Responsibilities
Search Platform Integration
  • Design and deploy open-source search engine solutions such as Apache Solr, OpenSearch, or Elasticsearch
  • Optimize indexing strategies for structured and unstructured data from diverse data sources
  • Develop custom search features (facets, filters, synonyms, auto-suggestions) tailored to metadata and dataset discovery
  • Implement scalable search pipelines with support for multilingual and full-text search capabilities
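As a rough illustration of the facet and filter features above, here is a minimal sketch of a Solr select query assembled with Python's standard library. The host, core name (`datasets`), and field names (`title`, `language`, `publisher`) are invented for illustration, not taken from the posting:

```python
from urllib.parse import urlencode

# Hypothetical Solr core endpoint; host and core name are illustrative.
SOLR_BASE = "http://localhost:8983/solr/datasets/select"

params = {
    "q": "title:climate",        # full-text match on a title field
    "fq": "language:en",         # filter query (a "filter" facet)
    "facet": "true",             # enable faceting
    "facet.field": "publisher",  # return counts per publisher
    "rows": 10,                  # page size
}

query_url = SOLR_BASE + "?" + urlencode(params)
print(query_url)
```

The same `q`/`fq`/`facet.field` parameter set translates almost directly to OpenSearch and Elasticsearch query DSL equivalents (`query`, `filter`, and `aggs`).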
Metadata Cataloging & Data Discovery
  • Deploy and maintain metadata catalog systems such as Apache Atlas, NADA, or CKAN
  • Ensure metadata harvesting and harmonization across multiple sources using catalog APIs and connectors
  • Integrate catalog systems with enterprise data lakes, APIs, and external repositories
  • Establish metadata governance policies and data stewardship workflows in collaboration with data owners
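The harvesting bullet above can be sketched as a small paging loop against a CKAN-style catalog API. The catalog URL is hypothetical (CKAN's real search action is `package_search`, paged with `rows`/`start`); the `fetch` callable is injected so the paging logic works without a live catalog:

```python
from urllib.parse import urlencode

# Hypothetical catalog host; the /api/3/action/package_search path mirrors CKAN's API.
CKAN_SEARCH = "https://catalog.example.org/api/3/action/package_search"

def harvest(fetch, page_size=100):
    """Collect all dataset records, paging with rows/start parameters.

    `fetch` is any callable taking a URL and returning the decoded
    JSON response, so this sketch is testable without network access.
    """
    records, start = [], 0
    while True:
        url = CKAN_SEARCH + "?" + urlencode({"rows": page_size, "start": start})
        result = fetch(url)["result"]
        records.extend(result["results"])
        start += page_size
        if start >= result["count"]:
            break
    return records
```

Harmonization across sources would then normalize each harvested record to a common schema before indexing.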
Linked Open Data (LOD) Enablement
  • Implement Linked Open Data techniques using DCAT, RDF, schema.org, and W3C standards
  • Publish datasets and metadata as linked data endpoints for reuse and interoperability
  • Map internal metadata schemas to external ontologies (e.g., DCAT-AP, schema.org, Croissant)
  • Build SPARQL endpoints or graph-based access for semantic querying of data assets
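To make the publishing bullets concrete, here is a minimal sketch of a DCAT dataset description serialized as JSON-LD using only the standard library. The dataset URI, title, and publisher are invented for illustration; the `dcat:` and `dct:` namespace URIs are the standard W3C/Dublin Core ones:

```python
import json

# Minimal DCAT dataset record as JSON-LD; identifiers below are illustrative.
dataset = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
    },
    "@id": "https://example.org/dataset/population-2024",
    "@type": "dcat:Dataset",
    "dct:title": "Population estimates 2024",
    "dct:publisher": {"@id": "https://example.org/org/statistics-office"},
    "dcat:keyword": ["population", "demographics"],
}

doc = json.dumps(dataset, indent=2)
print(doc)
```

A record like this, loaded into a triple store, becomes queryable over a SPARQL endpoint (e.g. selecting all `dcat:Dataset` resources and their `dct:title` values).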
Platform Automation & Integration
  • Develop and maintain pipelines for metadata ingestion, enrichment, validation, and publication
  • Integrate catalog and search platforms with analytics tools, APIs, and user-facing portals
  • Support deployment of semantic technologies using containers, CI/CD pipelines, and cloud-native tools
  • Collaborate with platform engineers, data stewards, and domain experts to deliver robust metadata solutions
Technical Skills
  • 6+ years of experience deploying and operating open-source search engines (Solr, Elasticsearch, OpenSearch)
  • Experience with metadata catalog platforms such as Apache Atlas, NADA, CKAN, or equivalent
  • Strong understanding of metadata standards and vocabularies (DCAT, Dublin Core, schema.org, RDF, OWL)
  • Proficiency in SPARQL, JSON-LD, XML, and semantic mapping techniques
  • Familiarity with data discovery, information retrieval, and search tuning techniques
  • Experience integrating catalog and search systems with APIs and microservices
  • Working knowledge of containerization (Docker) and orchestration (Kubernetes)
  • Scripting or development skills in Python, Java, or Scala for search and metadata tooling
  • Familiarity with CI/CD pipelines (e.g., GitLab CI, GitHub Actions) and IaC tools (Terraform, Ansible)
Preferred Qualifications
  • Bachelor's or Master's degree in Computer Science, Information Science, Data Engineering, or related field
  • Experience working in open data, statistical, or public-sector environments
  • Knowledge of FAIR data principles, metadata quality frameworks, and interoperability standards
  • Exposure to graph databases (Blazegraph, GraphDB, Virtuoso) or triple stores
  • Contributions to or experience with semantic web and linked data communities