
Open Data & Search Solutions Engineer

Blackfluo.ai

Paris

On-site

EUR 50 000 - 70 000

Full-time

Posted 30+ days ago


Job summary

An innovative tech company in France seeks a Data Discovery Engineer to enhance search and metadata capabilities across its platforms. The role involves deploying open-source search engines and managing metadata catalogs. Ideal candidates will have 6+ years of experience with relevant technologies and a strong understanding of data standards and interoperability. Join us to improve data discoverability and support data stewardship efforts.

Qualifications

  • 6+ years of experience deploying and operating open-source search engines.
  • Experience with metadata catalog platforms.
  • Strong understanding of metadata standards and vocabularies.

Responsibilities

  • Design and deploy open-source search engine solutions.
  • Deploy and maintain metadata catalog systems.
  • Implement Linked Open Data techniques.

Skills

Open-source search engines experience
Metadata catalog platforms experience
Understanding of metadata standards
Proficiency in SPARQL and JSON-LD
Knowledge of containerization tools
Scripting in Python, Java, or Scala

Education

Bachelor's or Master's in Computer Science

Tools

Apache Solr
Elasticsearch
Apache Atlas

Job description

About the job: Open Data & Search Solutions Engineer

We are seeking a Data Discovery Engineer to lead the development and integration of search, metadata cataloguing, and Linked Open Data (LOD) capabilities across our data platform ecosystem. The role focuses on implementing and managing open-source search engines (Solr, OpenSearch, Elasticsearch), data cataloging tools (Apache Atlas, NADA), and semantic web technologies (DCAT, RDF, schema.org, Croissant) to improve data discoverability, interoperability, and reuse.

Key Responsibilities
Search Platform Integration
  • Design and deploy open-source search engine solutions such as Apache Solr, OpenSearch, or Elasticsearch
  • Optimize indexing strategies for structured and unstructured data from diverse data sources
  • Develop custom search features (facets, filters, synonyms, auto-suggestions) tailored to metadata and dataset discovery
  • Implement scalable search pipelines with support for multilingual and full-text search capabilities
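As a rough illustration of the facet and filter features above, here is a minimal sketch of a Solr select query assembled with Python's standard library. The host, core name (`datasets`), and field names (`title`, `language`, `publisher`) are invented for illustration, not taken from the posting:

```python
from urllib.parse import urlencode

# Hypothetical Solr core endpoint; host and core name are illustrative.
SOLR_BASE = "http://localhost:8983/solr/datasets/select"

params = {
    "q": "title:climate",        # full-text match on a title field
    "fq": "language:en",         # filter query (a "filter" facet)
    "facet": "true",             # enable faceting
    "facet.field": "publisher",  # return counts per publisher
    "rows": 10,                  # page size
}

query_url = SOLR_BASE + "?" + urlencode(params)
print(query_url)
```

The same `q`/`fq`/`facet.field` parameter set translates almost directly to OpenSearch and Elasticsearch query DSL equivalents (`query`, `filter`, and `aggs`).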
Metadata Cataloging & Data Discovery
  • Deploy and maintain metadata catalog systems such as Apache Atlas, NADA, or CKAN
  • Ensure metadata harvesting and harmonization across multiple sources using catalog APIs and connectors
  • Integrate catalog systems with enterprise data lakes, APIs, and external repositories
  • Establish metadata governance policies and data stewardship workflows in collaboration with data owners
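The harvesting bullet above can be sketched as a small paging loop against a CKAN-style catalog API. The catalog URL is hypothetical (CKAN's real search action is `package_search`, paged with `rows`/`start`); the `fetch` callable is injected so the paging logic works without a live catalog:

```python
from urllib.parse import urlencode

# Hypothetical catalog host; the /api/3/action/package_search path mirrors CKAN's API.
CKAN_SEARCH = "https://catalog.example.org/api/3/action/package_search"

def harvest(fetch, page_size=100):
    """Collect all dataset records, paging with rows/start parameters.

    `fetch` is any callable taking a URL and returning the decoded
    JSON response, so this sketch is testable without network access.
    """
    records, start = [], 0
    while True:
        url = CKAN_SEARCH + "?" + urlencode({"rows": page_size, "start": start})
        result = fetch(url)["result"]
        records.extend(result["results"])
        start += page_size
        if start >= result["count"]:
            break
    return records
```

Harmonization across sources would then normalize each harvested record to a common schema before indexing.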
Linked Open Data (LOD) Enablement
  • Implement Linked Open Data techniques using DCAT, RDF, schema.org, and W3C standards
  • Publish datasets and metadata as linked data endpoints for reuse and interoperability
  • Map internal metadata schemas to external ontologies (e.g., DCAT-AP, schema.org, Croissant)
  • Build SPARQL endpoints or graph-based access for semantic querying of data assets
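To make the publishing bullets concrete, here is a minimal sketch of a DCAT dataset description serialized as JSON-LD using only the standard library. The dataset URI, title, and publisher are invented for illustration; the `dcat:` and `dct:` namespace URIs are the standard W3C/Dublin Core ones:

```python
import json

# Minimal DCAT dataset record as JSON-LD; identifiers below are illustrative.
dataset = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
    },
    "@id": "https://example.org/dataset/population-2024",
    "@type": "dcat:Dataset",
    "dct:title": "Population estimates 2024",
    "dct:publisher": {"@id": "https://example.org/org/statistics-office"},
    "dcat:keyword": ["population", "demographics"],
}

doc = json.dumps(dataset, indent=2)
print(doc)
```

A record like this, loaded into a triple store, becomes queryable over a SPARQL endpoint (e.g. selecting all `dcat:Dataset` resources and their `dct:title` values).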
Platform Automation & Integration
  • Develop and maintain pipelines for metadata ingestion, enrichment, validation, and publication
  • Integrate catalog and search platforms with analytics tools, APIs, and user-facing portals
  • Support deployment of semantic technologies using containers, CI/CD pipelines, and cloud-native tools
  • Collaborate with platform engineers, data stewards, and domain experts to deliver robust metadata solutions
Technical Skills
  • 6+ years of experience deploying and operating open-source search engines (Solr, Elasticsearch, OpenSearch)
  • Experience with metadata catalog platforms such as Apache Atlas, NADA, CKAN, or equivalent
  • Strong understanding of metadata standards and vocabularies (DCAT, Dublin Core, schema.org, RDF, OWL)
  • Proficiency in SPARQL, JSON-LD, XML, and semantic mapping techniques
  • Familiarity with data discovery, information retrieval, and search tuning techniques
  • Experience integrating catalog and search systems with APIs and microservices
  • Working knowledge of containerization (Docker) and orchestration (Kubernetes)
  • Scripting or development skills in Python, Java, or Scala for search and metadata tooling
  • Familiarity with CI/CD pipelines (e.g., GitLab CI, GitHub Actions) and IaC tools (Terraform, Ansible)
Preferred Qualifications
  • Bachelor's or Master's degree in Computer Science, Information Science, Data Engineering, or related field
  • Experience working in open data, statistical, or public-sector environments
  • Knowledge of FAIR data principles, metadata quality frameworks, and interoperability standards
  • Exposure to graph databases (Blazegraph, GraphDB, Virtuoso) or triple stores
  • Contributions to or experience with semantic web and linked data communities