For our client, an international consulting company specializing in building SaaS platforms, we are looking for a:
Knowledge Graph & Data Infrastructure Engineer
Key Responsibilities
- Design and implement the Strategy Knowledge Graph and its underlying schema.
- Build mechanisms to track how entities and relationships change over time.
- Set up a federated graph architecture that separates public knowledge from private client data.
- Define node and edge structures with essential metadata and provenance (a minimal sketch follows this list).
- Optimize graph queries and retrieval patterns for complex reasoning tasks.
- Manage and integrate multiple storage systems (graph, vector, metadata, documents, blobs).
- Maintain consistency across storage layers through event-driven synchronization.
- Build the hybrid retrieval API that unifies graph and vector results for the reasoning agents.
- Implement caching and partitioning strategies to ensure performance and multi-tenant scalability.
- Build ETL pipelines to ingest processed documents and transform them into graph entities and relationships.
- Implement data normalization, entity resolution, and incremental updates to keep the graph clean and scalable.
- Ensure data quality and temporal consistency across the knowledge base.
- Maintain full provenance so every fact is traceable to its source.
- Develop APIs (REST/GraphQL) for querying, updating, and interacting with the Knowledge Graph.
- Optimize database performance through indexing, caching, and query tuning.
- Implement access control, encryption, audit logging, and compliance mechanisms across the data layer.
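To make the node-and-edge responsibility above concrete, here is a minimal sketch in Python of entities carrying temporal validity and per-fact provenance. Every name in it (Provenance, Node, Edge, and their fields) is an illustrative assumption for this posting, not the client's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Provenance:
    """Where a fact came from: source document, extraction time, confidence."""
    source_id: str          # identifier of the ingested source document
    extracted_at: datetime
    confidence: float = 1.0

@dataclass
class Node:
    """A graph entity with essential metadata, validity window, and provenance."""
    id: str
    label: str                                   # entity type, e.g. "Company"
    properties: dict = field(default_factory=dict)
    valid_from: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    valid_to: datetime | None = None             # None = currently valid
    provenance: list[Provenance] = field(default_factory=list)

@dataclass
class Edge:
    """A typed relationship between two nodes, also carrying provenance."""
    source: str             # node id
    target: str             # node id
    relation: str           # e.g. "ACQUIRED"
    valid_from: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    valid_to: datetime | None = None
    provenance: list[Provenance] = field(default_factory=list)

# Example: a fact traceable to its source document, as the provenance
# responsibility above requires.
src = Provenance(source_id="doc-42", extracted_at=datetime.now(timezone.utc), confidence=0.9)
acme = Node(id="acme", label="Company", properties={"name": "Acme"}, provenance=[src])
deal = Edge(source="acme", target="beta", relation="ACQUIRED", provenance=[src])
```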
Experience & Background
- 3–6+ years of professional experience as a Data Engineer, Backend Engineer, or Knowledge Graph Engineer, working on production-grade data systems.
- Strong hands‑on experience with graph databases (e.g. Neo4j, TigerGraph, Amazon Neptune, or equivalent) including schema design, query optimization, and performance tuning.
- Solid experience designing and operating distributed data infrastructure in cloud environments (AWS, GCP, or Azure).
- Proven ability to design data models and schemas for complex domains, including entities, relationships, and metadata.
- Experience implementing ETL / ELT pipelines for structured and unstructured data, including incremental updates and backfills.
- Strong understanding of data consistency, versioning, and provenance in large-scale data systems.
- Experience integrating multiple storage systems (e.g., graph DB + relational DB + document store + object storage) into a coherent data platform.
- Practical experience building APIs for data access (REST and/or GraphQL) and integrating them into backend services.
- Familiarity with event‑driven architectures (e.g., message queues, streaming, async pipelines) for synchronizing distributed systems (see the sketch after this list).
- Strong skills in Python, Java, or Scala for backend and data engineering workloads.
- Experience with query optimization, indexing strategies, caching, and partitioning for high‑performance data access.
- Understanding of security, access control, and data isolation in multi‑tenant SaaS platforms.
- Strong grasp of software engineering best practices: version control, testing, CI/CD, monitoring, and documentation.
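As a rough illustration of the event-driven synchronization mentioned above, the sketch below fans a single change event out to two in-memory stand-ins for the graph and vector layers, keeping them consistent from one event stream. The store names, event shape, and in-process queue are assumptions made for the example; a real deployment would use a message broker and the actual databases.

```python
import json
import queue

# In-memory stand-ins for the real storage layers; all names are illustrative.
graph_store: dict[str, dict] = {}
vector_store: dict[str, list[float]] = {}

events: "queue.Queue[str]" = queue.Queue()  # stand-in for a message broker

def apply_event(raw: str) -> None:
    """Apply one upsert event to every store so the layers stay consistent."""
    event = json.loads(raw)
    eid = event["entity_id"]
    graph_store[eid] = event["properties"]   # graph layer
    vector_store[eid] = event["embedding"]   # vector layer

# Producer side: publish a change once; the consumer applies it everywhere.
events.put(json.dumps({
    "entity_id": "acme-corp",
    "properties": {"label": "Company", "name": "Acme"},
    "embedding": [0.1, 0.2, 0.3],
}))

while not events.empty():
    apply_event(events.get())
```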
In accordance with Italian Legislative Decrees 198/2006, 215/2003, and 216/2003, candidates are invited to read the privacy notice available below the data request form on the application page (EU Regulation 2016/679).
This offer is addressed to candidates of both sexes, in accordance with Legislative Decree 198/2006 and current equal opportunity legislation. The company promotes an inclusive work environment and guarantees equal opportunities to everyone, regardless of gender, age, origin, sexual orientation, or disability.