Knowledge Graph & Data Infrastructure Engineer
For our client, an international consulting company specializing in building SaaS platforms, we are seeking a Knowledge Graph & Data Infrastructure Engineer.
Responsibilities
- Design and implement the strategy, architecture, and underlying schema for a Knowledge Graph.
- Build mechanisms to track how entities and relationships change over time.
- Set up a federated graph architecture that separates public knowledge from private client data.
- Define node and edge structures with essential metadata and provenance (an illustrative sketch follows this list).
- Optimize graph queries and retrieval patterns for complex reasoning tasks.
- Manage and integrate multiple storage systems (graph, vector, metadata, documents, blobs).
- Maintain consistency across storage layers through event‑driven synchronization.
- Build a hybrid retrieval API that unifies graph and vector results for reasoning agents.
- Implement caching and partitioning strategies to ensure performance and multi‑tenant scalability.
- Build ETL pipelines to ingest processed documents and transform them into graph entities and relationships.
- Implement data normalization, entity resolution, and incremental updates to keep the graph clean and scalable.
- Ensure data quality and temporal consistency across the knowledge base.
- Maintain full provenance so every fact is traceable to its source.
- Develop APIs (REST/GraphQL) for querying, updating, and interacting with the Knowledge Graph.
- Optimize database performance through indexing, caching, and query tuning.
- Implement access control, encryption, audit logging, and compliance mechanisms across the data layer.
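For illustration only, here is a minimal Python sketch of what the node and edge structures referenced above might look like, with temporal validity and provenance attached. Every name in it (Provenance, Node, Edge, and all field choices) is an assumption made for the example, not the client's actual schema:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass(frozen=True)
class Provenance:
    """Traces a fact back to its source, per the provenance bullet above."""
    source_document_id: str   # ID of the ingested document the fact came from
    extraction_method: str    # e.g. "manual", "extraction-pipeline-v2" (illustrative)
    recorded_at: datetime     # when the fact entered the graph

@dataclass
class Node:
    """A graph entity carrying temporal validity and provenance metadata."""
    node_id: str
    label: str                                   # entity type, e.g. "Company"
    properties: dict = field(default_factory=dict)
    valid_from: Optional[datetime] = None        # when the fact became true
    valid_to: Optional[datetime] = None          # None means still valid
    provenance: list[Provenance] = field(default_factory=list)

@dataclass
class Edge:
    """A typed, directed relationship with the same metadata as nodes."""
    edge_id: str
    relation: str                                # e.g. "EMPLOYS", "ACQUIRED"
    source_id: str                               # node_id of the source node
    target_id: str                               # node_id of the target node
    valid_from: Optional[datetime] = None
    valid_to: Optional[datetime] = None
    provenance: list[Provenance] = field(default_factory=list)
```

Keeping validity intervals on both nodes and edges is one common way to meet the "track how entities and relationships change over time" requirement; a bi-temporal design (valid time plus transaction time) is another option the role might call for.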
Qualifications
- 3–6+ years of professional experience as a Data Engineer, Backend Engineer, or Knowledge Graph Engineer, working on production‑grade data systems.
- Strong hands‑on experience with graph databases (e.g., Neo4j, TigerGraph, Amazon Neptune, or equivalent), including schema design, query optimization, and performance tuning.
- Solid experience designing and operating distributed data infrastructure in cloud environments (AWS, GCP, or Azure).
- Proven ability to design data models and schemas for complex domains, including entities, relationships, and metadata.
- Experience implementing ETL/ELT pipelines for structured and unstructured data, including incremental updates and backfills.
- Strong understanding of data consistency, versioning, and provenance in large‑scale data systems.
- Experience integrating multiple storage systems (e.g., graph DB + relational DB + document store + object storage) into a coherent data platform.
- Practical experience building APIs for data access (REST and/or GraphQL) and integrating them into backend services.
- Familiarity with event‑driven architectures (e.g., message queues, streaming, async pipelines) for synchronizing distributed systems; a toy sketch of this pattern follows the list.
- Strong skills in Python, Java, or Scala for backend and data engineering workloads.
- Experience with query optimization, indexing strategies, caching, and partitioning for high‑performance data access.
- Understanding of security, access control, and data isolation in multi‑tenant SaaS platforms.
- Strong grasp of software engineering best practices: version control, testing, CI/CD, monitoring, and documentation.
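Again purely illustrative: a toy sketch of the event-driven synchronization pattern mentioned above, in which an idempotent consumer applies change events from the graph write path to a secondary store. The event shape and handler name are assumptions, and an in-memory dict stands in for the real vector or document store:

```python
import json
from dataclasses import dataclass

@dataclass
class ChangeEvent:
    """A change record the graph write path would publish to a queue or stream."""
    entity_id: str
    operation: str          # "upsert" or "delete"
    payload: dict

def apply_event(event: ChangeEvent, store: dict) -> None:
    """Idempotent handler: replaying the same event leaves the store
    unchanged, which keeps retries and at-least-once delivery safe."""
    if event.operation == "upsert":
        store[event.entity_id] = event.payload
    elif event.operation == "delete":
        store.pop(event.entity_id, None)

# Usage: a dict stands in for the downstream store being kept in sync.
store: dict = {}
raw = '{"entity_id": "acme-corp", "operation": "upsert", "payload": {"name": "ACME"}}'
apply_event(ChangeEvent(**json.loads(raw)), store)
assert store["acme-corp"]["name"] == "ACME"
```

Idempotency is the design point here: with at-least-once delivery from a queue or stream, the same event may arrive more than once, and the handler must converge to the same state either way.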
In accordance with D.Lgs. 198/2006, D.Lgs. 215/2003, and D.Lgs. 216/2003, candidates are invited to read the privacy notice available below the data request form on the application page (EU Regulation 2016/679). This offer is addressed to candidates of both sexes, in accordance with D.Lgs. 198/2006 and current equal opportunity legislation. The company promotes an inclusive working environment and guarantees equal opportunities to everyone, regardless of gender, age, origin, sexual orientation, or disability.