Senior Data Engineer – Knowledge Graph & AI Platform

New Era Solutions · Hyderabad · Hybrid
INR 15,00,000 – 25,00,000 · Full time

Job summary

A leading technology firm in India is looking for a Senior Data Engineer to build and maintain core data infrastructure for its enterprise AI platform. This role involves designing scalable data pipelines, developing knowledge graphs, and preparing both structured and unstructured data for AI applications. Applicants should have a strong background in Python, MongoDB, and data engineering practices. The position offers competitive compensation and flexible remote/hybrid work arrangements.

Job description

Senior Data Engineer – Knowledge Graph & AI Platform

Location: Remote / Hybrid (India)

Employment Type: Full-Time

Reporting To: Platform Architect

Role Overview

The Senior Data Engineer will build and maintain the core data infrastructure for an enterprise AI platform. This role focuses on designing scalable data pipelines, developing knowledge graphs, and preparing structured and unstructured data for AI and LLM-based applications.

Roles & Responsibilities

Data Pipeline Development
  • Design and build scalable data ingestion pipelines from enterprise systems (ERP, documentation tools, version control, and project management tools)
  • Develop connectors for structured, semi-structured, and unstructured data
  • Implement incremental data loads, change data capture (CDC), and real-time sync (a minimal ingestion sketch follows this list)
  • Ensure data quality through validation, deduplication, and lineage tracking
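
For illustration only, a minimal sketch of the kind of idempotent incremental load this involves, written in Python with pymongo. The collection names, the sync-state document, and the fetch_changed_records() source function are all hypothetical, not part of the actual platform.

```python
# Idempotent incremental load sketch. Re-running the job never duplicates
# rows: records are upserted on their natural key, and the watermark only
# advances after the batch succeeds. Assumes a local MongoDB instance and a
# hypothetical fetch_changed_records(since=...) source returning dicts with
# timezone-aware "updated_at" timestamps.
from datetime import datetime, timezone

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017", tz_aware=True)
db = client["platform"]

def load_incremental(fetch_changed_records):
    # Last high-water mark; fall back to the epoch on the first run.
    state = db.sync_state.find_one({"_id": "erp_orders"}) or {}
    watermark = state.get("watermark", datetime(1970, 1, 1, tzinfo=timezone.utc))

    latest = watermark
    for rec in fetch_changed_records(since=watermark):
        # Upsert keyed on the source's natural key: safe to replay.
        db.orders.update_one({"source_id": rec["source_id"]},
                             {"$set": rec}, upsert=True)
        latest = max(latest, rec["updated_at"])

    # Persist the new watermark only after the whole batch succeeded.
    db.sync_state.update_one({"_id": "erp_orders"},
                             {"$set": {"watermark": latest}}, upsert=True)
```
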
Knowledge Graph Engineering
  • Design ontologies and graph schemas for complex enterprise relationships
  • Implement entity resolution and relationship inference across data sources (a matching sketch follows this list)
  • Build APIs and query interfaces for graph traversal
  • Optimize graph storage and query performance for large-scale usage
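
A minimal entity-resolution sketch in plain Python, assuming records that carry only an id and a name. A production system would block on a cheap key instead of comparing every pair and would use richer features, but the shape of the matching step looks like this.

```python
# Entity-resolution sketch: two sources' records match when their normalized
# names are near-identical. A real pipeline would block on a cheap key first
# (e.g. the first name token) rather than compare all pairs.
from difflib import SequenceMatcher

def normalize(name: str) -> str:
    return " ".join(name.lower().replace(",", " ").replace(".", " ").split())

def resolve(source_a, source_b, threshold=0.9):
    matches = []
    for a in source_a:
        for b in source_b:
            score = SequenceMatcher(
                None, normalize(a["name"]), normalize(b["name"])).ratio()
            if score >= threshold:
                matches.append((a["id"], b["id"], round(score, 3)))
    return matches

crm = [{"id": "c1", "name": "Acme Corp."}]
erp = [{"id": "e7", "name": "ACME Corp"}]
print(resolve(crm, erp))  # [('c1', 'e7', 1.0)]
```
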
Enterprise Data Integration
  • Extract and model enterprise metadata such as business rules and data dictionaries
  • Parse and semantically index documents and code artifacts (a parsing sketch follows this list)
  • Build integrations with enterprise APIs and internal platforms
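
As a sketch of what semantic indexing of code artifacts can start from, the standard-library ast module can pull out symbols and docstrings for indexing alongside documents; the record layout here is illustrative.

```python
# Sketch: extract symbols and docstrings from a Python source file so they
# can be indexed next to documents. The record layout is illustrative.
import ast

def index_code_artifact(path: str) -> list[dict]:
    with open(path, encoding="utf-8") as f:
        tree = ast.parse(f.read())
    records = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            records.append({
                "artifact": path,
                "symbol": node.name,
                "kind": type(node).__name__,
                "line": node.lineno,
                "doc": ast.get_docstring(node) or "",
            })
    return records
```
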
AI & LLM Data Infrastructure
  • Prepare structured and contextual data for LLM consumption
  • Design embedding strategies and manage vector databases for semantic search (a search sketch follows this list)
  • Build memory and context management systems for stateful AI applications
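
A minimal semantic-search sketch over in-memory vectors. Here embed() is a hypothetical stand-in for whatever embedding model is used, and a real deployment would delegate storage and ranking to a vector database such as Qdrant or Pinecone.

```python
# Semantic-search sketch over an in-memory store of (doc, vector) pairs.
# embed() is a hypothetical embedding function; any model that maps text to
# a fixed-length vector fits the signature.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def search(query: str, store, embed, top_k: int = 3):
    q = embed(query)
    scored = sorted(((cosine(q, vec), doc) for doc, vec in store),
                    key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```
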
Required Skills

Core Requirements
  • 5+ years of Data Engineering experience with production-grade pipelines
  • Strong Python skills (clean, testable, maintainable code)
  • MongoDB expertise (schema design, aggregation pipelines, indexing, performance tuning)
  • Vector databases experience (Qdrant, Pinecone, Weaviate, pgvector)
  • Document processing experience (chunking, metadata extraction, PDFs/Word/HTML; LangChain or similar); a chunking sketch follows this list
  • Strong SQL skills (complex queries, joins, window functions, optimization)
  • ETL/ELT at scale (incremental loads, CDC, idempotent pipelines)
  • Pipeline orchestration tools (Airflow, Dagster, Prefect, or similar)
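
A minimal chunking sketch in plain Python, using fixed-size character windows with overlap; the sizes and metadata fields are illustrative defaults, not the platform's actual scheme.

```python
# Chunking sketch: fixed-size character windows with overlap, each chunk
# carrying metadata for later retrieval. 800/200 are illustrative defaults.
def chunk(text: str, source: str, size: int = 800, overlap: int = 200):
    step = size - overlap
    chunks = []
    for i, start in enumerate(range(0, max(len(text) - overlap, 1), step)):
        chunks.append({
            "chunk_id": f"{source}-{i}",
            "source": source,
            "offset": start,
            "text": text[start:start + size],
        })
    return chunks
```
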
Good to Have / Strong Plus
  • Experience building production RAG pipelines
  • Deep understanding of embedding models and dimensionality
  • Graph databases (Neo4j) and Cypher query expertise
  • LLM application development using LangChain or LangGraph
  • Streaming systems (Kafka, Flink) for real-time pipelines
  • Hybrid search (vector + keyword/metadata filtering); a small sketch follows this list
  • Apache Spark for large-scale transformations
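
A small hybrid-search sketch: a cheap keyword pre-filter narrows the candidate set, then dense vector similarity ranks what survives. The field names and the inline cosine are illustrative.

```python
# Hybrid-search sketch: keyword pre-filter first, dense ranking second.
import math

def _cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def hybrid_search(query_vec, keyword, docs, top_k=5):
    # The metadata/keyword filter is cheap and cuts the candidate set...
    candidates = [d for d in docs if keyword.lower() in d["text"].lower()]
    # ...then vector similarity orders what survives.
    return sorted(candidates,
                  key=lambda d: _cosine(query_vec, d["vector"]),
                  reverse=True)[:top_k]
```
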
What We Offer
  • Work on cutting-edge AI and knowledge graph technologies
  • Build foundational infrastructure for an enterprise AI platform
  • Competitive compensation with equity options
  • Flexible remote/hybrid work setup
  • Learning budget and conference support