1. Feature Store Development (GCP Vertex AI)
- Build, maintain, and optimize Vertex AI Feature Store for online/offline feature serving.
- Implement feature ingestion, validation, transformation, and monitoring pipelines.
- Create automated feature quality checks, lineage, and feature documentation workflows.
- Collaborate with Data Science teams to onboard features for real-time and batch inference.
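As a rough illustration of the automated feature quality checks mentioned above, the sketch below validates one feature column against a null-rate threshold and an allowed value range. It is a minimal example in plain Python and does not use the Vertex AI SDK; the function name `check_feature` and its parameters are hypothetical.

```python
from typing import Optional, Sequence


def check_feature(
    values: Sequence[Optional[float]],
    max_null_rate: float = 0.05,   # hypothetical default threshold
    min_value: Optional[float] = None,
    max_value: Optional[float] = None,
) -> list[str]:
    """Return a list of problems found; an empty list means the check passed."""
    if not values:
        return ["no values to check"]

    problems = []

    # Share of missing values in the column.
    null_rate = sum(v is None for v in values) / len(values)
    if null_rate > max_null_rate:
        problems.append(f"null rate {null_rate:.2%} exceeds {max_null_rate:.2%}")

    # Range checks apply only to the values that are present.
    present = [v for v in values if v is not None]
    if min_value is not None and any(v < min_value for v in present):
        problems.append(f"value below minimum {min_value}")
    if max_value is not None and any(v > max_value for v in present):
        problems.append(f"value above maximum {max_value}")

    return problems
```

In a pipeline, a non-empty result would typically block ingestion of that feature batch and raise an alert.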
2. RAG, Agents & Memory Engineering
- Design and implement end-to-end RAG pipelines: chunking strategy, vectorization, metadata tagging, and data loaders.
- Build reusable RAG templates for multiple business use cases.
- Manage short-term context and long-term memory, including ephemeral context and session-level state sharing.
- Implement embedding pipelines (text, tables, PDFs, APIs) using Vertex AI, Gemini APIs, or custom embeddings.
- Develop AI agents with actions, tools, Model Context Protocol (MCP) integration, and memory store connectivity.
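To make "chunking strategy" and "metadata tagging" concrete, here is a minimal sketch of one common approach: fixed-size character windows with overlap, each tagged with its source and position. The names `Chunk` and `chunk_text` and the default sizes are assumptions chosen for illustration; production pipelines often split on tokens, sentences, or document structure instead.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_text(text: str, source: str, size: int = 200, overlap: int = 50) -> list[Chunk]:
    """Split text into overlapping fixed-size character windows."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")

    step = size - overlap  # how far each window advances
    chunks = []
    for index, start in enumerate(range(0, len(text), step)):
        piece = text[start:start + size]
        # Metadata lets a retriever cite the source and reassemble neighbours.
        chunks.append(Chunk(piece, {"source": source, "chunk_index": index, "start": start}))
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a boundary fully visible in at least one chunk, at the cost of embedding some text twice.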
3. MCP Data Tools Onboarding
- Integrate enterprise tools into the AI agent ecosystem using MCP-based data connectors.
- Build/extend MCP tools for access to BigQuery, Feature Store, GCS, Pub/Sub, internal APIs, etc.
- Ensure secure, audited access aligned with multi-tenant enterprise data standards.
4. Data & MLOps Engineering
- Develop scalable data pipelines using Python, Vertex AI Pipelines, Dataflow, Cloud Run, and BigQuery.
- Automate model feature refreshes and their synchronization to vector DBs and agent memory stores.
- Build CI/CD workflows using Cloud Build, including Terraform-based infra automation.
5. Infrastructure as Code & DevOps
- Manage all infrastructure provisioning for GCP services (IAM, VPC, Vertex AI, BigQuery, GCS, Artifact Registry) using Terraform.
- Implement monitoring, alerting, and health checks for pipelines and agent runtimes.
- Enforce security, compliance, and cost-optimization best practices.
Technical Skills (Core Skills)
- Python (expert-level): data pipelines, API integration, RAG frameworks, embedding workflows.
- GCP Services: Vertex AI, Feature Store, Cloud Composer, BigQuery, GCS, Cloud Build, Cloud Run, Logging & Monitoring.
- AI/LLM Tools: Gemini, Vertex AI Search & Conversation, embeddings, vector stores (Pinecone/Vertex AI Matching Engine).
- CI/CD: Cloud Build, GitHub Actions, Terraform workflows.
- IaC: Terraform (GCP modules, reusable infra, security policies).
RAG & Agents
- Chunking strategies
- Indexing & vector DBs
- Embedding pipelines
- Session context, episodic memory, long-term memory
- RAG template creation and optimization
- MCP tooling