Role: Agentic Data Engineer
Duration: 7+ months
Location: Richmond, VA
Details: Remote
Seeking a highly skilled Agentic Data Engineer to design, develop, and deploy data pipelines that leverage agentic AI to solve real-world problems. The ideal candidate will have experience designing data processes that support agentic systems, ensure data quality, and facilitate interaction between agents and data.
Responsibilities:
- Design and develop data pipelines for agentic systems, creating robust data flows to manage complex interactions between AI agents and data sources.
- Train and fine-tune large language models.
- Design and build data architecture, including databases and data lakes, to support various data engineering tasks.
- Develop and manage Extract, Load, Transform (ELT) processes to ensure accurate and efficient data movement from sources to analytical platforms.
- Implement data pipelines that support human-in-the-loop feedback, incorporating human input to improve system performance.
- Work with vector databases to store and retrieve embeddings efficiently.
- Collaborate with data scientists and engineers to preprocess data, train models, and integrate AI into applications.
- Optimize data storage and retrieval for high performance.
- Perform statistical analysis to identify trends and patterns, and consolidate data from multiple sources into consistent formats.
Qualifications:
- Strong fundamentals in data engineering.
- Experience with big data frameworks and platforms such as Spark or Databricks.
- Experience training LLMs with structured and unstructured datasets.
- Understanding of graph databases.
- Experience with Azure Blob Storage, Azure Data Lake Storage, and Azure Databricks.
- Experience implementing Azure Machine Learning, Azure Computer Vision, Azure Video Indexer, Azure OpenAI models, Azure Media Services, and Azure AI Search.
- Ability to determine effective data partitioning criteria and implement partition schemes using Spark.
- Understanding of core machine learning concepts and algorithms.
- Familiarity with cloud computing.
- Strong programming skills in Python and experience with AI/ML frameworks.
- Proficiency with vector databases and embedding models for retrieval tasks.
- Expertise in integrating with AI agent frameworks.
- Experience with cloud AI services, particularly Azure AI.
- Experience working with GIS spatial data, including mapping and geolocation tasks.
- Experience developing AI solutions for Department of Transportation data domains, including data modeling, correlation, hypothesis validation, forecasting, and what-if analysis.
- Bachelor's or master's degree in computer science, AI, data science, or a related field.