GenAI Data Engineer

Motion Recruitment Partners LLC

Scituate (MA)

Remote

USD 100,000 - 130,000

Full time

3 days ago

Job summary

A fast-growing LegalTech company is seeking a mid-level Data Engineer to expand its GenAI capabilities. The role involves building scalable data pipelines, managing vector databases, and collaborating with cross-functional teams to support legal industry applications. Candidates with 3-6 years of data engineering experience and strong proficiency in SQL and Python are preferred. The position is fully remote with flexible hours.

Benefits

Flexible hours

Impact on the future of LegalTech

Collaboration with a passionate team

Qualifications

3-6 years of experience in data engineering or data infrastructure roles.
Proficiency in SQL and Python, especially for AI/data pipelines.
Hands-on experience with AWS services (S3, Lambda, RDS, ECS).

Responsibilities

Build, optimize, and maintain scalable data pipelines and ETL processes.
Implement RAG-based GenAI workflows using LangChain and OpenAI.
Integrate and manage vector databases (e.g., Pinecone, Weaviate, FAISS).

Skills

SQL

Python

AWS services

LLM tools and frameworks

Data architecture

Data normalization

GenAI Data Engineer

Location: Fully Remote (U.S.-based, New England preferred)
Industry: LegalTech / SaaS
Level: Mid-Level (3–6 years of experience)

About Us
We’re a fast-growing LegalTech and analytics SaaS company used by the nation’s top law firms, law schools, and legal recruiters to track and analyze legal talent flows. Our platform helps drive smarter hiring, retention, and market insights in the legal industry using modern data infrastructure and emerging AI capabilities.

As we expand our product offering with more GenAI features, we’re looking for a mid-level Data Engineer with hands-on experience in LLMs, RAG architecture, and vector databases to help us scale.

What You’ll Do

Build, optimize, and maintain scalable data pipelines and ETL processes
Implement RAG-based GenAI workflows using tools like LangChain and OpenAI
Integrate and manage vector databases (e.g., Pinecone, Weaviate, FAISS)
Work with structured and unstructured data to support analytics and AI-driven search
Collaborate cross-functionally with backend engineers, product, and data science
Support legal industry-specific use cases like entity resolution, summarization, and document classification

What We’re Looking For

3–6 years of experience in data engineering or data infrastructure roles
Proficiency in SQL and Python (especially for AI/data pipelines)
Hands-on experience with AWS services (e.g., S3, Lambda, RDS, ECS)
Experience with LLM tools and frameworks (e.g., LangChain, LlamaIndex, OpenAI APIs)
Comfort working with vector databases and retrieval-based AI
Strong understanding of scalable data architecture and data normalization

Nice-to-Haves

Experience with RAG pipelines in production
Familiarity with legal data or professional services industries
Exposure to orchestration tools like Airflow, Prefect, or similar
Experience with embeddings, chunking strategies, and prompt engineering

Why Join Us?

Make an impact on the future of LegalTech and AI-powered analytics
Collaborate with a small, passionate, high-performance team
Own meaningful projects that ship fast and evolve quickly
100% remote with flexible hours (New England-based candidates preferred)
