The AI revolution is here, transforming the workplace at an unprecedented pace. Routine tasks are being automated, driving efficiency but also reshaping the job market: new opportunities are emerging and traditional roles are evolving. Ignis AI is building the next generation of Talent Acquisition and Talent Management systems that empower individuals and businesses to thrive in this changing landscape. Our goal is to enable individuals to advance their careers and to allow organizations to adapt seamlessly to evolving workforce demands. By leveling the playing field, we help everyone unlock their potential and achieve success. At Ignis AI, we embrace skills-based hiring, including skills such as creativity, communication, and collaboration.
We’re building a real-time, intelligent platform powered by machine learning and large language models (LLMs). Our foundation is a robust data architecture that supports everything from analytics to LLM-driven applications, and we’re looking for a Principal Data Architect to lead that foundation. This role combines classic data engineering excellence with next-generation challenges around LLM readiness, data pipelines for embeddings, and retrieval-augmented generation (RAG) systems. As our Principal Data Architect, you’ll play a foundational role in designing the infrastructure, workflows, and data culture that powers our entire product ecosystem.
This is a remote role based in the United States; ideal candidates are located on the East Coast. Occasional travel is anticipated.
Job Responsibilities:
- Architect and evolve our data platform to support structured, semi-structured, and unstructured data pipelines across real-time and batch workloads.
- Build and optimize pipelines that serve LLM fine-tuning, inference, and retrieval workflows, including preprocessing text, generating embeddings, and chunking documents for context injection.
- Collaborate with ML engineers to operationalize RAG pipelines, feature stores, and model inputs from production data streams.
- Own and define data contracts, schemas, lineage, and quality enforcement across the platform.
- Own data infrastructure end-to-end, from ingestion and transformation through cataloging and versioning.
- Design and implement streaming ingestion pipelines with Kafka or Redis for low-latency use cases.
- Implement and manage vector search infrastructure (e.g., Weaviate, Pinecone, FAISS) to support LLM-enhanced retrieval systems.
- Work cross-functionally to productionize data-driven features, signals, and metrics that power both analytics and intelligent experiences.
- Contribute to data governance, cataloging, access control, and observability across the ecosystem.
- Evaluate and integrate best-in-class tools for embedding generation, document store maintenance, and metadata tracking.
- Define and evolve our data lakehouse architecture, balancing batch, real-time, and streaming needs.
- Collaborate with the DevOps/MLOps engineer to build reliable, production-ready ML data pipelines that integrate into our broader platform.
- Define modeling standards and collaborate closely with Data Science and Product to ensure quality, performance, and usability.
- Evaluate and introduce technologies and frameworks that improve scale, efficiency, and maintainability.
Tech Stack:
- Languages: Python, SQL, Bash
- Orchestration: Airflow, Dagster, Prefect
- Data Governance & Quality: Great Expectations, OpenMetadata, DataHub
- LLM Tools: LangChain, Haystack, Hugging Face, OpenAI, Cohere
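To give candidates a concrete picture of the retrieval workflow this role supports (chunking documents, generating embeddings, and similarity-based retrieval for context injection), here is a minimal, self-contained Python sketch. It is purely illustrative: the hashed bag-of-words embedding stands in for a real embedding model, and none of the function names reflect our actual codebase.

```python
import hashlib
import math

def chunk(text, size=8):
    # Split text into fixed-size word chunks (a toy stand-in for a
    # tokenizer-aware document chunker).
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text, dim=64):
    # Hashed bag-of-words vector, L2-normalized (a toy stand-in for a
    # real embedding model such as those listed in the tech stack).
    vec = [0.0] * dim
    for w in text.lower().split():
        idx = int(hashlib.md5(w.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def retrieve(query, chunks, k=2):
    # Return the top-k chunks by cosine similarity to the query; in
    # production this lookup would hit a vector index instead.
    qv = embed(query)
    scored = [(sum(a * b for a, b in zip(qv, embed(c))), c) for c in chunks]
    return [c for _, c in sorted(scored, key=lambda t: t[0], reverse=True)[:k]]

doc = ("Kafka handles streaming ingestion for low latency workloads. "
       "Embeddings are generated per chunk and stored in a vector index. "
       "Retrieved chunks are injected into the LLM prompt as context.")
chunks = chunk(doc)
top = retrieve("vector index for embeddings", chunks, k=1)
```

In production, the chunker, embedding model, and similarity search would each be replaced by the tools above (e.g., LangChain splitters, a hosted embedding API, and a vector store such as Weaviate, Pinecone, or FAISS), but the overall chunk, embed, retrieve, inject flow is the same.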
Must-Have Experience
- 7+ years of experience in data engineering, platform engineering, or data architecture.
- Proven experience designing and implementing enterprise-grade or high-scale data platforms.
- Deep fluency in data modeling, data warehousing, and data pipeline orchestration.
- Strong command of streaming systems and event-driven data architectures.
- A demonstrated ability to scale systems, debug complex data issues, and enforce best practices across a team.
- Experience designing scalable data pipelines using orchestration tools and cloud-native data platforms.
- Proven ability to build low-latency, real-time and batch ETL/ELT workflows.
- Comfort working with unstructured data, including text corpora and document metadata.
- Exposure to LLM-adjacent workflows, including fine-tuning, embedding generation, vector similarity search, or context-based retrieval.
- Understanding of how to prepare and optimize data for tokenization, chunking, semantic search, and contextual augmentation.
Preferred Experience
- Experience operationalizing machine learning pipelines and managing feature engineering workflows.
- Familiarity with data privacy, regulatory compliance, or PII governance frameworks.
- Exposure to domain-driven design, data mesh, or data product thinking.
- Familiarity with data-for-AI patterns, including training set curation, labeling workflows, and long-document management.
- Experience with prompt engineering, RAG architectures, or semantic indexing.
- Prior experience building data products for ML- and LLM-enabled applications in fast-moving startup environments.
How You Work:
- You think strategically and architect for the future — but you can deliver incrementally.
- You value pragmatism: you choose the right level of abstraction, not the most complex.
- You love enabling other teams — data as a product is how you think.
- You’re a strong communicator and collaborator across engineering, product, and data science.
- You take pride in building high-trust systems that are observable, resilient, and explainable.
- You think in systems and understand how data flows power downstream AI systems, not just dashboards.
- You are excited about LLMs, but more excited about making them usable, reliable, and cost-effective in production.
- You thrive in fast-paced, collaborative environments and are not afraid to define architecture from the ground up.
As part of our skills-based selection process, candidates may be asked to complete online assessments to help us better understand their fit for the role.
What We Offer:
- Opportunity to lead and shape the product strategy of a forward-thinking company.
- Collaborative and inclusive work environment.
- Competitive compensation package and benefits.
- Compensation: $170,000 - $185,000. Negotiable based on education, experience, and skills.
- Benefits include paid time off and 401(k) plans; medical, dental, and vision insurance are available after an introductory period.
If you have a passion for helping to bring new products to market and a desire to make a real impact on the future of work, we encourage you to apply!
Seniority level: Mid-Senior level
Job function: Engineering and Information Technology
Industries: Software Development