ABS is seeking an exceptional Data Engineer to join us full-time on our Artificial Intelligence (AI) Practice Team. In this role, you will design and operate the data foundations that power AI chat assistants, custom AI models, and AI-driven process optimization for ABS Consulting clients. You will build robust pipelines that integrate structured and unstructured data, standardize and tag enterprise content, and enable scalable, low-latency retrieval for AI workloads. Working closely with AI engineers, consultants, and domain experts, you will turn messy real-world data into production-grade data assets that deliver measurable business impact.
What You Will Do
- Design, build, and maintain scalable ETL/ELT pipelines to ingest, clean, and transform structured and unstructured data for AI assistants and custom models.
- Integrate diverse knowledge repositories (documents, policies, procedures, standards, databases) into centralized data platforms that support retrieval-augmented generation (RAG) and search.
- Implement data standardization, normalization, and tagging pipelines to align content with enterprise taxonomies and ontologies.
- Collaborate with AI/ML engineers to productionize model‑ready datasets, feature stores, and embeddings for prediction, classification, and optimization use cases.
- Optimize data workflows for reliability, cost, and performance across batch and streaming workloads, including monitoring, alerting, and capacity planning.
- Establish and enforce data quality, lineage, and governance practices to ensure trustworthy inputs to AI systems and process‑automation solutions.
- Automate and templatize common data engineering patterns to accelerate delivery across multiple client engagements and industry domains.
- Partner with consultants and business stakeholders to translate process optimization and analytics requirements into robust, maintainable data solutions.
What You Will Need
Education and Experience
- Bachelor’s degree in Computer Science, Data Engineering, Information Systems, or a closely related technical field; Master’s degree preferred.
- 6+ years of professional data engineering experience designing, building, and operating production data solutions.
- Demonstrated experience working in data‑intensive environments (e.g., analytics platforms, AI/ML workloads, large‑scale content repositories, or enterprise data platforms).
- Hands‑on experience delivering solutions on at least one major cloud provider (AWS, Azure, or Google Cloud), including managed data and analytics services.
Knowledge, Skills, and Abilities
- Strong command of SQL and at least one programming language commonly used in data engineering (Python preferred) for building production‑grade data pipelines.
- Hands‑on experience with modern data processing frameworks and platforms (e.g., Spark, Databricks, Snowflake, BigQuery, Synapse, or similar).
- Proficiency with ETL/ELT orchestration tools and workflows (e.g., Airflow, dbt, Azure Data Factory, AWS Glue, or equivalent).
- Experience designing and operating data lakes/lakehouses and integrating multiple data sources (relational, NoSQL, files, APIs) into cohesive data models.
- Deep experience working with unstructured and semi‑structured data (documents, PDFs, JSON, logs), including content extraction, normalization, and metadata/tagging.
- Familiarity with AI/ML data patterns, including feature engineering, embeddings, vector databases, and retrieval‑augmented generation (RAG) pipelines.
- Strong understanding of data modeling, data quality, data governance, and lineage practices for regulated or compliance‑sensitive environments.
- Proficiency with cloud‑native data services (e.g., S3/ADLS/GCS, managed warehouses, streaming services like Kafka/Kinesis/Event Hubs).
- Solid grounding in software engineering best practices (version control, CI/CD, testing, code review) as applied to data engineering.
- Must hold valid right-to-work status in the UK.
Reporting Relationships
This role reports to a project manager and does not initially include direct reports.