Senior Scientific Data Engineer

Chemify Ltd

Glasgow

On-site

GBP 80,000 - 100,000

Full time

Today

Job summary

A leading chemical technology firm in Glasgow is seeking a Senior / Lead Scientific Data Engineer to develop scalable data systems for chemistry and drug discovery. The role requires strong programming skills in Python and proficiency in SQL/PostgreSQL, along with a background in data engineering, to solve complex technical challenges. Join a dynamic team to drive innovation in the field of chemical research.

Qualifications

  • 5+ years of commercial data engineering experience, ideally in life sciences or AI drug discovery.
  • Strong programming skills in Python with experience in data-intensive applications.
  • Deep understanding of ETL concepts and experience building production-grade pipelines.

Responsibilities

  • Develop scalable data models and workflows in AI chemistry synthesis and manufacturing.
  • Lead the architecture and optimisation of the data warehouse and access layers.
  • Design robust data pipelines for various data sets.

Skills

SQL/PostgreSQL
Python
Data Engineering
ETL concepts
Data modeling
Cloud-based data services
Mentoring
Collaboration

Education

BSc in a scientific discipline

Tools

Argo Workflows
BI tools
AWS/GCP/Azure
Terraform
Claude Code
Cursor

Job description

About Chemify

Chemify is revolutionising chemistry. We are creating a future where the synthesis of previously unimaginable molecules, drugs, and materials is instantly accessible. By combining AI, robotics, and the world's largest continually expanding database of chemical programs, we are accelerating chemical discovery to improve quality of life and extend the reach of humanity.

Job Description

We are seeking a Senior / Lead Scientific Data Engineer to lead the development of scalable, reliable data systems for scientific and experimental data.

You will architect and maintain pipelines that ingest, clean, and serve chemistry and drug discovery datasets, ensuring high performance and reproducibility. This role requires a strong foundation in Python, PostgreSQL, and modern data engineering practices, along with a keen interest in working in a cross-functional environment spanning software, chemistry, operations and program management.

If you enjoy solving complex technical challenges that make a real-world impact, are a natural communicator, and are energised by working closely with scientists using cutting-edge technologies, then we'd love to welcome you to our team.

Key Responsibilities
  • Develop scalable data models and workflows covering a wide range of use cases in AI chemistry synthesis and manufacturing.
  • Lead on the architecture and optimisation of our data warehouse and data access layers, enabling our analytics team to rapidly deliver key operational insights.
  • Design, implement, and maintain robust data pipelines for a wide range of internal, client specific and literature-based data sets.
  • Develop scalable frameworks for data wrangling, transformation, and validation.
  • Define and champion data engineering best practices (versioning, testing, documentation, governance).
  • Enable feature teams by providing expertise in domain modelling and query optimisation.
  • Mentor junior colleagues, providing guidance on technical challenges.
  • Contribute to team-wide initiatives, including code reviews, design discussions, process improvements and workstream planning.
What you'll bring
  • BSc in a scientific discipline.
  • 5+ years of commercial data engineering experience, preferably within a life sciences or AI drug discovery context.
  • Expertise in SQL/PostgreSQL (schema design, query optimisation, indexing, partitioning).
  • Experience enabling analytics teams by building data models for BI tools and dashboards.
  • Strong programming skills in Python, with experience in building and maintaining data-intensive applications.
  • Deep understanding of ETL concepts and building production-grade pipelines.
  • Experience orchestrating workflows and pipelines with Argo Workflows, Prefect or similar.
  • Familiarity with cloud-based data services (AWS/GCP/Azure).
  • Experience using AI-assisted coding and development tools (e.g. Claude Code, Cursor) as part of modern best practices.
  • Strong communication skills and a collaborative approach to mentoring and teamwork.
Beneficial Skills
  • Interest in chemistry, manufacturing, or robotics.
  • Hands-on experience with scientific or drug discovery data (chemical, biological, or lab data).
  • Interest in semantic technologies (Graph Databases, Ontologies).
  • Exposure to ML engineering or high-performance compute environments.
  • Experience with Infrastructure as Code tools such as Terraform and AWS CDK.