Senior Data Scientist - AI/ML (CADD)

Chemify Ltd

Glasgow

On-site

GBP 65,000 - 90,000

Full time

Today

Job summary

A leading chemistry firm in Scotland is seeking a Senior AI/ML Data Scientist to develop machine learning models for computer-aided drug design. The role involves collaboration with scientists and engineers to optimize predictive models and enhance drug discovery processes. Candidates should have a strong background in machine learning, proficiency in Python, and a relevant advanced degree. An exciting opportunity to make a real-world impact in the field of chemistry awaits you.

Qualifications

  • 5+ years of industry or academic experience in relevant fields.
  • Demonstrated proficiency in Python and deep learning frameworks.
  • Theoretical and practical knowledge of modern machine learning architectures.

Responsibilities

  • Design and develop generative models for drug discovery.
  • Architect scalable MLOps pipelines for large datasets.
  • Collaborate with cross-disciplinary teams to define project goals.

Skills

Machine Learning
Deep Learning
Python
Generative Models
Graph Neural Networks
Data Analysis

Education

MSc or PhD in Computer Science or related field

Tools

PyTorch
TensorFlow
Git

Job description

About Chemify

Chemify is revolutionising chemistry. We are creating a future where the synthesis of previously unimaginable molecules, drugs, and materials is instantly accessible. By combining AI, robotics, and the world's largest continually expanding database of chemical programs, we are accelerating chemical discovery to improve quality of life and extend the reach of humanity.

Job Description

We seek a talented and motivated Senior AI/ML Data Scientist to pioneer the development and application of cutting‑edge machine learning models for computer‑aided drug design (CADD) and small molecule discovery.

You will be joining a dynamic, cross‑disciplinary team of computational scientists, medicinal chemists, and engineers. Your primary focus will be on architecting, training, and deploying sophisticated models to predict molecular properties, generate novel compounds, and ultimately accelerate our drug discovery pipelines.

To be successful in this role, you’ll need deep expertise in modern machine learning, particularly generative AI (Transformers, Diffusion Models), Graph Neural Networks, and predictive modeling. You will leverage your skills to tackle complex scientific challenges, working with vast and diverse chemical and biological datasets.

If you are passionate about applying state‑of‑the‑art AI to solve fundamental challenges in chemistry and are driven to see your work make a real‑world impact on discovering new medicines, we’d love to have you join our team.

Key Responsibilities

  • Design, develop, and optimize state‑of‑the‑art generative models (e.g., Transformers, GNNs, Diffusion Models) for robotic‑assisted synthetic route design.
  • Architect and implement scalable MLOps pipelines for preprocessing large‑scale chemical and biological datasets, model training, and rigorous evaluation.
  • Translate cutting‑edge research in AI/ML into practical solutions that address critical challenges such as property prediction (ADMET/QSAR), reaction prediction, and binding affinity prediction.
  • Collaborate closely with computational chemists, medicinal chemists, and software engineers to define project goals, interpret model outputs, and integrate AI‑driven insights into our discovery platform.
  • Design and execute robust experiments to evaluate model performance, focusing on chemical validity, novelty, synthesizability, and predictive accuracy against experimental data.
  • Clearly communicate complex technical concepts, model results, and strategic recommendations to both technical and non‑technical stakeholders.
  • Stay at the forefront of AI for drug discovery, foundation models for science, and multimodal learning, continuously identifying and championing opportunities to enhance our capabilities.

What You’ll Bring

  • MSc or PhD in Computer Science, Machine Learning, Computational Chemistry/Biology, or a closely related field, with 5+ years of industry or academic experience.
  • Demonstrated proficiency in Python and deep learning frameworks such as PyTorch or TensorFlow.
  • Deep theoretical and practical knowledge of modern machine learning architectures, including Transformers, Graph Neural Networks (GNNs), and generative models (VAEs, GANs, Diffusion Models) as applied to scientific problems.
  • Proven ability to lead complex AI/ML projects from concept to deployment in a scientific or drug‑discovery context.
  • Extensive experience working with large‑scale molecular datasets (e.g., SMILES, 3D conformations), biological data (e.g., protein sequences, assay data), and other scientific data formats.
  • Experience with efficient model training and fine‑tuning techniques, such as LoRA, quantization, distillation, and model pruning.
  • Strong background or hands‑on experience applying ML to problems involving protein structures, small‑molecule interactions, or related biological data.
  • Familiarity with scalable computing environments, GPU acceleration, and distributed training.
  • Excellent communication and interpersonal skills, a collaborative mindset, and the ability to work effectively within a cross‑disciplinary team.
  • Excellent problem‑solving skills and a proactive, can‑do attitude.
  • Eagerness to learn new scientific concepts, computational methods, and software engineering practices from experienced mentors.
  • Good understanding of version control with Git.

Beneficial Skills

  • Hands‑on experience with cheminformatics toolkits such as RDKit.
  • Experience with Retrieval‑Augmented Generation (RAG) systems, including vector databases (e.g., Redis, FAISS, Milvus, Pinecone) for querying large chemical or biological databases.
  • Experience with protein/DNA language models (e.g., ProtBERT, ESM, Evo) or protein structure prediction models (e.g., AlphaFold‑like approaches).
  • Experience with evaluation frameworks for reaction and synthetic route design, including human‑in‑the‑loop assessment and metrics for novelty, diversity, and feasibility of synthetic pathways.
  • Strong experience with relational and non‑relational databases (SQL/NoSQL), including data modeling and efficient querying for large‑scale AI workflows.
  • A portfolio of projects or open‑source contributions (e.g., a GitHub profile) that demonstrates your skills and passion for AI/ML development.