Principal Data Engineer

Atorus Research

London

Remote

GBP 60,000 - 90,000

Full time

Yesterday

Job summary

A leading company in the research field is seeking a Principal Data Engineer to support complex data engineering projects. The role focuses on designing data pipelines, implementing AWS technologies, and ensuring data quality across various R&D initiatives. The ideal candidate will hold a relevant degree and possess strong experience working with data in a healthcare context. This full-time position offers remote flexibility across the UK.

Qualifications

  • 3-5 years of experience in data engineering, focusing on healthcare or clinical-related data.
  • Familiarity with unstructured database technologies (e.g. NoSQL).
  • Experience in an Agile development environment.

Responsibilities

  • Design and maintain data pipelines for R&D data.
  • Develop data quality frameworks and validation processes.
  • Implement modern software development best practices.

Skills

Python
R
SQL
AWS S3
AWS Redshift
Data Modeling
ETL

Education

Bachelor's Degree in Computer Science
Master's Degree in relevant field

Tools

AWS Glue
Docker
Kubernetes

Job description

Principal Data Engineer
full-time
remote from anywhere in the UK

Description:
The Principal will be responsible for supporting complex projects, or leading individual projects, related to data engineering requirements and initiatives across Research and Development. The Principal will support data projects from across the business, including Clinical, Pre-Clinical, Non-Clinical, Chemistry, RWD and Omics.

Essential Functions:
• Support the design, development and maintenance of data pipelines for processing Research and Development data from diverse sources (Clinical Trials, Medical Devices, Pre-Clinical, Omics, Real World Data) utilizing the AWS technology platform.
• Create and optimize ETL/ELT processes for structured and unstructured data using Python, R, SQL, AWS services and other tools.
• Build and maintain data repositories using AWS S3 and FSx technologies. Establish data warehousing solutions using Amazon Redshift.
• Build and maintain standard data models.
• Develop data quality frameworks, validation processes and KPIs to ensure accuracy and consistency of data pipelines.
• Implement data versioning and lineage tracking to support data traceability, regulatory compliance and audit requirements.
• Create and maintain documentation for data processes, architectures, and workflows.
• Implement modern software development best practices (e.g. code versioning, DevOps, CI/CD).
• Maintain compliance with data privacy regulations such as HIPAA and GDPR.
• May be required to develop, deliver or support data literacy training across R&D.

Required Knowledge, Skills and Abilities:
• Strong knowledge of data engineering tools such as Python, R and SQL for data processing.
• Strong proficiency with AWS services, particularly S3, Redshift, FSx, Glue and Lambda.
• Strong proficiency with relational databases.
• Strong background in data modeling and database design.
• Familiarity with unstructured database technologies (e.g. NoSQL) and other database types (e.g. Graph).
• Familiarity with containerization technologies such as Docker and EKS/Kubernetes.
• Familiarity with one or more R&D research processes and their associated regulatory requirements.
• Exposure to healthcare data standards (CDISC, HL7, FHIR, SNOMED CT, OMOP, DICOM).
• Exposure to big data technologies and handling.
• Knowledge of machine learning operations (MLOps) and model deployment.
• Strong problem-solving and analytical abilities.
• Excellent communication and collaboration skills.
• Experience working in an Agile development environment.

Minimum Requirements:
• Bachelor's Degree in Computer Science, Statistics, Mathematics, Life Sciences, or another relevant scientific field; Master's Degree preferred.
• 3-5 years of experience in data engineering, with at least 1.5 years focusing on healthcare, research or clinical-related data.