
Graduate Data Engineer

SRG

England

Hybrid

GBP 30,000 - 45,000

Full time



Job summary

A prominent pharmaceutical company in Marlow is seeking a Graduate Data Engineer to build and maintain data pipelines for advanced reporting and analytics. You will collaborate with cross-functional teams, ensuring access to reliable data while learning from experienced colleagues. The ideal candidate will have a degree in a relevant field and familiarity with Python, PySpark, and analytics tools. This role offers a hybrid working model, requiring in-office attendance at least three days a week starting January 2026.

Qualifications

  • Degree in a relevant field or similar work experience.
  • Up to 2 years of experience building data pipelines.
  • Clear and reliable coding skills in Python/PySpark.

Responsibilities

  • Build and maintain data pipelines using PySpark and TypeScript.
  • Prepare and optimize data pipelines for machine learning support.
  • Gather and translate stakeholder requirements for data models.

Skills

Python
PySpark
Data visualization tools
Analytics tools
Cloud services
DevOps methodologies

Education

Degree in Computer Science, Engineering, Mathematics, or similar

Tools

Palantir Foundry
Kafka
PowerBI

Job description
  • Role Title: Graduate Data Engineer
  • Contract: 12 months
  • Location: Marlow (hybrid)

SRG are working with a leading pharmaceutical company based in Marlow. Our client develops and manufactures an impressive portfolio of aesthetics brands and products, and is committed to driving innovation and providing high-quality products and services.

Role Overview

As a Graduate Data Engineer, you will build and maintain scalable data pipelines in Palantir Foundry for advanced reporting and analytics while collaborating with cross-functional teams as part of the BTS Data & Analytics team. You will work closely with key stakeholders in Engineering, Product, GTM, and other groups to help build scalable data solutions that support key metrics, reporting, and insights. You will assist in ensuring teams have access to reliable, accurate data as our company grows. You will have the opportunity to support projects that enable self-serve insights, helping teams make data-driven decisions, while learning from experienced team members and developing your technical and business skills.

Key Responsibilities
  • Build and maintain data pipelines, leveraging PySpark and/or TypeScript within Foundry, to transform raw data into reliable, usable datasets. Familiarity with Palantir Foundry, PySpark, Kafka, TypeScript, and PowerBI is preferable.
  • Assist in preparing and optimizing data pipelines to support machine learning and AI model development, ensuring datasets are clean, well-structured, and readily usable by Data Science teams.
  • Support the integration and management of feature engineering processes and model outputs into Foundry's data ecosystem, helping enable scalable deployment and monitoring of AI/ML solutions as you develop your skills in this area.
  • Gather and translate stakeholder requirements for key data models and reporting, with a focus on Palantir Foundry workflows and tools.
  • Participate in developing and refining dashboards and reports in Foundry to visualize key metrics and insights as you grow your data visualization skills.
  • Collaborate with Product, Engineering, and GTM teams to align data architecture and solutions, learning to support scalable, self-serve analytics across the organization.
  • Apply prompt engineering techniques with large language models, including writing and evaluating complex multi-step prompts.
  • Continuously develop your understanding of the company's data landscape, including Palantir Foundry's ontology-driven approach and best practices for data management.

About you
  • You have a degree in Computer Science, Engineering, Mathematics, or similar, or have similar work experience.
  • Having up to 2 years of experience building data pipelines at work or through internships is helpful.
  • You can write clear and reliable Python/PySpark code.
  • You are familiar with popular analytics tools (like pandas, numpy, matplotlib), big data frameworks (like Spark), and cloud services (like Palantir, AWS, Azure, or Google Cloud).
  • You have a deep understanding of data models, relational and non-relational databases, and how they are used to organize, store, and retrieve data efficiently for analytics and machine learning.
  • Knowing about software engineering methods, including DevOps, DataOps, or MLOps, is also a plus.

You will be considered a strong fit if you have
  • Master's degree in engineering (such as AI/ML, Data Systems, Computer Science, Mathematics, Biotechnology, Physics), or minimum 2 years of relevant technology experience.
  • Experience with Generative AI (GenAI) and agentic systems (a strong plus).
  • A proactive and adaptable mindset: willing to take initiative, learn new skills, and contribute to different aspects of a project as needed to drive solutions from start to finish, even beyond the formal job description.
  • A strong ability to thrive in ambiguity, taking initiative to create clarity for yourself and the team and proactively driving progress even when details are uncertain or evolving.

Other details
  • Hybrid working policy: Currently, our client expects all staff to be in their Marlow-based office at least 3 days a week from Jan 2026.
  • No visa sponsorship. ILR/Citizenship required.

Guidant, Carbon60, Lorien & SRG - The Impellam Group Portfolio are acting as an Employment Business in relation to this vacancy.
