Data Engineer

JR United Kingdom

Reading

On-site

GBP 50,000 - 80,000

Full time

30+ days ago

Job summary

An established industry player is on the lookout for a proactive Data Engineer to enhance their data systems. This role involves developing and maintaining ETL pipelines using cutting-edge technologies like Databricks and Apache Spark. You will work closely with cross-functional teams to ensure data integrity and optimize workflows, all while adhering to Agile methodologies. If you have a passion for data and a knack for problem-solving, this is an exciting opportunity to make a significant impact in a dynamic environment.

Qualifications

  • Strong expertise in Databricks, Apache Spark, and Delta Lake.
  • Proficiency in Python, SQL, or Scala for data processing.
  • Experience with unit testing for data pipelines.

Responsibilities

  • Develop and maintain ETL pipelines in Databricks using Apache Spark.
  • Collaborate with teams to streamline data access and governance.
  • Ensure data integrity and security standards compliance.

Skills

Databricks
Apache Spark
Delta Lake
Python
SQL
Scala
REST APIs
Agile methodologies
Data privacy principles
Cloud platforms

Tools

GitLab
Azure Databricks
AWS
Google Cloud
Docker
Ansible

Job description

We are seeking a skilled and proactive Data Engineer to join our team and collaborate closely with the Solution Architect to review and enhance our current system. This role involves modifying existing code to implement new features, ensuring data privacy enhancements, and maintaining the reliability and performance of our data systems. The Data Engineer will actively contribute throughout the Agile development lifecycle, participating in planning, refinement, and review ceremonies.

Key Responsibilities:

  • Develop and maintain ETL pipelines in Databricks, leveraging Apache Spark and Delta Lake (a minimal sketch follows this list).
  • Design, implement, and optimize data transformations and treatments for structured and unstructured data.
  • Work with Hive Metastore and Unity Catalog for metadata management and access control.
  • Implement State Store mechanisms for maintaining stateful processing in Spark Structured Streaming (illustrated in the second sketch after this list).
  • Handle DataFrames efficiently for large-scale data processing and analytics.
  • Schedule, monitor, and troubleshoot Databricks pipelines for automated workflow execution.
  • Enable pause/resume functionality in pipelines based on responses from external API calls (see the third sketch after this list).
  • Ensure scalability, reliability, and performance optimization for distributed computing environments.
  • Collaborate with Data Scientists, Analysts, and DevOps teams to streamline data access and governance.
  • Maintain data integrity and security standards in compliance with enterprise data governance policies.
  • Review the existing system architecture and its functionalities in collaboration with the Solution Architect and Lead Data Engineer.
  • Modify and extend existing code to implement new features and improvements.
  • Perform thorough unit testing to verify system functionality and data accuracy.
  • Document all changes made, including technical impact assessments and rationales.
  • Work within GitLab repository structures and adhere to project-specific processes.
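
The first responsibility above is the core of the role. As a minimal, non-authoritative sketch (the paths, table names, and columns are hypothetical; on Databricks the `spark` session is already provided, so the builder line matters only for standalone runs), an extract-transform-load step in PySpark writing to Delta Lake might look like this:

```python
# Minimal Delta Lake ETL sketch (hypothetical paths and schema).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Extract: read raw events from cloud storage (e.g. an S3 bucket).
raw = spark.read.json("s3://example-bucket/raw/events/")  # hypothetical path

# Transform: basic cleansing and a derived partition column.
cleaned = (
    raw.dropDuplicates(["event_id"])                 # assumed key column
       .filter(F.col("event_ts").isNotNull())
       .withColumn("event_date", F.to_date("event_ts"))
)

# Load: append to a Delta table, partitioned by date.
(cleaned.write
        .format("delta")
        .mode("append")
        .partitionBy("event_date")
        .saveAsTable("analytics.events"))            # hypothetical table
```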
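Stateful processing in Structured Streaming leans on Spark's state store, which persists running aggregates between micro-batches via the checkpoint location. A minimal sketch, assuming the hypothetical events table above with `event_ts` and `user_id` columns:

```python
# Windowed streaming aggregation sketch: Spark keeps the running counts
# in its state store, checkpointed to the location given below.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("stateful-sketch").getOrCreate()

# Stream from a Delta table (hypothetical name); the cast ensures a
# proper timestamp column for watermarking.
events = (
    spark.readStream.table("analytics.events")
         .withColumn("event_ts", F.col("event_ts").cast("timestamp"))
)

# The watermark bounds how long per-window state is retained for late data.
counts = (
    events.withWatermark("event_ts", "10 minutes")
          .groupBy(F.window("event_ts", "5 minutes"), "user_id")
          .count()
)

# The checkpoint location is where the state store persists its snapshots.
query = (
    counts.writeStream
          .outputMode("update")
          .option("checkpointLocation", "s3://example-bucket/chk/user-counts/")  # hypothetical
          .toTable("analytics.user_counts_5m")  # hypothetical sink table
)
```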
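For the pause/resume item, one common pattern is a gate at the start of a pipeline task that polls an external control endpoint and blocks until it reports that processing may continue. The URL and response field below are assumptions, not a real service:

```python
# Pipeline gate sketch: poll a (hypothetical) control API and pause the
# task until it signals that processing may proceed.
import time
import requests

CONTROL_URL = "https://control.example.com/api/pipeline-status"  # hypothetical

def wait_until_resumed(poll_seconds: int = 60, max_attempts: int = 60) -> None:
    """Block while the external service reports the pipeline as paused."""
    for _ in range(max_attempts):
        resp = requests.get(CONTROL_URL, timeout=10)
        resp.raise_for_status()
        if resp.json().get("status") == "running":  # assumed response field
            return
        time.sleep(poll_seconds)
    raise TimeoutError("Pipeline did not resume within the allowed window")

wait_until_resumed()
# ... continue with the Spark job once the gate opens ...
```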

Required Skills and Experience:

  • Strong expertise in Databricks, Apache Spark, and Delta Lake.
  • Experience with Hive Metastore and Unity Catalog for data governance.
  • Proficiency in Python, SQL, Scala, or other relevant languages.
  • Familiarity with structured streaming, event-driven architectures, and stateful processing.
  • Ability to design, schedule, and optimize Databricks workflows.
  • Knowledge of REST APIs for integrating external services into pipeline execution.
  • Experience with cloud platforms like Azure Databricks, AWS, or Google Cloud.
  • Experience with Databricks Notebooks for development and testing.
  • Familiarity with AWS S3 for data storage and management.
  • Understanding of platform migrations, including dependency requirements, challenges, and deliverables.
  • Understanding of data privacy principles and ability to implement privacy-aware solutions.
  • Experience in unit testing for data pipelines or systems (a sketch follows this list).
  • Proficient in version control using GitLab.
  • Solid understanding of Agile methodologies and experience working in Scrum or Kanban environments.
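
On the unit-testing requirement, a common approach is to factor transformations into pure functions and exercise them with pytest against a local SparkSession. The function and columns below are hypothetical:

```python
# pytest sketch for a pipeline transformation (hypothetical function).
import pytest
from pyspark.sql import SparkSession, functions as F

def add_event_date(df):
    """The transformation under test: derive a date column from a timestamp."""
    return df.withColumn("event_date", F.to_date("event_ts"))

@pytest.fixture(scope="module")
def spark():
    session = SparkSession.builder.master("local[1]").appName("tests").getOrCreate()
    yield session
    session.stop()

def test_add_event_date(spark):
    df = spark.createDataFrame(
        [("e1", "2024-05-01 12:00:00")], ["event_id", "event_ts"]
    )
    out = add_event_date(df).collect()[0]
    assert str(out["event_date"]) == "2024-05-01"
```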

Preferred (Nice to Have):

  • Understanding of data warehousing, lakehouse architectures, and modern data platforms.
  • Strong analytical and problem-solving skills with a focus on automation and efficiency.
  • Knowledge of making and handling API calls.
  • Experience working with Docker containers and Ansible.

Soft Skills:

  • Strong analytical and problem-solving skills.
  • Clear communication and collaboration abilities.
  • Ability to work independently and with cross-functional teams.