Overview
We are looking for a data engineer experienced in DevOps-based pipeline delivery, who can not only develop pipelines but also establish the foundational framework for reusable data ingestion processes. The ideal candidate is a proactive self-starter with a strong can-do attitude.
While not essential, experience with Health Data systems would be highly advantageous.
Responsibilities
- Ingestion Framework Delivery: Responsible for building reusable, metadata-driven data pipelines within a framework that handles batch and near-real-time data feeds (an illustrative sketch follows this list).
- Data Pipeline Development: Develop end-to-end data pipelines, including data load patterns, error handling, automation, and hardware optimisation.
- Requirements Formulation: Collaborate with Business Analysts, Architects, SMEs, and business teams to define requirements and implement solutions using modern EDW cloud tools and best practices.
- Detailed Solution Design: Work with architects and analysts to create detailed solution designs for data pipelines, ensuring adherence to policies, rules, and standards.
- DevOps Practices: Promote DevOps best practices for iterative solution delivery, including CI/CD, version control, monitoring and alerting, automated testing, and IaC.
- Data Modelling and Warehousing: Build and optimise pipelines to populate data stores such as DWH, Lakehouse, and other repositories, following industry and clinical standards like openEHR, FHIR and OMOP.
- Data Quality & Governance: Apply data quality checks, validation rules, and governance policies to maintain the accuracy, completeness, and reliability of clinical data, and address any data discrepancies.
- Data Integration: Integrate various clinical datasets, ensuring proper mapping, standardisation, and harmonisation across systems and terminologies (e.g., SNOMED CT, LOINC, ICD-10).
- Performance Optimisation: Monitor and enhance the performance of data pipelines, warehouses, and queries for efficient data processing.
- Operational Controls: Apply operational procedures, security practices, and production policies to ensure high-quality service delivery.
- Collaboration: Work with clinical stakeholders, data scientists, analysts, and other professionals to define data requirements and deliver technical solutions. Lead showcase sessions after each delivery.
- Documentation: Maintain comprehensive technical documentation for data architectures, pipelines, models, metadata, and processes.
- Troubleshooting & Support: Provide technical support and resolve issues related to data pipelines and data quality.
- Innovation & Best Practices: Stay updated on new data engineering technologies and best practices, especially in healthcare, and recommend adoption as needed.
- Proofs of Concept: Lead proofs of concept and pilots, and develop data pipelines using agile, iterative methods.
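To illustrate the kind of metadata-driven ingestion work described in this list, the following is a minimal PySpark sketch rather than a prescribed implementation: the feed configuration fields, paths, table names, and the `ingest_feed` helper are hypothetical, and Delta Lake is assumed to be available on the target platform.

```python
# Minimal sketch of a metadata-driven ingestion step (illustrative only).
# All feed names, paths, and table names below are hypothetical placeholders.
import json

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("metadata_driven_ingest").getOrCreate()

# Behaviour is driven by per-feed metadata rather than hard-coded logic;
# in practice this record would come from a control table, not inline JSON.
feed_config = json.loads("""
{
  "feed_name": "patient_admissions",
  "source_path": "/landing/admissions/",
  "source_format": "csv",
  "target_table": "bronze.patient_admissions",
  "load_type": "append",
  "not_null_columns": ["patient_id", "admission_date"]
}
""")


def ingest_feed(cfg: dict) -> None:
    """Load one feed as described by its metadata record."""
    df = (
        spark.read.format(cfg["source_format"])
        .option("header", "true")
        .load(cfg["source_path"])
    )

    # Basic data-quality gate: drop rows missing mandatory columns.
    for column in cfg["not_null_columns"]:
        df = df.filter(df[column].isNotNull())

    # Delta Lake is assumed to be available on the target platform.
    df.write.format("delta").mode(cfg["load_type"]).saveAsTable(cfg["target_table"])


ingest_feed(feed_config)
```

In a production framework the feed metadata would typically live in a control table or metadata repository, so new feeds can be onboarded by adding configuration rather than code.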
Qualifications
- Certifications such as DP-203 and AZ-900, or similar certification/experience.
Essential skills
- Experience working with healthcare data to build a healthcare data store would be a significant plus; this should include standards and interoperability protocols (e.g., openEHR, FHIR, HL7, DICOM, CDA).
- Experience converting healthcare data to develop an ODS or DWH.
- Experience in integrating data analytical outcomes and key information into clinical workflows.
Desired skills
- Knowledge of streaming data architectures and technologies (e.g., Kafka, Azure Event Hubs, Kinesis).
- Knowledge of handling genomic datasets (FASTQ, VCF, etc.) and document formats (IHE).
- General experience working with generative AI, including LLM-generated clinical data/summaries.
Experience
- Extensive background as a data engineer, specialising in data warehouse environments and building various types of data pipelines.
- Demonstrated ability to design and implement data integration and conversion pipelines using ETL/ELT tools, accelerators, and frameworks such as Azure Data Factory, Azure Synapse, Snowflake (Cloud), SSIS, or custom scripts.
- Skilled in developing reusable ETL frameworks for data processing.
- Proficient in at least one programming language commonly used for data manipulation and scripting, including Python, PySpark, Java, or Scala.
- Strong understanding and hands-on experience with DevOps practices and tools, especially Azure DevOps for CI/CD, Git for version control, and Infrastructure as Code.
- Advanced SQL skills and experience working with relational databases like PostgreSQL, SQL Server, Oracle, and MySQL.
- Experience implementing solutions on cloud-based data platforms such as Azure, Snowflake, Google Cloud, and related accelerators.
- Experience with developing and deploying containerised microservices architectures.
- Understanding of data modelling techniques, including star schema, snowflake schema, and third normal form (3NF); a brief illustrative sketch follows this list.
- Proven track record with DevOps methodologies in data engineering, including CI/CD and Infrastructure as Code.
- Knowledge of data governance, data quality, and data security in regulated environments.
- Experience mapping data from unstructured, semi-structured, and proprietary structured formats within clinical data stores.
- Strong interpersonal, communication, and documentation abilities, enabling effective collaboration between clinical and technical teams.
- Experience working in Agile development settings.
- Outstanding analytical, problem-solving, and communication skills.
- Ability to work independently as well as collaboratively within a team.
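As a small illustration of the dimensional modelling experience listed above, here is a minimal PySpark sketch of a star-schema fact-table load; all table and column names (`silver.encounters`, `gold.dim_patient`, `patient_sk`, and so on) are hypothetical placeholders, and Delta Lake is again assumed.

```python
# Minimal sketch of a star-schema fact-table load (illustrative only).
# All table and column names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("star_schema_load").getOrCreate()

encounters = spark.table("silver.encounters")    # cleansed source records
dim_patient = spark.table("gold.dim_patient")    # conformed patient dimension

# Resolve the natural key to the dimension's surrogate key, then keep only
# foreign keys and measures for the fact table.
fact_encounter = (
    encounters.join(dim_patient, on="patient_id", how="left")
    .select(
        "patient_sk",                                  # surrogate key from dim_patient
        "encounter_id",
        F.col("length_of_stay_days").cast("int"),
        F.col("total_cost").cast("decimal(10,2)"),
    )
)

# Delta Lake is assumed to be available on the target platform.
fact_encounter.write.format("delta").mode("overwrite").saveAsTable("gold.fact_encounter")
```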
Benefits
- Collaborative working environment – we stand shoulder to shoulder with our clients and our peers through good times and challenges
- We empower all passionate technology-loving professionals by allowing them to expand their skills and take part in inspiring projects
- ExpleoAcademy - enables you to acquire and develop the right skills by delivering a suite of accredited training courses
- Competitive company benefits
- Always working as one team, our people are not afraid to think big and challenge the status quo
- As a Disability Confident Committed Employer we have committed to:
- Ensuring our recruitment process is inclusive and accessible
- Communicating and promoting vacancies
- Offering an interview to disabled people who meet the minimum criteria for the job
- Anticipating and providing reasonable adjustments as required
- Supporting any existing employee who acquires a disability or long-term health condition, enabling them to stay in work
- Delivering at least one activity that will make a difference for disabled people
“We are an equal opportunities employer and welcome applications from all suitably qualified persons regardless of their race, sex, disability, religion/belief, sexual orientation or age”.
We treat everyone fairly and equitably across the organisation, including providing any additional support and adjustments needed for everyone to thrive
#LI-BM1
#LI-DS1