Lead Data Software Engineer (Spark/Scala/Databricks)

EPAM Systems

Remote

EUR 55,000 - 80,000

Full-time

Posted yesterday

Vacancy description

A technology solutions company is seeking a Lead Data Software Engineer to migrate data pipelines from Oracle to Databricks. The role requires expertise in Spark and Scala to deliver curated data solutions in a dynamic environment. Candidates should have solid experience with cloud technologies, including AWS, Databricks, and CI/CD processes. The position offers remote work across Spain and employee benefits including private health insurance and professional certification opportunities.

Job description

Lead Data Software Engineer (Spark / Scala / Databricks)

Join EPAM Systems as a Lead Data Engineer to migrate data pipelines from Oracle to Databricks. As a forward-thinking Scala expert, you will deliver critical curated data solutions in a dynamic, agile, client-focused environment, based in Madrid or Málaga or working remotely from anywhere in Spain.

Responsibilities
  • Develop, monitor, and operate the client's most critical curated data pipeline: Sales Order Data, including post-order information such as shipments, returns, and payments.
  • Consult with analysts, data scientists, and product managers to build and continuously improve a “Single Source of Truth” KPI for business steering, such as the central Profit Contribution measurement (PC II).
  • Redevelop legacy pipelines into modern, standardized versions that are easy to maintain and scale for future demands (a minimal sketch of such a rebuild follows this list).
  • Leverage and improve a cloud-based tech stack that includes AWS, Databricks, Kubernetes, Spark, Airflow, Python, and Scala.
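To make the Oracle-to-Databricks rebuild concrete, here is a minimal sketch of how a legacy extract might be reworked as a Spark batch job in Scala that lands a curated Delta table. It is an illustration under assumptions, not the client's actual pipeline: the JDBC host, the environment-variable credentials, and the table names (SALES.ORDERS, curated.sales_orders) are hypothetical placeholders.

```scala
// Minimal sketch: rebuild a legacy Oracle extract as a Spark batch job
// that writes a curated Delta table on Databricks. All connection
// details and table names are hypothetical placeholders.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object SalesOrderBatch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("oracle-to-delta-sales-orders")
      .getOrCreate()

    // Read the legacy table over JDBC. Partitioning on a numeric key
    // parallelizes the extract instead of funnelling it through a
    // single connection.
    val orders = spark.read
      .format("jdbc")
      .option("url", "jdbc:oracle:thin:@//oracle-host:1521/ORCL") // placeholder host
      .option("dbtable", "SALES.ORDERS")                          // placeholder table
      .option("user", sys.env("ORACLE_USER"))                     // injected, not hard-coded
      .option("password", sys.env("ORACLE_PASSWORD"))
      .option("partitionColumn", "ORDER_ID")
      .option("lowerBound", "0")
      .option("upperBound", "100000000")
      .option("numPartitions", "16")
      .load()

    // Curate minimally and land the result as a Delta table that
    // downstream KPI jobs can query.
    orders
      .filter(col("ORDER_STATUS").isNotNull)
      .write
      .format("delta")
      .mode("overwrite")
      .saveAsTable("curated.sales_orders")

    spark.stop()
  }
}
```

In production the write would more plausibly be an incremental MERGE into the Delta table rather than a full overwrite, but the read-transform-write shape stays the same.
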
Requirements
  • Expertise in Apache Spark, including Spark Streaming (see the streaming sketch after this list).
  • Strong hands-on experience with Databricks and Delta Lake.
  • Fluency in the Scala programming language.
  • Expertise in SQL.
  • Good understanding of, and hands-on experience with, CI/CD.
  • Extensive working experience with GitHub.
  • Fluency with the AWS ecosystem.
  • Ability to build Apache Airflow pipelines.
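The list above pairs Spark streaming with Delta Lake; a minimal sketch of that combination, written against the newer Structured Streaming API, follows. The Kafka broker, topic, and checkpoint path are hypothetical placeholders (the posting does not name a source system), and the job assumes the spark-sql-kafka connector is on the classpath.

```scala
// Minimal sketch: stream order events from Kafka into a Delta table
// with Spark Structured Streaming. Broker, topic, and paths are
// hypothetical placeholders.
import org.apache.spark.sql.SparkSession

object OrderEventsStream {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("order-events-to-delta")
      .getOrCreate()

    // Read a stream of raw order events; Kafka is an assumed source.
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092") // placeholder broker
      .option("subscribe", "order-events")              // placeholder topic
      .load()
      .selectExpr(
        "CAST(key AS STRING)   AS order_id",
        "CAST(value AS STRING) AS payload",
        "timestamp")

    // Append into a Delta table; the checkpoint lets the query resume
    // where it left off after a restart.
    val query = events.writeStream
      .format("delta")
      .option("checkpointLocation", "/mnt/checkpoints/order-events") // placeholder path
      .outputMode("append")
      .toTable("curated.order_events")

    query.awaitTermination()
  }
}
```
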
Nice to Have
  • Presto
  • Superset
  • Starburst
  • Oracle & Exasol
We Offer
  • Private health insurance
  • EPAM Employees Stock Purchase Plan
  • 100% paid sick leave
  • Referral Program
  • Professional certification
  • Language courses