3+ years of hands-on experience as a Data Engineer working with Databricks and Apache Spark
Strong programming skills in Python, with experience in data processing frameworks (e.g., PySpark, Spark SQL)
Experience with core components of the Databricks ecosystem: Databricks Workflows, Unity Catalog, and Delta Live Tables
Solid understanding of data warehousing principles, ETL/ELT processes, data modeling techniques, and database systems
Proven experience with at least one major cloud platform (Azure, AWS, or GCP)
Excellent SQL skills for data querying, transformation, and analysis
Excellent communication and collaboration skills in English and German (minimum B2 level in both)
Ability to work independently as well as part of a team in an agile environment
Responsibilities:
Designing, developing, and maintaining robust data pipelines using Databricks, Spark, and Python
Building efficient and scalable ETL processes to ingest, transform, and load data from various sources (databases, APIs, streaming platforms) into cloud-based data lakes and warehouses (a minimal illustrative sketch follows this list)
Leveraging the Databricks ecosystem (SQL, Delta Lake, Workflows, Unity Catalog) to deliver reliable and performant data workflows
Integrating with cloud services such as Azure, AWS, or GCP to enable secure, cost-effective data solutions
Contributing to data modeling and architecture decisions to ensure consistency, accessibility, and long-term maintainability of the data landscape
Ensuring data quality through validation processes and adherence to data governance policies
Collaborating with data scientists and analysts to understand data needs and deliver actionable solutions
Staying up to date with advancements in Databricks, data engineering, and cloud technologies to continuously improve tools and approaches
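The sketch below illustrates the kind of ingest-transform-load work described in the responsibilities, using PySpark and Delta Lake on Databricks. The storage path, schema, and Unity Catalog table name (main.sales.orders_bronze) are hypothetical placeholders chosen for the example, not details of the role.

    # Minimal illustrative sketch of a Databricks ingest-transform-load step.
    # All paths and table names below are hypothetical placeholders.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("orders_daily_load").getOrCreate()

    # Ingest: read raw order events landed as JSON in an assumed cloud storage location
    raw = spark.read.json("abfss://landing@examplestorage.dfs.core.windows.net/orders/")

    # Transform: basic cleansing, typing, and de-duplication
    orders = (
        raw.filter(F.col("order_id").isNotNull())
           .withColumn("order_ts", F.to_timestamp("order_ts"))
           .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
           .dropDuplicates(["order_id"])
    )

    # Load: append to a Delta table registered in Unity Catalog (catalog/schema names are assumptions)
    (orders.write
           .format("delta")
           .mode("append")
           .saveAsTable("main.sales.orders_bronze"))

In practice such a step would typically be scheduled and monitored via Databricks Workflows or expressed as a Delta Live Tables pipeline, as referenced in the requirements above.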