
Data Platform Engineer (Coding & ETL Tooling)

Blackfluo.ai

Paris

On-site

EUR 55,000-75,000

Full-time

Posted 7 days ago

Job Summary

A leading data solutions company is seeking a Data Platform Engineer in Paris. The candidate will manage data workflows and develop automation using Python and ETL tools such as Airflow and dbt. Required skills include strong Git proficiency, 6+ years of experience in data programming, and experience with CI/CD pipelines. The role offers a collaborative environment in which to apply that experience.

Qualifications

  • 6+ years of experience in Python, R, and Stata.
  • Strong proficiency in Git-based workflows.
  • Experience with IDEs like VSCode and Jupyter.
  • Track record of implementing ETL/ELT tools.
  • Knowledge of DevOps practices for workflows.

Responsibilities

  • Set up and maintain development environments.
  • Administer Git-based version control systems.
  • Design data transformation pipelines.
  • Develop CI/CD pipelines for analytics.
  • Build reproducible workflows for data science.

Skills

Data scripting and statistical programming in Python
Experience with Git-based workflows
Configuring IDEs (VSCode, RStudio, Jupyter)
Implementing open-source ETL/ELT tools
DevOps and CI/CD automation

Education

Bachelor's or Master's degree in Data Engineering, Computer Science, or Statistics

Tools

GitLab
GitHub Actions
Airflow
dbt
Docker

Job Description

Data Platform Engineer (Coding & ETL Tooling)
Position Overview
We are seeking a Data Platform Engineer with a strong background in modern coding environments and open-source ETL/ELT technologies. The successful candidate will support the development, orchestration, and automation of data workflows using tools like Python, R, GitLab Runners, Airflow, and dbt. This role also involves managing and optimizing collaborative development environments (GitHub, GitLab) and supporting IDE usage across data science and engineering teams.
Key Responsibilities
Coding Environment Management
  • Support the setup and maintenance of development environments using IDEs such as VSCode, RStudio, Cursor, and Jupyter
  • Enable best practices for collaborative coding in languages such as Python, R, and Stata
  • Ensure integration between IDEs, data platforms, and source control tools for streamlined workflows
  • Assist in optimizing development environments for reproducibility, package management, and dependency tracking (a drift-check sketch follows this list)
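To make the dependency-tracking point concrete, here is a minimal sketch of an environment drift check in Python. It assumes a lock file of pinned `name==version` lines; the file name `requirements.lock` and the check itself are illustrative, not part of this role's actual tooling.
```python
"""Check that the active environment matches a pinned lock file.

Illustrative sketch only: the lock-file path and pinning policy
are assumptions, not taken from the job description.
"""
from importlib import metadata
from pathlib import Path

def read_lock(path: str = "requirements.lock") -> dict[str, str]:
    """Parse 'name==version' lines into a {name: version} map."""
    pins = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "==" in line:
            name, version = line.split("==", 1)
            pins[name.lower()] = version
    return pins

def check_environment(pins: dict[str, str]) -> list[str]:
    """Return mismatches between pinned and installed packages."""
    problems = []
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: pinned {wanted}, not installed")
            continue
        if found != wanted:
            problems.append(f"{name}: pinned {wanted}, found {found}")
    return problems

if __name__ == "__main__":
    issues = check_environment(read_lock())
    if issues:
        raise SystemExit("environment drift:\n" + "\n".join(issues))
    print("environment matches lock file")
```
A check like this can run as a CI step so that every pipeline executes against a known environment.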
Source Control & CI/CD
  • Administer Git-based version control systems (GitHub, GitLab), including branching strategies, access control, and repo management
  • Develop and manage CI/CD pipelines using GitLab Runners and GitHub Actions for data pipelines and analytical code
  • Promote code quality through automated testing, linting, and review workflows (a minimal quality-gate sketch follows this list)
  • Support onboarding and upskilling of users in Git workflows and coding standards
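As a rough illustration of the quality-gate idea, the sketch below chains a formatting check and the test suite the way a CI stage would. The specific tools (black, pytest) are assumptions; the posting does not name a linter, and any equivalent installed in the runner image would slot in the same way.
```python
"""Minimal CI quality gate: formatting check, then tests.

Sketch only: black and pytest are assumed, not mandated by the role.
"""
import subprocess
import sys

CHECKS = [
    ["black", "--check", "."],          # fail if formatting drifts
    ["python", "-m", "pytest", "-q"],   # run the test suite
]

def main() -> int:
    for cmd in CHECKS:
        print("running:", " ".join(cmd))
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return result.returncode    # fail fast, like a CI stage
    return 0

if __name__ == "__main__":
    sys.exit(main())
```
The same script runs identically on a laptop and inside a GitLab Runner or GitHub Actions job, which keeps local and CI behavior aligned.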
ETL/ELT Tooling & Orchestration
  • Design and implement data transformation pipelines using open-source tools like Apache Airflow, dbt, and VTL (Validation and Transformation Language)
  • Maintain orchestration workflows and monitor execution of scheduled jobs
  • Optimize task dependencies, retries, and performance within Airflow DAGs and dbt models (see the DAG sketch after this list)
  • Integrate ETL tools with source systems, metadata layers, and data warehouses
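For orientation, here is a minimal Airflow 2-style DAG showing explicit task dependencies and retry settings. The dag_id, schedule, and task bodies are invented for illustration.
```python
"""Minimal Airflow DAG illustrating task dependencies and retries.

Sketch only: names and schedule are hypothetical.
"""
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull raw data from the source system")

def transform():
    print("clean and reshape the extracted data")

def load():
    print("write the result to the warehouse")

with DAG(
    dag_id="example_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # dependencies: extract -> transform -> load
    t_extract >> t_transform >> t_load
```
In practice a dbt run can be triggered as one of these tasks, so model failures inherit the same retry behavior.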
Automation & Reproducibility
  • Build reproducible workflows for data science, statistical analysis, and reporting using templated code bases and configuration-driven pipelines (a skeleton follows this list)
  • Develop modular, reusable components for data ingestion, cleaning, validation, and transformation
  • Create infrastructure-as-code templates for deploying ETL tools in cloud or on-prem environments
  • Support interoperability and standardization across analytics and data engineering teams
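One plausible shape for a configuration-driven pipeline is sketched below: each step is a small function registered by name, and the config (normally a YAML/JSON file under version control) selects and orders them. All names here are hypothetical.
```python
"""Configuration-driven pipeline skeleton with modular, reusable steps.

Sketch only: step names and config layout illustrate the pattern,
they are not a prescribed interface.
"""
from typing import Callable

def ingest(ctx: dict) -> dict:
    ctx["rows"] = [{"id": 1, "value": " 42 "}]  # stand-in for a real source
    return ctx

def clean(ctx: dict) -> dict:
    for row in ctx["rows"]:
        row["value"] = row["value"].strip()
    return ctx

def validate(ctx: dict) -> dict:
    assert all(row["value"].isdigit() for row in ctx["rows"]), "bad value"
    return ctx

STEPS: dict[str, Callable[[dict], dict]] = {
    "ingest": ingest,
    "clean": clean,
    "validate": validate,
}

# in practice this would come from a YAML/JSON file checked into git
CONFIG = {"pipeline": ["ingest", "clean", "validate"]}

def run(config: dict) -> dict:
    ctx: dict = {}
    for step_name in config["pipeline"]:
        ctx = STEPS[step_name](ctx)  # each step is independently testable
    return ctx

if __name__ == "__main__":
    print(run(CONFIG))
```
Because the step order lives in configuration rather than code, two teams can reuse the same components while running different pipelines.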
Required Qualifications
Technical Skills
  • 6+ years of experience with data scripting and statistical programming languages (Python, R, Stata)
  • Strong proficiency with Git-based workflows and tools (GitLab, GitHub, GitHub Actions)
  • Experience configuring and working within IDEs such as VSCode, RStudio, Jupyter, and/or Cursor
  • Proven track record implementing and managing open-source ETL/ELT tools (Airflow, dbt, GitLab Runners, VTL)
  • Familiarity with data orchestration, testing, and observability for pipelines
DevOps & Workflow Automation
  • Experience developing CI/CD pipelines for analytical and data engineering use cases
  • Knowledge of containerization (Docker) and task execution environments (Kubernetes, GitLab Runners)
  • Scripting expertise (Bash, Python, YAML) for configuration, automation, and job orchestration
  • Understanding of software engineering best practices (modular design, unit testing, reproducibility); a pytest-style example follows this list
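As a small illustration of the unit-testing expectation, the pytest-style sketch below tests a hypothetical `normalize_value` helper. Keeping transformation logic in pure functions like this is what makes pipeline code testable and reproducible.
```python
"""Unit tests for a small, pure transformation function (pytest style).

Sketch only: normalize_value is a hypothetical helper.
"""
def normalize_value(raw: str) -> float:
    """Strip whitespace and parse a decimal that may use a comma separator."""
    return float(raw.strip().replace(",", "."))

def test_normalize_value_handles_commas_and_whitespace():
    assert normalize_value(" 3,14 ") == 3.14

def test_normalize_value_plain_decimal():
    assert normalize_value("2.5") == 2.5
```
Running `pytest` discovers and executes both tests, and the same command can serve as a CI stage.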
Preferred Qualifications
  • Bachelor's or Master's degree in Data Engineering, Computer Science, Statistics, or a related field
  • Experience working in collaborative research or analytics teams with reproducible coding standards
  • Knowledge of data validation frameworks (e.g., Great Expectations, VTL), metadata integration, and lineage tracking (a hand-rolled validation sketch follows this list)
  • Familiarity with cloud-native infrastructure and deployment (AWS, GCP, Azure)
  • Contributions to or experience working with open-source ETL/analytics tooling
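Data validation frameworks formalize checks like the hand-rolled pandas sketch below; the column names and rules are invented, and a library such as Great Expectations would replace these helpers with declarative expectations.
```python
"""Hand-rolled data validation checks, in the spirit of validation
frameworks. Sketch only: columns and rules are hypothetical.
"""
import pandas as pd

def expect_no_nulls(df: pd.DataFrame, column: str) -> list[str]:
    """Flag null values in a column that must be fully populated."""
    n = int(df[column].isna().sum())
    return [f"{column}: {n} null values"] if n else []

def expect_unique(df: pd.DataFrame, column: str) -> list[str]:
    """Flag duplicates in a column that should be a unique key."""
    dupes = int(df[column].duplicated().sum())
    return [f"{column}: {dupes} duplicate values"] if dupes else []

if __name__ == "__main__":
    df = pd.DataFrame({"id": [1, 2, 2], "amount": [10.0, None, 5.0]})
    failures = expect_no_nulls(df, "amount") + expect_unique(df, "id")
    for f in failures:
        print("FAILED:", f)
```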
