Big Data & Cloud Data Engineer

Blackfluo.ai

Paris

On-site

EUR 60,000 - 80,000

Full-time

30+ days ago

Job Summary

A leading technology firm in France is seeking a Big Data & Cloud Data Engineer to design, implement, and manage large-scale data processing systems. You will leverage big data technologies and cloud platforms to enable advanced analytics and real-time data processing. The ideal candidate has over 5 years of experience in big data technologies and strong programming skills in multiple languages. This position offers a dynamic work environment and opportunities for growth.

Qualifications

  • 5+ years of experience with big data technologies.
  • Strong programming skills in Python, Scala, Java, and SQL.
  • Expert knowledge of at least one cloud platform.

Responsibilities

  • Design and implement big data ecosystems using Hadoop.
  • Develop real-time applications using Apache Spark.
  • Configure Apache Kafka for event streaming.

Knowledge

  • Big data technologies (Hadoop, Spark, Kafka)
  • Programming skills in Python, Scala, Java, SQL
  • Cloud platform expertise (Azure, AWS, GCP)
  • Containerization (Docker, Kubernetes)
  • Stream processing frameworks
  • Data modeling and optimization techniques
  • Data pipeline orchestration

Education

Bachelor's degree in Computer Science, Data Engineering, or related field

Tools

  • Terraform
  • CloudFormation

Job Description

Position Overview

We are seeking a Big Data & Cloud Data Engineer to design, implement, and manage large-scale data processing systems using big data technologies (Hadoop, Spark, Kafka) and cloud-based data ecosystems (Azure, GCP, AWS), enabling advanced analytics and real-time data processing capabilities across our enterprise.

Key Responsibilities
  • Design and implement Hadoop ecosystems including HDFS, YARN, and distributed computing frameworks
  • Develop real-time and batch processing applications using Apache Spark (Scala, Python, Java)
  • Configure Apache Kafka for event streaming, data ingestion, and real-time data pipelines (see the streaming sketch after this list)
  • Implement data processing workflows using Apache Airflow, Oozie, and workflow orchestration tools
  • Build NoSQL database solutions using HBase, Cassandra, and MongoDB for high-volume data storage
  • Design multi-cloud data architectures using Azure Data Factory, AWS Glue, and Google Cloud Dataflow
  • Implement data lakes and lakehouses using Azure Data Lake, AWS S3, and Google Cloud Storage
  • Configure cloud-native data warehouses including Snowflake, BigQuery, and Azure Synapse Analytics
  • Build serverless data processing solutions using AWS Lambda, Azure Functions, and Google Cloud Functions
  • Implement containerized data applications using Docker, Kubernetes, and cloud container services
  • Develop ETL/ELT pipelines for structured and unstructured data processing
  • Create real-time streaming analytics using Kafka Streams, Apache Storm, and cloud streaming services
  • Implement data quality frameworks, monitoring, and alerting for production data pipelines
  • Build automated data ingestion from various sources including APIs, databases, and file systems
  • Design data partitioning, compression, and optimization strategies for performance
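
As a rough illustration of the Spark and Kafka work described above, here is a minimal PySpark Structured Streaming sketch. It assumes the spark-sql-kafka connector is available on the Spark classpath; the broker address, topic name, event schema, and storage paths are illustrative placeholders, not details from this posting.

```python
# Sketch: consume JSON events from Kafka with Spark Structured Streaming
# and land them as date-partitioned Parquet. All names are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json, to_date
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("events-ingest").getOrCreate()

# Assumed shape of the event payload.
schema = StructType([
    StructField("user_id", StringType()),
    StructField("action", StringType()),
    StructField("ts", TimestampType()),
])

raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "events")                     # placeholder topic
    .load()
)

# Kafka delivers raw bytes; decode the value column, then parse the JSON.
events = (
    raw.select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
    .withColumn("event_date", to_date(col("ts")))
)

# Partitioning by date lets downstream queries prune whole directories,
# one simple instance of the optimization strategies listed above.
(
    events.writeStream.format("parquet")
    .option("path", "s3a://example-lake/events/")            # placeholder path
    .option("checkpointLocation", "s3a://example-lake/_chk/events/")
    .partitionBy("event_date")
    .outputMode("append")
    .start()
    .awaitTermination()
)
```
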
Platform Administration & Optimization
  • Manage cluster provisioning, scaling, and resource optimization across big data platforms
  • Monitor system performance, troubleshoot issues, and implement capacity planning strategies
  • Configure security frameworks including Kerberos, Ranger, and cloud IAM services
  • Implement backup, disaster recovery, and high availability solutions (a minimal example follows this list)
  • Optimize query performance and implement data governance policies
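
One narrow slice of the backup and recovery work above, sketched with boto3 against S3; the bucket name and the 30-day retention window are assumptions for illustration only.

```python
import boto3

s3 = boto3.client("s3")
bucket = "example-data-lake"  # placeholder bucket name

# Object versioning gives point-in-time recovery for files in the lake.
s3.put_bucket_versioning(
    Bucket=bucket,
    VersioningConfiguration={"Status": "Enabled"},
)

# Expire noncurrent versions after 30 days so recovery capacity
# does not grow without bound.
s3.put_bucket_lifecycle_configuration(
    Bucket=bucket,
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-old-versions",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},
                "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
            }
        ]
    },
)
```
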
Required Qualifications

Technical Skills

  • 5+ years of experience with big data technologies (Hadoop, Spark, Kafka, Hive, HBase)
  • Strong programming skills in Python, Scala, Java, and SQL for data processing
  • Expert knowledge of at least one major cloud platform (Azure, AWS, GCP) and data services
  • Experience with containerization (Docker, Kubernetes) and infrastructure as code (Terraform, CloudFormation)
  • Proficiency in stream processing frameworks and real-time analytics architectures
  • Knowledge of data modeling, schema design, and database optimization techniques
  • Experience with data pipeline orchestration and workflow management tools (see the orchestration sketch after this list)
  • Strong understanding of distributed systems, parallel processing, and scalability patterns
  • Knowledge of data formats (Parquet, Avro, ORC) and serialization frameworks
  • Experience with version control, CI/CD pipelines, and DevOps practices for data platforms
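
To make the orchestration requirement concrete, here is a minimal Apache Airflow sketch using Airflow 2.4+ syntax; the DAG id, schedule, and task bodies are placeholders, not part of this posting.

```python
# Sketch: a daily extract -> transform pipeline expressed as an Airflow DAG.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**context):
    # Placeholder: pull one day of records from a source system.
    print("extracting for", context["ds"])


def transform(**context):
    # Placeholder: clean and reshape the extracted records.
    print("transforming for", context["ds"])


with DAG(
    dag_id="daily_events",          # placeholder DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)

    # Ordering operator: transform runs only after extract succeeds.
    extract_task >> transform_task
```
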
Preferred Qualifications
  • Bachelor's degree in Computer Science, Data Engineering, or related field
  • Cloud certifications (Azure Data Engineer, AWS Data Analytics, Google Cloud Data Engineer)
  • Experience with machine learning platforms and MLOps frameworks
  • Background in data governance, data cataloging, and metadata management
  • Knowledge of emerging technologies (Delta Lake, Apache Iceberg, dbt)