Mid-level/Senior Data Engineer – AWS Glue & Spark (PySpark) Specialist

IBM Computing

Brasília

On-site

USD 70.000 - 90.000

Full-time

Posted 2 days ago

Job summary

A global technology consulting firm is seeking a Data Engineer specializing in the AWS ecosystem in Brasília, Brazil. The role involves designing, building, and maintaining scalable data pipelines and ensuring data quality and governance. Candidates should have strong experience with AWS Glue, Spark, Python, and Apache Airflow. This position offers a collaborative environment with opportunities for career growth and development.

Qualifications

  • Proven experience with AWS Glue, Spark, PySpark, and Python.
  • Knowledge of Data Mesh and distributed data architecture.
  • Experience with batch and streaming data pipelines (Kinesis, Kafka).
  • Analytical and detail-oriented problem-solving skills.

Responsibilities

  • Design and maintain ETL pipelines using AWS Glue and EMR.
  • Build real-time data pipelines integrating multiple sources.
  • Migrate data using AWS DMS and manage Glue Catalog.
  • Develop dashboards using AWS QuickSight.

Skills

AWS Glue
Spark (PySpark)
Python
Data Mesh principles
SQL
Apache Airflow
Kinesis
DynamoDB

Tools

AWS QuickSight
Redshift
Aurora
Glue Catalog
Glue DataBrew

Job description

Introduction

A career in IBM Consulting is rooted in long-term relationships and close collaboration with clients across the globe.

You'll work with visionaries across multiple industries to improve the hybrid cloud and AI journey for the most innovative and valuable companies in the world. Your ability to accelerate impact and make meaningful change for your clients is enabled by our strategic partner ecosystem and our robust technology platforms across the IBM portfolio, including Software and Red Hat.

Curiosity and a constant quest for knowledge serve as the foundation for success in IBM Consulting. In your role, you'll be encouraged to challenge the norm, investigate ideas outside of your role, and come up with creative solutions that have groundbreaking impact for a wide network of clients. Our culture of evolution and empathy centers on long-term career growth and development opportunities in an environment that embraces your unique skills and experience.

Your role and responsibilities

The Data Engineer specializing in the AWS ecosystem advises on, develops and maintains data engineering solutions in the AWS cloud. They are responsible for designing, building and operating batch and real-time data pipelines using services such as AWS Glue, AWS EMR, Glue Catalog and Kinesis.

They also create data layers in Redshift, Aurora and DynamoDB, and perform migrations with AWS DMS. Mastery of the main components of the AWS Data Platform is essential, including S3, Redshift Spectrum, AWS Glue with Spark and Python, Lambda Functions, Glue Catalog and Glue DataBrew. Experience with data pipelines for Data Warehouse and Data Lake, using Kinesis, Managed Streaming for Apache Kafka, Apache Airflow and dbt, as well as Spark/Python or Spark/Scala on AWS, is highly valued.

The Engineer schedules and manages data services on AWS, ensuring flawless integration and operation of data engineering solutions.

Main Responsibilities:

  • Design, develop and maintain scalable ETL pipelines using AWS Glue, EMR, Spark (PySpark) and Python.

  • Build and operate batch and real-time data pipelines, integrating multiple sources and destinations (S3, Redshift, Aurora, DynamoDB).

  • Implement and optimize workflows and dataframes for high performance and reliability.

  • Migrate data using AWS DMS and manage data catalogs with Glue Catalog.

  • Develop dashboards and visualizations with AWS QuickSight.

  • Apply Data Mesh principles for distributed data architecture and governance.

  • Use open-source tools such as Apache Airflow, dbt and Spark/Scala.

  • Ensure data quality, security and governance throughout the data lifecycle.

  • Collaborate with multidisciplinary teams to deliver solutions that support advanced analytics and business intelligence.

Required technical and professional expertise

  • Proven and practical experience with AWS Glue, Spark (conceptual), PySpark and Python.

  • Knowledge of Data Mesh and distributed data architecture.

  • Experience with AWS QuickSight, S3, Redshift, Aurora, DynamoDB, Glue Catalog, Glue DataBrew and Lambda Functions.

  • Experience with batch and streaming data pipelines (Kinesis, Kafka).

  • Mastery of SQL and dataframe manipulation.

  • Experience with tools such as Apache Airflow and dbt.

  • Ability to work in an agile, collaborative and multicultural environment.

  • Analytical, detail-oriented profile with strong problem-solving skills.

IBM is committed to creating a diverse environment and is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, caste, genetics, pregnancy, disability, neurodivergence, age, veteran status, or other characteristics. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.
