Lead Data Engineer - Remote / Telecommute

Cynet Systems Inc

Minneapolis (MN)

Remote

USD 80,000 - 120,000

Full time

Job summary

A leading company in Minneapolis is seeking a Lead Data Engineer proficient in PySpark and SQL to develop and maintain ETL/ELT pipelines. The role involves working with large-scale datasets and collaborating with cross-functional teams to ensure data quality and optimize performance.

Qualifications

  • Strong experience in Python for data engineering tasks.
  • Proficiency in PySpark for large-scale data processing.
  • Experience with cloud data services and orchestration tools.

Responsibilities

  • Develop, optimize, and maintain ETL/ELT pipelines using PySpark and SQL.
  • Collaborate with Data Scientists and Analysts to integrate data workflows.
  • Ensure data quality, validation, and consistency in data pipelines.

Skills

Python
PySpark
SQL
Data Quality
Data Engineering

Tools

AWS Glue
Databricks
Azure Synapse
GCP BigQuery
Airflow
Apache Oozie
Snowflake
Redshift
Git

Job description

Responsibilities:
  • Develop, optimize, and maintain ETL/ELT pipelines using PySpark and SQL (see the sketch after this list).
  • Work with structured and unstructured data to build scalable data solutions.
  • Write efficient and scalable PySpark scripts for data transformation and processing.
  • Optimize SQL queries, stored procedures, and indexing strategies to enhance performance.
  • Design and implement data models, schemas, and partitioning strategies for large-scale datasets.
  • Collaborate with Data Scientists, Analysts, and other Engineers to integrate data workflows.
  • Ensure data quality, validation, and consistency in data pipelines.
  • Implement error handling, logging, and monitoring for data pipelines.
  • Work with cloud platforms (AWS, Azure, or GCP) for data processing and storage.
  • Optimize data pipelines for cost efficiency and performance.
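
For context, day-to-day work on these responsibilities often resembles the short PySpark sketch below. It is a minimal illustration under assumed inputs, not code from this role: the S3 paths, the orders dataset, and the order_id, amount, and order_ts columns are all hypothetical.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("orders_etl").getOrCreate()

    # Extract: read raw CSV files; the bucket and schema are placeholders.
    raw = (
        spark.read
        .option("header", True)
        .option("inferSchema", True)
        .csv("s3://example-bucket/raw/orders/")  # hypothetical source path
    )

    # Transform: deduplicate, apply a basic data-quality rule, and derive
    # a partition column.
    clean = (
        raw
        .dropDuplicates(["order_id"])                     # hypothetical key
        .filter(F.col("amount") > 0)
        .withColumn("order_date", F.to_date("order_ts"))
    )

    # Load: write partitioned Parquet so downstream queries can prune files.
    (
        clean.write
        .mode("overwrite")
        .partitionBy("order_date")
        .parquet("s3://example-bucket/curated/orders/")   # hypothetical target
    )

Partitioning the output by order_date is a common design choice for large order datasets, since date filters can then skip whole partitions at read time.
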
Technical Skills Required:
  • Strong experience in Python for data engineering tasks.
  • Proficiency in PySpark for large-scale data processing.
  • Deep understanding of SQL (joins, window functions, CTEs, query optimization; illustrated in the example after this list).
  • Experience in ETL/ELT development using Spark and SQL.
  • Experience with cloud data services (AWS Glue, Databricks, Azure Synapse, GCP BigQuery).
  • Familiarity with orchestration tools (Airflow, Apache Oozie).
  • Experience with data warehousing (Snowflake, Redshift, BigQuery).
  • Understanding of performance tuning in PySpark and SQL.
  • Familiarity with version control (Git) and CI/CD pipelines.
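
To make the SQL expectations concrete, the sketch below runs a CTE and a window function through PySpark's SQL interface against a toy in-memory table. The orders view and its customer_id, order_date, and amount columns are invented for the example.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("sql_skills_demo").getOrCreate()

    # Toy orders table registered as a temp view; rows are invented.
    spark.createDataFrame(
        [("a", "2024-01-01", 10.0),
         ("a", "2024-01-02", 25.0),
         ("b", "2024-01-01", 7.5)],
        ["customer_id", "order_date", "amount"],
    ).createOrReplaceTempView("orders")

    # A CTE feeding a window function: rank each customer's orders by amount.
    ranked = spark.sql("""
        WITH customer_orders AS (
            SELECT customer_id, order_date, amount
            FROM orders
        )
        SELECT customer_id,
               order_date,
               amount,
               RANK() OVER (PARTITION BY customer_id
                            ORDER BY amount DESC) AS amount_rank
        FROM customer_orders
    """)
    ranked.show()

RANK() leaves gaps after ties, DENSE_RANK() does not, and ROW_NUMBER() breaks ties arbitrarily; knowing these distinctions is part of the window-function fluency the list above asks for.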