Intermediate - Senior Data Engineer (Databricks)

Recooty Inc.

United States

On-site

USD 90,000 - 130,000

Full time

24 days ago

Job summary

A leading company is seeking a Data Engineer to design and optimize scalable data pipelines and workflows using Databricks. The role requires proficiency in Databricks, Spark, and Delta Lake, along with strong skills in Python and SQL. The engineer will ensure data reliability and performance while collaborating with stakeholders. This position offers an opportunity to work on cutting-edge data solutions and contribute to the company's data strategy.

Qualifications

  • Certified in Databricks with strong experience in Python and SQL.
  • Proficient in ETL/ELT development and real-time data processing.
  • Familiar with cloud platforms and data governance tools.

Responsibilities

  • Designing and optimizing scalable data pipelines on Databricks.
  • Building efficient ETL/ELT pipelines for various data types.
  • Collaborating with stakeholders to ensure data quality and performance.

Skills

Databricks
Spark
Delta Lake
Python
SQL
ETL/ELT development
Real-time data processing
Cloud platforms
Data governance

Tools

Unity Catalog
AWS
Azure
GCP

Job description

We are looking for a Data Engineer who is certified in Databricks (required) to join our team. In this role, you will design, develop, and optimize scalable data pipelines and workflows on Databricks. You will work closely with stakeholders to ensure data reliability, performance, and alignment with business requirements.





Scope of Work



Data Pipeline Development:



  • Building efficient ETL/ELT pipelines using Databricks and Delta Lake for structured, semi-structured, and unstructured data.
  • Transforming raw data into consumable datasets for analytics and machine learning.


Data Optimization:



  • Improving performance by implementing best practices like partitioning, caching, and Delta Lake optimizations.
  • Resolving bottlenecks and ensuring scalability.


Data Integration:



  • Integrating data from various sources such as APIs, databases, and cloud storage systems (e.g., AWS S3, Azure Data Lake).


Real-Time Streaming:



  • Designing and deploying real-time data streaming solutions using Databricks Structured Streaming.


Data Quality and Governance:



  • Implementing data validation, schema enforcement, and monitoring to ensure high-quality data delivery.
  • Using Unity Catalog to manage metadata, access permissions, and data lineage.


Collaboration and Documentation:



  • Collaborating with data analysts, data scientists, and other stakeholders to meet business needs.
  • Documenting pipelines, workflows, and technical solutions.




Deliverables



  • Fully functional and documented data pipelines.
  • Optimized and scalable data workflows on Databricks.
  • Real-time streaming solutions integrated with downstream systems.
  • Detailed documentation for implemented solutions and best practices.





Skills and Qualifications



  • Proficiency in Databricks (certification required), Spark, and Delta Lake.
  • Strong experience with Python, SQL, and ETL/ELT development.
  • Familiarity with real-time data processing and streaming.
  • Knowledge of cloud platforms (e.g., AWS, Azure, GCP).
  • Experience with data governance and tools like Unity Catalog.





Assumptions



  • Access to necessary datasets and cloud infrastructure will be provided.
  • Stakeholders will provide timely input and feedback.





Success Metrics



  • Data pipelines deliver accurate and consistent data.
  • Workflows meet performance benchmarks.
  • Real-time streaming solutions operate with minimal latency.
  • Stakeholders are satisfied with the quality and usability of the solutions.
