Lead Data Engineer

WorkHQ

Los Angeles (CA)

Remote

USD 140,000 - 180,000

Full time

2 days ago

Job summary

Join a well-funded startup aiming to revolutionize recruiting through technology as a Lead Data Engineer. You'll manage a growing data infrastructure and lead initiatives using cutting-edge technologies, all in a remote working environment. The ideal candidate has 5-8 years of experience, strong technical skills in data handling, and a passion for leadership.

Qualifications

  • 5-8 years of professional data engineering experience required.
  • Proficiency in PySpark and AWS services is a must.
  • Strong background in big data processing architectures is necessary.

Responsibilities

  • Design data pipelines that process massive volumes of records.
  • Architect ETL processes using PySpark on Amazon EMR.
  • Integrate new data sources into the main pipeline.

Skills

PySpark
AWS data services
Docker
Pandas
SQL

Tools

EMR
Postgres
OpenSearch
Databricks
Snowflake
Splink

Job description

This range is provided by WorkHQ. Your actual pay will be based on your skills and experience — talk with your recruiter to learn more.

Base pay range

$140,000.00/yr - $180,000.00/yr

WorkHQ is an all-in-one recruiting platform that provides:

  1. A database of 100M US professionals
  2. Email and phone number lookup
  3. Email outreach and sequencing
  4. An applicant tracking system

Recruiting well can set a company up for long-term success, while poor recruiting can set a company up for failure. We are on a bold mission to replace the current jumble of multiple expensive and confusing systems with a single platform at an affordable price.

Company Context

Series A, well-funded US startup in HRTech developing WorkHQ.com and an AI Recruiter product.

This is a US-only remote role (mainland).

Role Overview

Serve as the lead data infrastructure architect, managing billions of data points across 250M+ professional profiles.

Hire data engineers to support you in that work.

Core Responsibilities

  • Design scalable data pipelines processing massive record volumes
  • Architect ETL processes using PySpark on Amazon EMR (we are open to shifting to other solutions such as Databricks or Snowflake)
  • Distribute enriched data through a medallion architecture across Postgres, Athena, and OpenSearch
  • Integrate new data sources into the main pipeline
  • Implement advanced data matching using Splink

Technical Requirements

  • 5-8 years professional data engineering experience
  • Good proficiency in:
    • PySpark and distributed computing
    • AWS data services (EMR, Glue, Athena)
    • Docker
    • Pandas and DataFrame manipulation
    • Complex data format handling (JSONL, Parquet)
  • Strong background in:
    • Big data processing architectures
    • Data warehouse design
    • Performance optimization
  • Advanced Python and SQL skills

Nice to Have

  • Probabilistic record linking expertise
  • OpenSearch/Elasticsearch technologies
  • Machine learning data pipeline design
  • Recruitment tech ecosystem knowledge

Technical Stack

  • Big Data: PySpark, EMR
  • Databases: Postgres, OpenSearch
  • Cloud: AWS
  • Containerization: Docker
  • Data Formats: JSONL, Parquet
  • Analytics: Metabase, Athena, Glue
  • Data Processing: Pandas, Splink

Other Considerations

While this role has specific requirements, if you lack a few technical skills but are motivated to learn and lead the platform, please apply for consideration.

If you are coming from a Director, Head of, or VP level role that is relevant to this job, you can apply as well.

You will need to apply directly on our platform.

Thank you for your time.

The role requires 5-8 years of professional data engineering experience with proficiency in PySpark, AWS data services, Docker, and data manipulation using Pandas. Candidates should have a strong background in big data processing architectures, data warehouse design, and performance optimization, along with advanced skills in Python and SQL.

Join a well-funded US startup in HRTech with the opportunity to lead data infrastructure projects remotely. Work with cutting-edge technologies and a talented team in a dynamic environment.

  • Seniority level: Mid-Senior level
  • Employment type: Full-time
  • Job function: Other
  • Industries: IT Services and IT Consulting
