
Data Engineer

BridgePoint Associates

Princeton (NJ)

Remote

USD 90,000 - 120,000

Full time

Yesterday

Job summary

The Department of Sociology at Princeton University is seeking a full-time Data Engineer for the Eviction Lab. This role involves developing data pipelines, optimizing code, and working with large datasets to aid research on the eviction crisis. The ideal candidate will be self-motivated and possess strong skills in Python and data processing. The position is remote and offers competitive salary and benefits.

Qualifications

  • 3+ years of relevant experience required.
  • Extensive experience writing data pipelines in Python, specifically Pandas and GeoPandas.
  • Familiarity with mapping and geographic data processing.

Responsibilities

  • Lead the development of a data construction pipeline for processing large-scale administrative records.
  • Improve existing code base and optimize speed and quality.
  • Develop new data features and products.

Skills

Python
Data Processing
Geographic Data
Git
Regular Expressions

Education

Bachelor's degree or equivalent

Tools

SQL
R
ArcGIS

Job description

Data Engineer

Eviction Lab at Princeton University

Remote, one-year term position with possibility of renewal

Overview

The Department of Sociology at Princeton University seeks applicants for a full-time Data Engineer position in the Eviction Lab. Successful candidates will have a background in data science and/or computer science. The Data Engineer will contribute to the Eviction Lab's mission to create data and research products that help researchers, policymakers, and community members understand the eviction crisis.

The salary is competitive, and the position is benefits-eligible. Applicants should submit a dossier including: (1) a complete vita, (2) a cover letter of interest, (3) names and contact information of up to three persons who can serve as references, and (4) a coding sample or data product that speaks to the applicant's experience with relevant tasks. All materials should be submitted as one continuous PDF. Applications will be considered on a rolling basis. The start date is flexible. Materials submitted by regular mail or email will not be accepted.

The responsibilities of the position are to lead the development of a data construction pipeline for processing large-scale administrative records. This would involve writing code to create new data products (e.g., geocoding addresses, cleaning names, combining multiple sources of data) in a reproducible way; writing tests to assess the quality of the data products created by the pipeline; writing tests to assess the speed of the pipeline; optimizing the code to improve quality and speed; cleaning and reformatting incoming datasets to conform to the pipeline; running the pipeline using these datasets; and identifying and fixing bugs, among other tasks. The datasets used are very large and require the use of remote computing clusters. Applicants with experience using very large datasets and optimizing code to run efficiently are preferred.
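
For applicants wondering what such a task might look like, the following is a minimal, purely illustrative sketch of name cleaning with regular expressions plus a simple data-quality check. It is not the Eviction Lab's code: the function names `clean_name` and `share_nonempty`, the suffix list, and the normalization rules are all assumptions made for this example, and it uses only the Python standard library rather than Pandas.

```python
import re

# Hypothetical list of name suffixes to strip; a real pipeline would define its own.
_SUFFIXES = re.compile(r"\b(?:JR|SR|II|III|IV)\b\.?", re.IGNORECASE)


def clean_name(raw: str) -> str:
    """Normalize a name: drop suffixes and stray punctuation, collapse spaces, uppercase."""
    name = _SUFFIXES.sub(" ", raw)
    # Keep letters, whitespace, apostrophes, and hyphens; replace everything else.
    name = re.sub(r"[^A-Za-z\s'-]", " ", name)
    name = re.sub(r"\s+", " ", name).strip()
    return name.upper()


def share_nonempty(names):
    """Data-quality check: fraction of records whose cleaned name is non-empty."""
    cleaned = [clean_name(n) for n in names]
    if not cleaned:
        return 0.0
    return sum(1 for n in cleaned if n) / len(cleaned)


if __name__ == "__main__":
    print(clean_name("  John  Q. Smith, Jr. "))
    print(share_nonempty(["Ann Lee", "###", ""]))
```

A check like `share_nonempty` could be run after each pipeline step so that a sudden drop in usable records is noticed early. Which metrics and thresholds matter in practice would depend on the datasets involved.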

This is a one-year term position with the possibility of renewal. You would work directly with a project lead, but much of the work would be carried out independently. The ideal candidate is someone who is self-motivated and can identify the larger goals of the project and propose relevant, useful tasks in a self-directed way.

Responsibilities

Job duties include:

  • Improving the existing code base: reviewing the code base; designing tests to assess data quality; designing tests to assess speed and identify bottlenecks; rewriting code to optimize speed and quality and remove extraneous operations.
  • Developing a data pipeline for new datasets: preprocessing data to conform to uniform data standards; identifying missing data and making appropriate imputations; running standardized data through data construction pipeline; identifying and fixing bugs; assessing resulting data products for accuracy and completeness.
  • Leading the development of new data features and products: constructing new measures and assessing them for accuracy; incorporating new types of data and making measures based on them.

Qualifications

Essential Qualifications:

  • Bachelor's degree or equivalent
  • 3+ years of relevant experience
  • Extensive experience writing data pipelines in Python, specifically Pandas and GeoPandas
  • Extensive experience working with large datasets
  • Familiarity with mapping and geographic data processing
  • Familiarity with Git
  • Demonstrated ability to work independently
  • Knowledge of regular expressions (regex)

Preferred Qualifications:

  • Database management tools (e.g., SQL)
  • R
  • ArcGIS or other GIS software
  • Experience using administrative data

Application Link: https://main-princeton.icims.com/jobs/20792/data-engineer/job?hub=15&_gl=1*cqw49x*_ga*NDI5NDE5NTg5LjE3NDU1OTM3MDU.*_ga_5Y2BYGL910*czE3NDY3NjI5MDgkbzIkZzEkdDE3NDY3NjMwMzgkajI4JGwwJGgw&mobile=false&width=1095&height=500&bga=true&needsRedirect=false&jan1offset=-300&jun1offset=-240