Eviction Lab at Princeton University

Data Engineer

Posted 4 Days Ago
Remote
Hiring Remotely in NJ
Mid level
The Data Engineer will develop data pipelines, optimize code, and create new data products to support research on the eviction crisis. They will work independently while also collaborating with a project lead, focusing on large datasets and code efficiency.
The summary above was generated by AI

Overview

The Department of Sociology at Princeton University seeks applicants for a full-time Data Engineer position in the Eviction Lab. Successful candidates will have a background in data science and/or computer science. The data engineer will contribute to the Eviction Lab at Princeton University’s mission to create data and research products to help researchers, policymakers, and community members understand the eviction crisis.

The salary is competitive, and the position is benefits-eligible. Applicants should submit a dossier including: (1) a complete vita, (2) a cover letter of interest, (3) names and contact information of up to three persons who can serve as references, and (4) a coding sample or data product that speaks to the applicant’s experience with relevant tasks. All materials should be submitted as one continuous PDF. Applications will be considered on a rolling basis. The start date is flexible. Materials submitted by regular mail or email will not be accepted.

The position's primary responsibility is to lead the development of a data construction pipeline for processing large-scale administrative records. This involves writing code to create new data products (e.g., geocoding addresses, cleaning names, combining multiple sources of data) in a reproducible way; writing tests to assess the quality of the data products created by the pipeline; writing tests to assess the speed of the pipeline; optimizing the code to improve quality and speed; cleaning and reformatting incoming datasets to conform to the pipeline; running the pipeline using these datasets; and identifying and fixing bugs, among other tasks. The datasets used are very large and require the use of remote computing clusters. Applicants with experience using very large datasets and optimizing code to run efficiently are preferred.

This is a one-year term position with the possibility of renewal. You would work directly with a project lead, but much of the work would be carried out independently. The ideal candidate is someone who is self-motivated and can identify the larger goals of the project and propose relevant, useful tasks in a self-directed way.

Responsibilities

Job duties include:

  • Improving existing code base: reviewing code base; designing tests to assess data quality; designing tests to assess speed and identify bottlenecks; rewriting code to optimize speed and quality and remove extraneous operations.
  • Developing a data pipeline for new datasets: preprocessing data to conform to uniform data standards; identifying missing data and making appropriate imputations; running standardized data through data construction pipeline; identifying and fixing bugs; assessing resulting data products for accuracy and completeness.
  • Leading the development of new data features and products: constructing new measures and assessing them for accuracy; incorporating new types of data and making measures based on them.

Qualifications

Essential Qualifications:

  • Bachelor's degree or equivalent
  • 3+ years of relevant experience
  • Extensive experience writing data pipelines in Python, specifically with pandas and GeoPandas
  • Extensive experience working with large datasets
  • Familiarity with mapping and geographic data processing
  • Familiarity with Git
  • Demonstrated ability to work independently
  • Knowledge of regular expressions (regex)

Preferred Qualifications:

  • Database management tools (e.g., SQL)
  • R
  • ArcGIS or other GIS software
  • Experience using administrative data

Apply

Applications must be submitted through the Princeton University careers site: https://main-princeton.icims.com/jobs/20792/data-engineer/job?hub=15&_gl=1*cqw49x*_ga*NDI5NDE5NTg5LjE3NDU1OTM3MDU.*_ga_5Y2BYGL910*czE3NDY3NjI5MDgkbzIkZzEkdDE3NDY3NjMwMzgkajI4JGwwJGgw&mobile=false&width=1095&height=500&bga=true&needsRedirect=false&jan1offset=-300&jun1offset=-240

Top Skills

ArcGIS
GeoPandas
Pandas
Python
R
SQL
