Data Engineering Intern
Company Description
Tamr is the enterprise data mastering company trusted by large enterprises like Blackstone, the US Air Force, Toyota, and GSK. The company’s patented software platform uses machine learning supplemented with human feedback to master and prepare data across myriad silos to deliver previously unavailable business-changing insights. With a co-founding team led by Andy Palmer (founding CEO of Vertica) and Mike Stonebraker (Turing Award winner) and backed by top-tier investors such as NEA and GV, Tamr is transforming how companies get value from their data.
You will work on the enrichment and data products team at Tamr, playing an active role in product development. You’ll gain new skills related to data engineering and analysis through hands-on exposure to real challenges facing our enterprise customers. You’ll learn how multibillion-dollar companies are using machine learning and leading-edge technologies to modernize their infrastructure and turn their data into a competitive advantage. Over the summer, you will get broad exposure to teams and leaders across Tamr, giving you insight into the operations of a growth-stage company and the range of potential opportunities.
Responsibilities
- Perform exploratory data analysis on new data sources
- Develop algorithms for cleaning data and engineering features
- Build data pipelines to feed data to machine learning models
- Collaborate with software engineers on developing applications to deliver data
Challenges that make this job interesting:
- The problem we’re solving is hard - enterprise data is messy and there is a lot of it. It’s our job to derive value from this data in a flexible and scalable way
- We’re working at the cutting edge - we’re responsible for the innovation, experimentation, and development that go into building new products that our customers find useful
Qualifications
- All undergraduate and graduate students are welcome to apply
- Major or minor in a technical field
- Demonstrated interest in and aptitude for working with data
- You’ve built data pipelines to prepare data for analysis
- You have experience with data analysis, data cleaning, and feature engineering
- You’ve written code in Python and SQL
Other Preferred Qualifications / Nice to have:
- You have experience with any of the following technologies: Spark, GCP, AWS, Azure, GitHub
- You’ve built, trained, and tested machine learning models on a variety of datasets