Ahead of the Curve.
No one knows the road like Agero. For over 40 years, we have provided the safest, smartest solutions for drivers and the companies that keep them moving.
As a result, we have become an industry leader, providing vehicle manufacturers and insurance carriers with privately labeled, state-of-the-art roadside assistance plans and efficient claims management solutions. Our Roadside Assistance network protects more than 75 million drivers each year, delivering award-winning service that helps motorists in their time of need while building customer loyalty for our clients.
Headquartered in Medford, MA, with operations throughout North America, we are trusted by more than 100 leading corporations, and our services are included with 75% of the new passenger vehicles sold in the U.S. This reach gives us more information about cars and drivers than any other company, and we use that data to continually enrich our solutions, maximizing opportunities for our customers while minimizing driver distraction.
About the Role:
Agero’s Data Science and Analytics team is building a new high-performance, cloud-native data platform to support our analytical products and machine learning pipelines. The platform is built on AWS, with Snowflake as the data warehouse; we rely heavily on Python and use Airflow to manage a variety of complex workflows. The platform gives the company as a whole secure, centralized, and reliable access to all enterprise data, while reducing the effort required to ingest new data sources or build workflows for new modeling or reporting use cases.
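To give a flavor of the orchestration work involved, here is a minimal sketch of an Airflow DAG of the kind this platform runs; the DAG ID, schedule, and task bodies are illustrative assumptions, not actual production code.

```python
# Minimal sketch of an Airflow DAG; the DAG id, schedule, and task
# callables are hypothetical examples rather than production code.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_raw_events():
    """Stage a day's worth of raw events (illustrative stub)."""
    print("extracting raw events to staging...")


def load_to_snowflake():
    """Copy staged files into a Snowflake raw table (illustrative stub)."""
    print("loading staged files into Snowflake...")


with DAG(
    dag_id="example_ingest_events",  # hypothetical name
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(
        task_id="extract_raw_events", python_callable=extract_raw_events
    )
    load = PythonOperator(
        task_id="load_to_snowflake", python_callable=load_to_snowflake
    )

    extract >> load  # extract first, then load
```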
Our applications move large volumes of complex data and involve a wide variety of underlying technologies. The ideal candidate is someone who loves to be challenged, loves to learn, and is driven to improve the efficiency of the development team they work with. We are an agile, results-driven group focused on delivering impactful products, and we highly value experimentation, adaptability, curiosity, and critical thought.
We are looking for an exceptional Data Engineer to join our team and play a key role in both the continued development and the ongoing management of this new data platform.
The primary responsibilities of the role are to help gather requirements and build out new curated datasets within the platform as they are needed, and to maintain and govern existing datasets. The work includes validating raw data coming into the platform as well as enabling data access from the platform to a wide variety of downstream use cases, including APIs, reports, dashboards, self-service tools, and data science efforts.
The individual who fills this role will also help develop a framework for production machine learning (ML) pipelines that allows our data science team to test and deploy new models at scale; this framework will tightly integrate with the data platform described above. As part of this effort, the Data Engineer will develop data processing pipelines to support various modeling and forecasting efforts, create processes and tools that let our Data Scientists and Engineers quickly and reliably deploy new models into production, and design data models that expose data science outputs to a variety of downstream use cases.
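As one hedged illustration of a step in such a pipeline, the sketch below reads a feature table from Snowflake into pandas and scores it with a previously trained model; the connection parameters, table and column names, and model artifact are all hypothetical.

```python
# Sketch of one scoring step in a hypothetical ML pipeline. Connection
# parameters, table/column names, and the model artifact are placeholders.
# Requires snowflake-connector-python with the pandas extras installed.
import pickle

import pandas as pd
import snowflake.connector


def score_batch(conn_params: dict, model_path: str) -> pd.DataFrame:
    # Load a previously trained, pickled model artifact (assumed format).
    with open(model_path, "rb") as f:
        model = pickle.load(f)

    # Pull the feature set; ANALYTICS.FEATURES_TABLE is a placeholder name.
    conn = snowflake.connector.connect(**conn_params)
    try:
        cur = conn.cursor()
        cur.execute("SELECT * FROM ANALYTICS.FEATURES_TABLE")
        features = cur.fetch_pandas_all()
    finally:
        conn.close()

    # Attach predictions so a later task can write them back to the warehouse.
    features["SCORE"] = model.predict(features.drop(columns=["JOB_ID"]))
    return features
```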
Key Outcomes:
- Design data models and data management strategies to support various analytical and modeling applications
- Gather requirements for new curated data sets to support analytical applications
- Help maintain and govern existing data sources
- Develop robust, flexible ETL and data processing pipelines to support a variety of use cases
- Create and maintain automated data validation processes (a minimal sketch follows this list)
- Implement a framework and tooling to support production machine learning pipelines
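As a hedged sketch of the automated checks the validation bullet above refers to, the snippet below expresses a few plain-pandas assertions; in practice a dedicated tool such as Great Expectations (listed under the nice-to-haves below) would formalize them. The column names, status codes, and dataset are illustrative assumptions.

```python
# Simple examples of automated checks a raw dataset might pass before
# being promoted into a curated table. Column names, status codes, and
# the dataset itself are illustrative assumptions.
import pandas as pd


def validate_raw_jobs(df: pd.DataFrame) -> list[str]:
    """Return human-readable validation failures (empty list = pass)."""
    failures = []

    # Required columns must be present.
    for col in ("job_id", "created_at", "status"):
        if col not in df.columns:
            failures.append(f"missing required column: {col}")
    if failures:
        return failures  # the checks below depend on these columns

    # The primary key must be non-null and unique.
    if df["job_id"].isna().any():
        failures.append("job_id contains nulls")
    if df["job_id"].duplicated().any():
        failures.append("job_id contains duplicates")

    # Status values must come from a known set of codes.
    allowed = {"open", "dispatched", "closed", "cancelled"}
    unexpected = set(df["status"].dropna().unique()) - allowed
    if unexpected:
        failures.append(f"unexpected status values: {sorted(unexpected)}")

    return failures
```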
Additional Responsibilities May Include:
- Help plan the migration of reporting and end-user applications to the new platform
- Develop APIs that expose data to other internal applications (see the sketch after this list)
- Create impactful dashboards and visualizations
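For the API bullet above, here is one plausible shape such an internal endpoint could take. The posting does not name a web framework, so FastAPI is used purely as an illustration, and the endpoint path, record fields, and in-memory store are hypothetical stand-ins for a real warehouse query.

```python
# One plausible shape for an internal data API. FastAPI is used here only
# as an illustration; the endpoint, fields, and store are hypothetical.
from fastapi import FastAPI, HTTPException

app = FastAPI(title="internal-data-api")  # hypothetical service name

# Stand-in for a real warehouse lookup; in practice this would query Snowflake.
_FAKE_STORE = {"J123": {"job_id": "J123", "status": "closed"}}


@app.get("/jobs/{job_id}")
def get_job(job_id: str) -> dict:
    """Return a single job record, or 404 if it does not exist."""
    record = _FAKE_STORE.get(job_id)
    if record is None:
        raise HTTPException(status_code=404, detail="job not found")
    return record
```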
Skills, Experience, and Education:
- Several years of coding experience, with expertise in Python and SQL
- Experience with cloud computing (ideally AWS technologies, including S3, Lambda, Glue, and DynamoDB)
- Data management and processing, including experience with relational and non-relational data stores (NoSQL, S3, Hadoop, etc.)
- Excellent written (technical documents, Python notebooks) and spoken (meetings, presentations) communication skills
- Willing and able to learn and meet business needs
- Independent, self-organizing, and able to prioritize multiple complex assignments
- Experience using Git and working on shared code repositories
It would be great if you also had experience with:
- Snowflake
- Apache Airflow
- Great Expectations
- API development
- Kubernetes
- Docker
- Tableau/Looker