Senior Data Engineer
Agero is powering the next generation of software-enabled driver safety services and technology, pushing the limits of big data to transform the entire driving experience. The majority of leading vehicle manufacturers and insurance providers use Agero’s roadside assistance, accident management, dispatch, consumer affairs and telematics innovations to strengthen their businesses and create stronger, lasting connections with their customers. Together, we’re making driving smarter and safer for everyone.
The Data Science and Analytics group at Agero is a central resource for innovative data products, scientific analysis, and actionable insights. We are a collaborative, consultative team that works cross-functionally to:
- support partners throughout the organization in making informed, data-driven decisions,
- unlock the value within our data to create innovative new product offerings, drive efficiency, and improve customer experience, and
- provide greater access to information and insights through dashboards, data self-service tools, and training.
We believe that data is a key asset and, thanks to Agero’s scale and history, a true competitive advantage.
About the Role:
Agero’s Data Science and Analytics team is developing a modern, cloud-based data platform to support our analytical products and machine learning pipelines. Our team is expanding, and we are looking to add an experienced, self-motivated Senior-level Data Engineer to help design and build out a curated, “single source of truth” data set that will feed a variety of downstream use cases, including APIs, reports, dashboards, self-service tools, and modeling efforts. The platform will provide the Data Science and Analytics team, along with the broader organization, with more secure, centralized, and reliable access to the company’s data. At the same time, it will reduce the effort required to ingest new sources of data or to build new ETL processes to support new modeling or reporting use cases.
Additionally, this individual will play a lead role in building out a framework for production machine learning pipelines that allow our data science team to test and deploy new models at scale. This framework will tightly integrate with the data platform described above. As part of this effort, the Senior Data Engineer will be tasked with developing data processing pipelines to support various modeling and forecasting efforts, defining a process for our Data Scientists and Engineers to quickly and reliably deploy new models into production, and designing data models to expose model outputs to a variety of downstream use cases. Both efforts will leverage similar cloud technologies, and the data platform will be the “source” and “sink” of data to and from the ML pipeline(s).
- Define requirements and play a lead role in architecting a modern, cloud-based data platform
- Build out and implement a platform for production machine learning pipelines
- Design data models and data management strategies to support various analytical and modeling applications
- Develop robust and flexible ETL and data processing pipelines to support a variety of use cases
- Design and deploy self-service tools, dashboards, and APIs leveraging the “single source of truth” data
Additional Responsibilities Include:
- Develop infrastructure to deploy models at scale
- Play a lead role in the migration of existing applications to the newly developed platform
- Help maintain existing data sources and add new ones
- Develop new curated data sets to support new analytical efforts
- Leverage ML metrics to forecast series of key metrics at a regular cadence over long time horizons
- Develop APIs that expose data to other internal applications
- Create dashboards and visualizations
Skills, Experiences and Education:
- 6-8 years of coding experience with Python and SQL
- Advanced degree (MS or PhD) in a technical field strongly preferred
- Cloud computing (ideally AWS technologies, including S3, Redshift, Lambda, Glue, and DynamoDB)
- Data management and processing, including experience with relational and non-relational data stores (NoSQL, S3, Hadoop, etc.)
- Good communication skills in both written (technical documents, Python notebooks) and spoken (meetings, presentations) forms
- Willing and able to learn and meet business needs
- Independent, self-organizing, and able to prioritize multiple complex assignments
- Experience using Git and working on shared code repositories
- Backend web (API) development
- Apache Airflow