Sayari

Data Engineering Intern

Reposted 2 Days Ago
Remote
Hiring Remotely in United States
Internship
Assist the Data Engineering team in collecting global data, maintaining ETL pipelines, and developing new ones for Sayari Graph.

About Sayari: 

Sayari is the counterparty and supply chain risk intelligence provider trusted by government agencies, multinational corporations, and financial institutions. Its intuitive network analysis platform surfaces hidden risk through integrated corporate ownership, supply chain, trade transaction and risk intelligence data from over 250 jurisdictions. Sayari is headquartered in Washington, D.C., and its solutions are used by thousands of frontline analysts in over 35 countries.


Our company culture is defined by a dedication to our mission of using open data to enhance visibility into global commercial and financial networks, a passion for finding novel approaches to complex problems, and an understanding that diverse perspectives create optimal outcomes. We embrace cross-team collaboration, encourage training and learning opportunities, and reward initiative and innovation. If you like working with supportive, high-performing, and curious teams, Sayari is the place for you.


Internship Description:

Sayari is looking for an intern to join its Data Engineering team! Sayari’s flagship product, Sayari Graph, provides instant access to structured business information from billions of corporate, legal, and trade records. As a member of Sayari’s data team, you will work with our Product and Software Engineering teams to collect data from around the globe, maintain existing ETL pipelines, and develop new pipelines that power Sayari Graph.


Our application tier is built primarily in TypeScript, running in Kubernetes, and backed by Postgres, Cassandra, Elasticsearch, and Memgraph. Our data ingest tier runs on Spark, processing terabytes of data collected from hundreds of data sources. The platform allows users to explore a large knowledge graph sourced from hundreds of millions of structured and unstructured records from over 200 countries and in 30 languages. As part of this team, you'll have the chance to contribute to our growing library of open-source work, including our WebGL-powered network visualization library Trellis.


This is a remote, paid internship with an expected workload of 20 to 30 hours a week.

Job Responsibilities:

  • Write and deploy crawling scripts to collect source data from the web
  • Write and run data transformers in Scala Spark to standardize bulk data sets
  • Write and run modules in Python to parse entity references and relationships from source data
  • Diagnose and fix bugs reported by internal and external users
  • Analyze and report on internal datasets to answer questions and inform feature work
  • Work collaboratively within and across engineering teams using basic agile principles
  • Give and receive feedback through code reviews

Required Skills & Experience:

  • Experience with Python and/or a JVM language (e.g., Scala)
  • Experience working collaboratively with git

Desired Skills & Experience:

  • Experience with Apache Spark and Apache Airflow
  • Experience working on a cloud platform like GCP, AWS, or Azure
  • Understanding of or interest in knowledge graphs

What We Offer: 

  • A collaborative and positive culture - your team will be as smart and driven as you
  • Limitless growth and learning opportunities
  • A strong commitment to diversity, equity, and inclusion
  • Team building events & opportunities

Sayari is an equal opportunity employer and strongly encourages diverse candidates to apply. We believe diversity and inclusion mean our team members should reflect the diversity of the United States. No employee or applicant will face discrimination or harassment based on race, color, ethnicity, religion, age, gender, gender identity or expression, sexual orientation, disability status, veteran status, genetics, or political affiliation. We strongly encourage applicants of all backgrounds to apply.

Top Skills

Cassandra
Elasticsearch
Kubernetes
Memgraph
Postgres
Spark
TypeScript
