Supplier.io

Senior Data Scientist

Posted Yesterday
Remote
Hiring Remotely in United States
Senior level
The Senior Data Scientist will design ML-based entity resolution systems, build and refine NLP and ML models, and translate model results into business impact while supporting data strategy and improving data pipelines.
The summary above was generated by AI

Supplier.io is the market leader in supplier intelligence, trusted by over half of the Fortune 100 to power smarter, more responsible sourcing decisions. Our platform helps corporate procurement teams discover, evaluate, and engage with over 11 million suppliers with a focus on local, small, diverse, and sustainable businesses. This helps organizations build supply chains that are resilient, inclusive, and built for impact.


Our solutions empower today’s procurement teams with accurate data, actionable insights, and measurable impact, which helps them mitigate risk, expand sourcing options, achieve ESG goals, and advance economic inclusion. Whether tracking spend, sourcing alternate suppliers, or measuring program results, Supplier.io transforms complexity into clarity, empowering teams to lead with confidence and build supply chains that deliver for both business and community.


Join a company committed to innovation, inclusion, and making a difference one sourcing decision at a time. For more information, visit www.supplier.io.


The Opportunity

 

Supplier.io is expanding our data team and is seeking a Senior Data Scientist with a strong data science orientation to play a critical role in scaling and modernizing our supplier intelligence platform. This role is weighted approximately 80% toward data science and 20% toward data engineering, making it ideal for someone who has deep, hands-on experience building and training ML and NLP models and is equally comfortable operationalizing those models within production data pipelines. You will bring strong architectural thinking, thrive in complex environments, and enjoy mentoring others while collaborating across teams, geographies, and disciplines.

 

A central focus of this role is Entity Resolution, which is the process of identifying, linking, and merging records across disparate data sources that refer to the same real-world entity (suppliers in our case). This involves resolving inconsistencies, handling missing data, and eliminating duplicates to create a single, accurate, and trustworthy supplier profile, often referred to as a “golden record” or 360-degree view. Our current systems leverage Lucene-based search and XGBoost ML models, and we are exploring the use of LLMs to further enhance these capabilities. The ideal candidate will improve and reimagine our existing legacy entity resolution systems, bringing experience with ML-based approaches to matching and deduplication at scale.

 

As a Senior Data Scientist, you will drive, shape, and execute our long-term data and data science strategy, design resilient and scalable data architectures, and champion technical excellence across our data ecosystem. You will work closely with the Product and Engineering teams to ensure our data systems support business growth, advance our matching capabilities, and enable data-driven decision-making.

 

To support Supplier.io's growth, we are investing heavily in cloud-native technologies. This role will be instrumental in leveraging modern data services and ML capabilities, optimizing cost, and ensuring our data platform is secure, reliable, and scalable.

 

What You Will Do

 

  • Design, build, and iterate on ML-based entity resolution systems that match, link, and deduplicate supplier records across disparate data sources to produce trusted golden records.
  • Build, train, and refine NLP and ML models (e.g., XGBoost, search ranking models) for supplier matching, classification, and data enrichment, with a focus on improving accuracy and recall.
  • Evaluate and integrate emerging approaches, including LLMs, into our entity resolution and data intelligence workflows.
  • Own the full ML model lifecycle: feature engineering, training, evaluation, monitoring, feedback loops, and iterative tuning in partnership with data engineering and product teams.
  • Translate model results into business impact and clearly communicate tradeoffs, performance metrics, and recommendations to non-technical stakeholders.
  • Build and maintain data products end-to-end, operationalize them within production data pipelines, and ensure they deliver reliable, scalable results.
  • Execute and influence a cohesive data strategy that aligns with company objectives and supports analytics, reporting, and downstream product use cases.
  • Own complex data modeling initiatives, including dimensional and analytical models that support business intelligence and advanced analytics.
  • Drive continuous improvement by optimizing data pipelines, query performance, reliability, observability, and cost efficiency.
  • Partner with Infrastructure, Product, and Engineering teams to ensure data systems meet best practices, security standards, and business needs.
  • Create and maintain comprehensive technical documentation, including architecture diagrams, data flow maps, runbooks, and operations procedures.
  • Troubleshoot and resolve complex, cross-system data issues and incidents.

 

What You Will Need to Succeed

 

  • Bachelor’s degree in Data Science, Computer Science, Machine Learning, Statistics, Engineering, or a related field.
  • 7+ years of progressive experience in data science and/or data engineering, with demonstrated ownership of ML-based systems in production environments. At least 2 years in a senior or lead capacity preferred.
  • Hands-on experience building NLP and LLM-based models in Python for real-world data science applications.
  • Strong understanding of ML model lifecycle considerations, including evaluation, monitoring, feedback loops, and iterative tuning in partnership with data engineering and product teams.
  • Strong ability to translate model results into business impact and communicate tradeoffs to non-technical stakeholders.
  • Direct experience building or significantly improving entity resolution or search ranking systems, including ML-based approaches to record matching, linking, and deduplication at scale.
  • Proficiency with ML frameworks and tools such as XGBoost, scikit-learn, PyTorch, or TensorFlow, and familiarity with search technologies such as Lucene/Elasticsearch.
  • Demonstrated ability to build and maintain data products end-to-end by operationalizing models within production data pipelines, not solely tuning them.
  • Advanced proficiency with Python and SQL for both data science and data engineering workflows.
  • Experience with Snowflake and cloud-native data platforms (Azure, AWS, GCP, or multi-cloud environments).
  • Familiarity with data modeling, ETL/ELT processes, and modern data warehousing principles.
  • Experience working in an agile development environment and collaborating through ticketing systems such as Jira and GitHub.
  • Ability to communicate technical concepts clearly to technical and non-technical teams and influence decision-making.
  • Strong problem-solving skills with the ability to troubleshoot and resolve ambiguous, high-impact issues.
  • A results-oriented mindset with a demonstrated history of driving process improvements and technical excellence.
  • Ability to work independently while also serving as a trusted technical partner and mentor to others.
  • Ability to take vague requirements and turn them into technical roadmaps.

We do not accept unsolicited resumes from recruitment/search firms.

Supplier.io participates in E-Verify. For more information, click here. We will provide the Social Security Administration and, if necessary, the Department of Homeland Security, with information from each new employee’s Form I-9 to confirm work authorization. 

Supplier.io is an Equal Employment Opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status.

Supplier.io is unable to sponsor work visas (e.g., H-1B, TN, OPT, etc.) for US positions.

If you require reasonable accommodation to complete the application or interview process, please contact the Human Resources department at [email protected] or 978-843-5747. 




