Marble

Senior Data Engineer

Posted 10 Days Ago
In-Office
Cambridge, MA, USA
Senior level
The Senior Data Engineer will design, implement, and manage data pipelines for real-time analytics and ML training datasets, ensuring efficient data flow and quality in a hybrid environment.
The summary above was generated by AI

Marble is a technology company founded to revolutionize the food processing industry. Marble is seeking a full-time Senior Data Engineer who is ready for a challenge and eager to design, implement, and support automation solutions that are transforming the industry. As a part of the Marble team, you will leverage cutting-edge technologies to develop the next generation of automated solutions for food processing, enhancing resilience in the food supply chain.

Job Summary:

As a Senior Data Engineer at Marble, you will own the design and performance of data pipelines that power everything from real-time classification dashboards to ML training datasets to operational analytics for production facilities. You will work closely with Software, Infrastructure, and Machine Learning teams to ensure data flows through our pipelines efficiently, securely, reliably, and at scale.

You will design for both high-throughput real-time ingestion and large-scale batch processing across on-prem edge nodes and AWS.

Responsibilities:
  • Architect and build scalable ETL/ELT pipelines for both batch and streaming workloads

  • Design real-time ingestion and transformation workflows integrating NATS JetStream and distributed microservices

  • Develop robust data models and ETL layers for ClickHouse, enabling high-performance analytics and ML feature extraction

  • Manage and optimize data storage across AWS S3, ClickHouse, and operational datasets generated on-prem

  • Build automation workflows for labeling data, CV pipeline pre-annotation, dataset generation, and versioning

  • Ensure data quality, validation, integrity, and lineage, including automated tests and monitoring across pipelines

  • Collaborate with ML and backend teams to deliver pipelines for training datasets and annotation tools

  • Implement scalable compute workloads for large dataset transformations

  • Define and enforce data governance best practices, including schema evolution, retention policies, and compliance requirements

  • Monitor and improve data pipeline performance across multi-region environments

Minimum Qualifications:
  • B.S. or M.S. in Computer Science, Data Engineering, or related field

  • 4+ years of experience building production-grade data pipelines or distributed systems

  • Strong proficiency in Python and SQL

  • Production experience with at least one major distributed compute framework, such as Apache Spark, Ray, or Apache Airflow (2+ years preferred)

  • Experience building streaming pipelines or real-time systems (Kafka, NATS, Redis Streams, or similar)

  • Deep familiarity with AWS cloud services (S3, Lambda, IAM, EC2, Glue, etc.)

  • Experience with PostgreSQL, MongoDB, ClickHouse, or other columnar/NoSQL systems

  • Strong understanding of data modeling, partitioning, schema evolution, and performance tuning

  • Understanding of data quality, lineage, orchestration, and governance

  • Ability to design systems in hybrid environments (on-prem + cloud)

  • Excellent communication, documentation, and teamwork skills

Preferred Qualifications:
  • Experience with NATS JetStream, Kafka, or high-throughput messaging systems

  • Familiarity with GPU-based CV pipelines, ML datasets, or annotation workflows

  • Experience with ClickHouse Materialized Views, Replicated Tables, or S3-backed storage

  • Experience working in a regulated, safety-critical, or high-uptime environment

  • Experience with Nomad, Consul, Vault, or HashiCorp ecosystem

Job Type: Full-time

Location: Lincoln, NE - US; Omaha, NE - US; or Cambridge, MA - US

Team members can expect occasional travel for in-person meetings and site visits.

Marble is an equal-opportunity employer. We understand the power of a diverse team, celebrate differences, and promote inclusion.


