C the Signs

AI Data Engineer

Reposted 5 Days Ago
In-Office or Remote
4 Locations
Mid level
The Data Engineer will develop and optimize data pipelines for LLMs, ensuring high-quality healthcare datasets while collaborating with data scientists on model requirements.
The summary above was generated by AI
Position Summary

The Data Engineer will play a crucial role in developing and fine-tuning data specifically for our LLMs and machine learning models. This individual will be responsible for the entire data lifecycle, including gathering, cleaning, structuring, and optimizing large, diverse healthcare datasets. The ideal candidate will have a strong background in data engineering principles, experience with big data technologies, and a keen understanding of the unique challenges and requirements of healthcare data.

You will design, build, and maintain scalable data pipelines that source, preprocess, and deliver high-quality, high-volume datasets to our machine learning engineers. This role requires a deep understanding of data engineering best practices coupled with specific knowledge of the data requirements for LLM training and refinement.

Key Responsibilities
  • Collaborate with data scientists and machine learning engineers to understand data requirements for LLM and machine learning model fine-tuning.
  • Design, build, and maintain scalable data pipelines to ingest, process, and store massive and diverse healthcare datasets.
  • Implement robust data validation and monitoring to ensure the integrity, accuracy, and consistency of all training datasets.
  • Implement robust data cleaning, validation, and transformation processes to ensure data quality and integrity.
  • Develop and optimize data structures and schemas for efficient access and utilization by LLMs and machine learning models.
  • Work with the team to identify and acquire new data sources, ensuring compliance with relevant healthcare regulations (e.g., HIPAA).
  • Monitor data pipeline performance, troubleshoot issues, and implement optimizations to improve efficiency and reliability.
  • Document data engineering processes, data models, and data dictionaries.
  • Stay up to date with the latest advancements in data engineering, big data technologies, and machine learning.

Requirements

Required
  • Bachelor's degree in Computer Science, Engineering, or a related field.
  • Proven experience as a Data Engineer, with a focus on big data technologies.
  • Strong proficiency in programming languages such as Python, Scala, or Java.
  • Extensive experience with data warehousing, ETL processes, and data modeling.
  • Experience with major cloud providers (e.g., AWS, GCP, Azure) and their data storage and processing services.
  • Hands-on experience with big data frameworks like Apache Spark for distributed processing.
  • Excellent problem-solving skills and the ability to work independently and as part of a team.
  • Strong communication and interpersonal skills.
Preferred
  • Master's degree in a related field.
  • Experience with healthcare data and a good understanding of healthcare data standards (e.g., FHIR, HL7).
  • Familiarity with machine learning concepts and LLM fine-tuning processes.
  • Experience with data orchestration tools (e.g., Apache Airflow).
Work Authorization:
    • Must be a US citizen, a Green Card holder, or currently in the US with a valid H-1B visa.

Benefits

Why Join Us?

Joining C the Signs is not just about building AI; it’s about shaping the future of healthcare. If you are a technical leader with an unshakable belief in the power of AI to save lives and the ability to make it happen at scale, this is your opportunity to create a tangible, global impact.

Benefits:

  • Competitive salary and benefits package.
  • Flexible working arrangements (remote or hybrid options available).
  • The opportunity to work on life-changing AI technology that directly impacts patient outcomes.
  • Join a team that combines cutting-edge innovation with a mission to save lives and improve health equity.
  • Continuous learning opportunities with access to the latest tools and advancements in AI and healthcare.
