The Multimodal Capabilities team at Luma focuses on unlocking advanced capabilities in our foundation models through strategic research into multimodal understanding and generation. The team tackles fundamental research questions about how different modalities can be combined to enable new behaviors, and works on the open-ended problem of what makes multimodal AI systems powerful and versatile.
Responsibilities
Collaborate with the Foundation Models team to identify capability gaps and research solutions
Design datasets, experiments, and methodologies to systematically improve model capabilities across vision, audio, and language
Develop evaluation frameworks and benchmarking approaches for multimodal AI capabilities
Create prototypes and demonstrations that showcase new multimodal capabilities
Qualifications
Strong programming skills in Python and PyTorch
Experience with multimodal data processing pipelines and large-scale dataset curation
Understanding of computer vision, audio processing, and/or natural language processing techniques
(Preferred) Expertise working with interleaved multimodal data
(Preferred) Hands-on experience with Vision Language Models, Audio Language Models, or generative video models
Compensation
The pay range for this position in California is $200,000 - $300,000/yr; however, base pay offered may vary depending on job-related knowledge, skills, candidate location, and experience. We also offer competitive equity packages in the form of stock options and a comprehensive benefits plan.
Your application is reviewed by real people.