BrightHire is a category-creating, Series B software company with a mission to give everyone the hiring experience they deserve.
We deliver on this mission by transforming the way many of the world’s leading companies build exceptional teams. We created the Interview Intelligence category, and our clients include some of the world’s most innovative companies—from Canva and Zapier to HCA Healthcare—as well as members of the Fortune 100.
Remote - USA
About the Role
You will be the bridge between breakthrough prototypes and rock-solid production AI. Partnering closely with our Staff Full Stack Engineers, Product, and Design, you will transform early-stage GenAI features into polished, scalable, and governable capabilities that delight users at scale. Your focus will be relentless quality: devising rigorous evaluation frameworks, refining prompts and retrieval pipelines, and optimizing model choices for cost, latency, accuracy, tone, and safety. You will help build the shared AI platform that powers products such as:
- AI Interviewer conversation loops that adapt in real time
- AI Fraud Signals that flag suspicious behavior with minimal false positives
- AI Candidate skills matrices and assistants that surface instant insights
- Modular, model-agnostic architectures that let us confidently swap models as new versions are released
- Design and own comprehensive eval harnesses that measure accuracy, completeness, style, hallucination rate, bias, and safety across every release
- Tune and iterate on RAG pipelines, prompt chains, conversation loops, provider selections, and fine-tunes until quality bars are met or exceeded
- Build reusable data and evaluation pipelines, a shared semantic layer, and monitoring dashboards that make it easy for product teams to ship reliable AI quickly
- Optimize for cost and latency, continuously benchmarking models and negotiating trade-offs between performance and spend
- Implement robust data governance, lineage, and AIOps practices that satisfy enterprise compliance requirements and support our AI bias audit process
- Collaborate daily with the data science team and cross-functional squads to embed evaluations into CI/CD and ensure every deploy meets our candidate-first standards
- Contribute to a modular, model-agnostic architecture that lets us switch or upgrade LLMs with minimal friction
- Document best practices and share knowledge to raise the bar for AI development across BrightHire
- 5+ years building production machine-learning or NLP systems, including at least 1 year focused on generative-AI or LLM applications
- Demonstrated experience creating automated evaluation suites for LLM outputs (accuracy, safety, bias, tone, style) and using results to guide iterative improvements
- Deep knowledge of prompt engineering, RAG techniques, vector search, embeddings, fine-tuning, and model selection across multiple providers
- Strong Python skills and familiarity with modern LLMOps/MLOps stacks (e.g., Prefect, dbt, vector databases, promptfoo or other evaluation harnesses)
- Comfort working with both structured (SQL) and unstructured (text, audio, embeddings) data, and building pipelines that join them seamlessly
- A mindset for governance: understanding of data privacy, AI ethics, bias mitigation, and enterprise compliance frameworks
- Ability to communicate complex AI trade-offs clearly to engineers, designers, and executives alike
- Bias toward action, curiosity, and a passion for building high-quality user experiences
- High-impact projects in small, autonomous squads where you can lead platform initiatives or dive deep as a specialist
- Thoughtful developer experience with fast CI, 1-click deploys, strong observability, and clean codebases
- Sustainable remote culture: regular working hours, no-meeting Wednesdays, and flexible time off
- Collaborative, kind teammates who value learning and growth
- Work-from-home stipend and quarterly snack deliveries
- Annual learning stipend and generous vacation stipend to recharge
- Competitive compensation, equity, and benefits package
Our company does not discriminate in employment on the basis of race, color, religion, sex (including pregnancy and gender identity), national origin, political affiliation, sexual orientation, marital status, disability, genetic information, age, membership in an employee organization, retaliation, parental status, military service, or other non-merit factor.