QA Engineer, AI Products

MDCalc

Reposted 20 Hours Ago

Remote

Hiring Remotely in USA

Senior level

As a QA Engineer, you will ensure the quality of AI features, design test strategies, maintain automated pipelines, and collaborate on quality metrics.

The summary above was generated by AI

The Opportunity

Since 2005, MDCalc has been an essential part of the clinician’s workflow to help achieve better patient outcomes. Actively used by more than 65% of physicians worldwide, MDCalc is the most broadly used medical reference – at the point-of-care – for clinical decision tools and content, and one of only four references used by >50% of US HCPs. These evidence-based tools and content are used by millions of medical professionals globally and support 50+ specialties and cover 200+ patient conditions.

To accelerate and steward this growth, we are expanding the AI product team with a QA Engineer. This role will be critical to MDCalc’s continued success in supporting our millions of clinical users worldwide as they care for hundreds of millions of patients.

The Role

As a QA Engineer in the AI Products group at MDCalc, you will play a key role in ensuring the quality, reliability, and clinical trustworthiness of MDCalc's AI-powered features. You'll focus on the unique challenges of testing LLM-based systems, where outputs are non-deterministic, correctness is often a spectrum rather than a binary, and regressions can be subtle. You'll be part of a collaborative, fast-moving team that takes pride in delivering software that clinicians trust to care for millions of patients worldwide.

Responsibilities include, but are not limited to:

Design and execute test strategies for LLM-powered features, including prompt regression testing, output evaluation, and hallucination detection
Build and maintain automated evaluation pipelines (eval sets, golden datasets, LLM-as-judge frameworks) to catch quality regressions in non-deterministic outputs
Perform black-box and exploratory testing of MDCalc's AI features across web and mobile, with particular attention to clinical accuracy, safety, and edge cases
Define quality metrics for AI outputs (accuracy, faithfulness, relevance, safety, latency, cost) and establish thresholds for release readiness
Collaborate cross-functionally with engineers, product managers, ML/AI engineers, and clinical reviewers to define what "good" looks like for AI responses
Investigate and triage AI failure modes, distinguishing model issues, prompt issues, retrieval issues, and integration bugs
Participate in team discussions, offering feedback on testability, risks, prompt design, and guardrails
Help develop QA strategies to expand future testing capacity, automation, and evaluation coverage as the AI product surface grows

Your Background

5+ years of experience in software QA, with at least 1 year of hands-on testing of LLM-based or AI/ML-powered features
Strong understanding of QA principles, test case creation/documentation, and best practices for both deterministic and non-deterministic systems
Hands-on experience with LLM tooling and concepts: prompt engineering, RAG systems, evaluation frameworks (e.g., Promptfoo, Braintrust, LangSmith, DeepEval, Ragas, OpenAI Evals), and LLM APIs (OpenAI, Anthropic, etc.)
Experience designing automated qualitative evaluation approaches, including LLM-as-judge, rubric-based scoring, semantic similarity checks, and golden dataset regression testing
Proficiency with test automation tools, with a focus on Playwright
Strong SQL skills for data validation, test data creation, and verifying data integrity across systems
Familiarity with token usage, latency profiling, and cost monitoring as quality signals
Eagerness to learn quickly and a positive, solutions-oriented attitude
Clear and concise communicator, able to surface issues, blockers, and risks effectively, including when failures are ambiguous or probabilistic
Self-motivated, proactive, and able to manage time and priorities independently

What MDCalc offers:

Ability to make a true difference in medicine: MDCalc is the most broadly used medical reference by physicians, used by over 65% of US attending doctors weekly
Medical, Dental, & Vision Coverage, with option to extend to your dependents
Company-sponsored short-term insurance
Fully paid 8-week parental leave after 6 months of employment
Company-sponsored 401(k) after 3 months of employment
Unlimited vacation for salaried roles - we trust you to take the time you need
Bi-annual company offsites to connect, reflect, and plan together
Monthly work-from-home stipend
A culture of fun and motivated team members who believe in a greater mission here at MDCalc
