PHTI’s new Clinical AI report delivered exactly what we’ve come to expect from their research: top-tier industry analysis through the lens of actual stakeholders.
They assembled the A-Team for this one. The report was built from an in-person workshop that PHTI convened with senior industry leaders – from health systems and health plans to tech firms and federal agencies – to explore what’s needed to safely scale clinical AI.
- The workshop underscored the policy, reimbursement, and evidence gaps holding back adoption, with several key themes emerging from the discussion around their example use cases (hypertension management and mental health chatbots).
Theme 1: Evidence standards should compare AI to current standards of care and scale with risk.
- That means comparing AI to the care that patients actually receive today rather than idealized care, then having different standards that align with the clinical risk of using the tool.
- Highlight: Evidence should assess whether the full workflow (including multiple models, devices, and human oversight) improves outcomes, not merely model performance.
Theme 2: Performance benchmarks should be based on clinical outcomes, and safety standards should adapt as the evidence grows.
- Ambiguity around what constitutes “good” performance is a persistent barrier. Metrics need to be anchored to specific clinical outcomes instead of vague process measures.
- Highlight: Across both use cases, participants emphasized the need not only to set benchmarks but also to set minimum safety floors, which could adjust dynamically over time based on observed outcomes, changing patient risk profiles, and emerging evidence.
Theme 3: New technologies may be initially tested in lower-risk populations, but should scale quickly to high-risk populations to maximize impact.
- Low-risk patients are tempting on-ramps, but AI’s greatest benefits come from reaching high-need patients, and reaching them carries higher evidence expectations and more clinical risk.
- Highlight: For mental health, engagement and retention are huge barriers to treatment. Participants cautioned that overly restrictive AI deployments risk limiting access and instead emphasized the need for appropriate care routing following LLM engagement.
The Takeaway
Even the most effective clinical AI tools still have plenty of questions to address before adoption can scale, and PHTI just crowdsourced some promising answers straight from boots on the ground in the healthcare trenches.
