The gap between benchmark scores and real-world performance has been the theme of the year in AI research, so Google was right on cue with its first prospective clinical trial for AMIE using actual patients.
Meet the Articulate Medical Intelligence Explorer. AMIE is Google’s flagship “medical AI researcher,” and Google teamed up with Beth Israel Deaconess Medical Center to gauge its performance in real clinical workflows.
- 100 patients completed an AMIE interaction before their primary care visit, with AMIE taking medical histories and equipping patients with potential diagnoses to discuss with their PCP.
- PCPs received the transcript, summary, and AMIE’s management plan prior to the visit. All interactions were monitored live by physicians trained to intervene if safety criteria weren’t met.
AMIE got a gold star. Not only were there zero safety stops across all 100 interactions, but patients also reported that their attitudes toward AI significantly improved after chatting with AMIE.
- AMIE’s differential included the correct final diagnosis in 90% of cases (per chart review 8 weeks post-encounter), with 75% top-3 accuracy.
- PCPs using AMIE reported increased visit preparedness in 75% of cases, as well as potential behavior change in nearly 60%.
- The quality of AMIE’s differential diagnoses and the appropriateness of its management plans were similar to those of PCPs, although PCPs won on management plan practicality and cost-effectiveness.
Other findings were less obvious. PCPs had the chart, the physical exam, and the pre-visit transcript, yet AMIE still matched them on differential quality and management safety without taking a single peek at the EHR.
- That speaks to the ceiling (or lack thereof) for structured AI history-taking, and shows that AI is gearing up to improve patient care in more ways than just making predictions.
- The fact that PCPs reported better visit preparedness and potential behavior change in over half of cases also highlights how AI can augment – not just replace – clinical reasoning.
The Takeaway
The distance between the bench and bedside is getting shorter, and Google’s AMIE results suggest that conversational AI in primary care is closer to reality than most people might think.
