Hot on the heels of launching its HealthBench medical AI benchmark, OpenAI just delivered results from the largest-ever study of clinical AI in actual practice – and let’s just say the future’s looking bright.
40,000 visits, 106 clinicians, 15 clinics. OpenAI went big to get real-world data, equipping Kenya-based primary and urgent care provider Penda Health with AI Consult, a GPT-4o-powered clinical decision support tool built into its EHR.
- The study split 106 Penda clinicians into two even groups (half with AI Consult, half without), then tracked outcomes over a three-month period.
When AI Consult detected a potential error in history-taking, diagnosis, or treatment, it triggered a simple Traffic Light alert (sketched below):
- Green – No concerns, no action needed
- Yellow – Moderate concerns, optional clinician review
- Red – Safety-critical concerns, mandatory clinician review
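For a rough sense of how a tiered alert like this can slot into a clinical workflow, here's a minimal sketch. The green/yellow/red tiers mirror the study's description, but the `route_alert` function and the specific EHR actions are illustrative assumptions, not Penda's actual implementation.

```python
# Hypothetical sketch: mapping a copilot's traffic-light alert level to an
# EHR workflow action. Names and actions are assumptions for illustration.
from enum import Enum

class AlertLevel(Enum):
    GREEN = "green"    # no concerns
    YELLOW = "yellow"  # moderate concerns
    RED = "red"        # safety-critical concerns

def route_alert(level: AlertLevel) -> str:
    """Return the workflow action for a given alert level."""
    if level is AlertLevel.RED:
        return "Require clinician review before the visit can be signed off."
    if level is AlertLevel.YELLOW:
        return "Show an optional, dismissible suggestion for review."
    return "Stay silent; no action needed."

# Example: a red alert on a proposed treatment forces a review step.
print(route_alert(AlertLevel.RED))
```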
The results were definitely promising. Clinicians using AI Consult saw a:
- 16% reduction in diagnostic errors
- 13% reduction in treatment errors
- 32% reduction in history-taking errors
The “training effect” is real. The AI Consult group got significantly better at avoiding common mistakes over time, triggering fewer alerts as the study progressed.
- Part of that is because Penda actively supported clinicians along the way with one-on-one training, peer champions, and performance feedback.
- It’s also worth noting that there was no recorded harm as a result of AI Consult suggestions, and 100% of the clinicians using it said that it improved their quality of care.
What’s the catch? While AI Consult led to a clear reduction in clinical errors, there was no statistically significant difference in patient-reported outcomes, and clinicians using the copilot saw slightly longer visit times.
The Takeaway
Clinical AI continues to prove itself outside of multiple-choice licensing exams and clinical vignettes, and OpenAI just gave us our best evidence yet that general-purpose models can reduce errors in actual patient care.