OpenAI vs Physicians, Teladoc, and DeepMind Co-Clinician
May 4, 2026

“Margins have enormous leverage. Moving margins from 2% to 4% is doable, and it’s doable in 12 to 18 months. If a health system can generate $20 million in extra margin, that’s the same as a billion dollars in new net patient revenue, and that would take ten years to do.”

LeanTaaS CEO Mohan Giridharadas

May the 4th – big day for fans of Star Wars and digital health alike!

We’ve got an action-packed issue to kick off the new month. LeanTaaS CEO Mohan Giridharadas joined us on the Digital Health Wire Show to unpack a new report on the State of Hospital Financial Health (or lack thereof) and share some pro pointers for creating healthcare companies that are built to last. You can catch the full ep here.

Also on deck: a landmark study in Science that saw AI outperform physicians in clinical reasoning tasks, Teladoc’s Q1 turnaround, and partnerships galore.

Let’s get into it.
Jason

Artificial Intelligence

OpenAI o1 Outperforms Physicians on Clinical Reasoning Tasks

A landmark study in Science found that OpenAI’s o1 series outperformed human physicians at multiple clinical reasoning tasks, but that doesn’t mean it’s time to hang up the scrubs just yet.

Researchers at Harvard and Beth Israel Deaconess Medical Center designed the study to evaluate whether LLMs are ready to do what physicians do on a daily basis: review messy patient charts and use that data to determine diagnosis and next steps.

They evaluated o1 on clinical cases ranging from patient vignettes to second opinions on 76 real-world ED assessments, which included all the noise and incomplete information that clinicians routinely encounter in the EHR.

The refreshingly well-designed study also incorporated a blinded evaluation with two attending physicians at BIDMC and GPT-4.

o1 came to play. On clinical vignettes evaluating management reasoning, o1-preview scored a median of 86%. Not too shabby.

It outperformed GPT-4, humans with GPT-4, and humans with conventional resources like UpToDate – all of which scored below 45%.

The ED cases were even more impressive. o1 offered second opinions about the diagnosis at three points along the patient’s ED journey:

At triage, when information in the record dump was most limited, o1 gave an exact or very close diagnosis in 67% of cases. The two physicians hit 55% and 50%.

o1 still outperformed the physicians when given all the data collected by the end of the ED encounter.

It was only when the physicians were given the most information possible to inform their diagnosis – at the time the patient would have been admitted to the hospital – that the scores finally converged.

The cherry on top? Physician raters couldn’t tell whether the differentials came from o1 or a human. One rater couldn’t tell in 83.6% of cases, the other in 94.4%.

The authors were quick to mention that these results don’t mean AI is ready to replace human physicians. They mean it’s time for rigorous research into how AI can augment care teams, serve as a second opinion, and become a safety layer for clinicians.

The Takeaway

o1 outperforming a couple of internists at triage isn’t quite Deep Blue beating Garry Kasparov at chess, but it’s a step in that direction – especially considering OpenAI’s performance jump in just the last week (let alone since o1 launched in 2024).

Any Use Case, Any Specialty

Bunkerhill’s Carebricks platform doesn’t stop at surfacing insights. It translates them into real-world action. From automating prior auths to closing care gaps, Carebricks lets health systems design and deploy AI agents for any clinical or operational need – without adding to anyone’s manual workload. Learn how Carebricks can automate actions for your patients today.

Unlock Better Care With LOLA

Whether you’re looking to augment your team’s capacity or capabilities, Tucuvi’s clinically validated LOLA voice agent is purpose-built to make it happen – and has the success stories to prove it. Hear first-hand from Tucuvi’s customers how LOLA is empowering clinical teams to care for patients while maximizing ROI.

The Wire

Teladoc Turnaround: Teladoc Health posted another down quarter after revenue slid 2% to $614M in Q1, but shares were still up nearly 15% due to some unexpectedly rosy forecasts on the investor call. The BetterHelp turnaround was definitely a bright spot, with the mental health segment now projecting full-year payor revenue between $90M and $105M as it continues to evolve from a strictly direct-to-consumer offering. That’s up from last quarter’s forecast of $75M to $90M, although BetterHelp’s total revenue slid 9% in Q1, so it seems like the extra help from payors hasn’t quite been able to offset the bleeding in the DTC business.

Thin Margins Are Here to Stay: LeanTaaS’ new report on The State of Hospital Financial Health delivered the latest evidence that financial pressure is no longer cyclical, with the majority of healthcare organizations reporting consistently razor-thin margins. Nearly 3 in 4 of the health system CFOs surveyed report margins of 2% or lower, driven primarily by reimbursement or payor mix changes (45%) and reduced government funding (42%). The top financial priorities for this year include workforce scheduling (77%) and new tech investments for capacity/workforce utilization (65%). Look no further than our interview with LeanTaaS CEO Mohan Giridharadas for a deeper dive on the data.

ThoroughCare + Withings: ThoroughCare is teaming up with Withings Health Solutions to equip providers with real-time visibility into patient progress via a new integration with Withings’ cellular scales. By eliminating the need for Wi-Fi and automatically transmitting data to care teams, Withings and ThoroughCare help hurdle the barriers to consistent readings that often prevent proactive treatment, particularly for patients in rural areas. The partnership is right on cue as CMS continues to push the industry toward better data sharing.

DeepMind AI Co-Clinician: Google DeepMind shared an update on its new AI co-clinician that’s designed to enable “triadic care” where AI agents support patients under the clinical authority of their physician. The blog post describes the AI co-clinician as the next step in DeepMind’s journey from mastering exam-style tests with Med-PaLM to matching physician performance in text-based simulated medical consultations with AMIE, and as the culmination of a “long history” of studying how clinicians and AI systems can work together. This clip provides a good look at the tool in action.

New Paradigm at the FDA: The FDA announced a key partnership with Paradigm Health to support its new model aimed at accelerating regulatory review. The new model is already operational in a Phase 2 and Phase 1b trial with Amgen and AstraZeneca, leveraging Paradigm’s Study Conduct platform to enable real-time review by the FDA by automating data collection and streamlining the reporting of safety and efficacy signals. The partnership was also touted as a key step toward broadening trial participation to more patients outside of academic medical centers and toward modernizing a fragmented, manual infrastructure into one that can be integrated directly into everyday provider workflows.

2026 Edelman Trust Barometer: Medical misinformation is spreading like wildfire according to the 2026 Edelman Trust Barometer. Over 70% of the 16k respondents in 16 countries believe at least one inaccurate health claim about food, vaccines, or medicine – a rate that’s consistent across education levels, demographic groups, and political lines. At the same time, fewer people are confident in their ability to make informed health decisions (down 10 percentage points YoY to 51%) despite having new AI tools at their disposal. Standout stat from the report: 25% of people believe vaccines are used for population control.

Koda Adds to Series A: Advanced care planning platform Koda Health bolted on an additional investment from UPMC to its recent $7M Series A. The platform helps health systems scale goals-of-care conversations and align treatment with patient preferences, enabling education and informed decision-making around serious illnesses that patients are often forced to face without adequate support. Nearly one in four healthcare dollars is spent in the last year of life, and a study with Houston Methodist showed that Koda reduced terminal hospitalizations by 79%, increased hospice use by 51%, and lowered costs by $9k per patient.

Behavioral Nudges Improve Screening: A JAMA Network Open study suggests that behaviorally informed text messages can provide a low-cost strategy to promote colon cancer screening in underserved populations. Nearly 1,300 patients were randomized to receive either a single nurse-led telephone call reminder (usual care) or three automated text messages with various behavioral nudges like hard deadlines, social norms (your provider is waiting for your test kit) and gain-framing (screening may save your life). The end result was 58.9% of the text message group completing screening within 21 days, compared with 49.2% in the usual care group.

Paws Off the Apple Watch: Data presented at the recent Heart Rhythm Society conference from the PAWS study revealed that the Apple Watch captured twice as many arrhythmia events as traditional patch monitors in children aged 6 to 28. Among the 107 children participating, 79% of the Apple Watch’s ECG tracings were rated high quality, and the tracings accurately identified AFib and supraventricular tachycardia in 73% and 75% of cases, respectively. Check out Cardiac Wire’s stellar interview with Kenneth Civello, MD for a deeper dive on this topic.

UChicago Medicine + Artisight: University of Chicago Medicine is rolling out Artisight’s smart hospital platform system-wide, transforming over 1,800 rooms into entirely ambient care environments. The deployment spans patient care units, post-anesthesia care, and their new freestanding cancer care facility – all of which will be overhauled with computer vision, voice recognition, non-contact vitals, and virtual nursing. The platform is designed to provide a real-time “sixth sense” for care teams, enabling improved documentation and staff coordination with less operational burden.

The Virtual-First Difference at MetroHealth

When the MetroHealth System needed a comprehensive, scalable solution to help its patients access care, it turned to Ovatient’s virtual-first care model. Ovatient isn’t another point solution, it’s a virtual care partnership – integrated clinically, operationally, and technologically with your existing infrastructure. Discover how MetroHealth’s virtual-first approach is keeping patients connected to high-quality care whether they’re at home, at work, or on the go.

Evidence, in the Flow of Care

Heidi brings trusted guidelines and peer-reviewed research directly into clinical workflows so decisions don’t stall care. Clinicians get clear, evidence-based answers without leaving the conversation. No ads, no limits, and no outside interests getting in the way of care. Find out how with Heidi Evidence.

The Resource Wire

Abridge & Availity Redefine Payer-Provider Synergy: Abridge is teaming up with Availity to redefine payer-provider synergy at the point of conversation. The collaboration aligns Abridge’s evidence-aware intelligence with Availity’s real-time health information network to create a first-of-its-kind prior authorization experience, with a shared understanding between patients, providers, and payers. Find out how Abridge and Availity are extending conversational intelligence across the revenue cycle.

8 Keys to Gain an AI Edge in VBC: As value-based care models evolve and competition intensifies, healthcare leaders are seeking practical strategies to improve performance across risk, quality, and financial outcomes. Head over to Navina’s roundup of eight key insights from VBC leaders to learn how aligning AI-powered tools with organizational priorities and clinician needs can help secure a measurable competitive edge in value-based care.

State of Payor Enrollment and Credentialing: Over half of provider orgs are losing revenue due to credentialing delays – with many missing out on over $1M annually. Medallion’s new report unpacks the forces quietly undermining operational and financial performance, and how leaders across the industry are addressing them. Head over to the full report to get insights tailored to your role and org type.

Improving GLP-1 Treatment for the Long Term: Looking to make GLP-1 prescribing safer and more effective long term? Explore Withings’ suite of remote patient monitoring devices, designed to deliver the continuous, clinically relevant insights that care teams need to proactively monitor patients, identify risks early, and intervene with confidence.