How to Build Patient Trust in Medical AI

AI might move at the speed of trust, but new research in JAMA Network Open shows that trust only moves at the speed of accuracy.

The study had a solid setup. To determine the factors currently driving patient trust in AI, researchers presented 3,000 U.S. adults with a pair of hypothetical AI-assisted visits for a moderate-risk rash. 

  • Each visit had six randomized attributes, such as whether or not a doctor was present, how well the AI performs relative to human clinicians, and various AI governance mechanisms.

AI performance came out on top by a wide margin. Respondents cared more about how well the AI performs than about FDA approval, governance, or even having a doctor in the room.

  • The biggest difference came from AI performing better than a specialist, which increased the likelihood of choosing that visit by 32.5%.
  • AI performing at the same level as a specialist boosted visit preference by 24.8%, slightly more than having AI that performs as well as a general practitioner (19.1%).
  • Having an actual doctor present surprisingly only swayed visit preference by 18.4%.

Governance factors also moved the needle. They just didn’t move it much.

  • FDA approval for the AI increased visit preference by a modest 11.1%.
  • Mayo Clinic AI certifications apparently carry just as much weight – also coming in at 11.1%.
  • Local hospital certifications for the AI only gave visits a 7.8% lift.

AI data quality was important. It just wasn’t as convincing as AI performance. 

  • AI that had nationally representative training data boosted visit preference by 11.9%, but it was interesting to see that disclosing bias in the training data had no effect versus not providing any data details.

The written explanations told the same story. Respondents cited AI performance and clinician involvement as the primary reasons for their choices, with many of them expressing comfort with AI as a tool – but not as a standalone decision-maker.
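For readers who want the numbers side by side, here is a minimal sketch that ranks the attribute effects quoted above. The figures are copied from this summary; the dictionary labels and the `rank_effects` helper are our own shorthand, not anything taken from the study.

```python
# Effects on visit preference as quoted in the summary above (percent).
# Labels are shorthand written for this sketch, not the study's wording.
effects = {
    "AI outperforms a specialist": 32.5,
    "AI matches a specialist": 24.8,
    "AI matches a general practitioner": 19.1,
    "Doctor present": 18.4,
    "Nationally representative training data": 11.9,
    "FDA approval": 11.1,
    "Mayo Clinic certification": 11.1,
    "Local hospital certification": 7.8,
}


def rank_effects(effects):
    """Return (attribute, effect) pairs sorted from largest to smallest effect."""
    return sorted(effects.items(), key=lambda item: item[1], reverse=True)


ranked = rank_effects(effects)
for name, effect in ranked:
    print(f"{name:<42}{effect:>5.1f}%")
```

Sorted this way, all three AI performance levels land above having a doctor present, and every governance mechanism lands below it.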

The Takeaway

Widespread AI adoption requires patient trust, and this study did a great job highlighting the specific areas that should be prioritized for building it.

Microsoft Dragon Copilot Gets AI Upgrades

Microsoft might have had the biggest presence at the biggest health IT conference, and it made sure all the lights in Las Vegas were on Dragon Copilot.

Unify. Simplify. Scale. Microsoft’s theme at HIMSS was all about making Dragon Copilot a one-stop-shop for information within clinical workflows. It debuted several new capabilities at the show:

  • Integrated medical content from trusted sources
  • Partner-powered AI apps and agents
  • Proactive ICD‑10 specificity suggestions
  • Expanded role-based experiences for physicians, nurses, and radiologists

Partnering is quicker than building. Rather than developing every Dragon Copilot capability in-house, Microsoft has been leaning on outside partners to round out the platform.

  • Dragon Copilot’s clinical evidence feature is a prime example. It brings medical content and other relevant contextual information in-workflow, all curated through new partnerships with Wolters Kluwer, Elsevier, and other vetted sources.

Microsoft Marketplace fills the gaps. It allows users to add AI partner apps directly into their Dragon Copilot workflows. Picture a modular side panel with insights from folks like: 

  • Regard – surfaces comorbidities and relevant diagnoses 
  • Canary Speech – analyzes voice biomarkers for mental health conditions
  • Humata Health – automates prior authorization processes for clinicians 
  • Atropos – generates personalized real-world evidence 
  • Optum – identifies potential coverage issues and supports claims processing 

All roads lead to scribes. When Microsoft acquired Nuance for roughly $20B back in 2022, it was its second largest acquisition ever behind LinkedIn, and the core offerings were radiology report automation, dictation, and transcription (with humans still pulling a ton of weight).

  • The product formerly known as Dragon Ambient eXperience is now the backbone of Dragon Copilot, and it’s been adding features at a breakneck pace.
  • Microsoft is looking to make Dragon Copilot everything, everywhere, all at once, and so far new partnerships have been the key to making that happen.

The Takeaway

As every digital health company rushes to add scribing to their platform, the OG scribe is rushing to add everything else. Now it just needs to maintain a unified user eXperience.

Infinite Healthcare, What’s It Worth?

Healthcare is one of the few industries where rising usage is treated as a failure, and a16z just published some solid arguments for why that framing might be completely backwards.

Everybody wants to be healthy. The demand for services that help people get and stay healthy is almost limitless, but the supply has always been limited by clinician time and cost.

  • AI balances the equation. It expands our capacity to provide care and drives down its marginal cost, and a16z makes the case that AI opens the door for us to consume an effectively unlimited amount of proactive care – consistent coaching, continuous monitoring, and earlier interventions.

Health is invaluable. As it stands today, when a payor sets reimbursement for a medical service, the rate assumes a certain volume to assess the overall budget for that service.

  • Price x Quantity = Total Medical Expense
  • If AI sends the quantity of the service through the roof while holding the price constant, the total medical expense would skyrocket.
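To make the identity above concrete, here is a small sketch under stated assumptions. Every number in it is hypothetical (a $100 service, 1,000 uses today, a 10x jump in usage); none of these figures come from a16z, and the function names are ours.

```python
def total_medical_expense(price: float, quantity: float) -> float:
    """Total medical expense = price per service x number of services."""
    return price * quantity


def budget_neutral_price(old_price: float, old_quantity: float, new_quantity: float) -> float:
    """Per-unit price that keeps total expense unchanged at the new quantity."""
    return total_medical_expense(old_price, old_quantity) / new_quantity


# Hypothetical inputs, for illustration only.
price = 100.0
baseline_quantity = 1_000
ai_quantity = 10_000  # assumed 10x increase in usage

baseline = total_medical_expense(price, baseline_quantity)
with_ai = total_medical_expense(price, ai_quantity)
neutral_price = budget_neutral_price(price, baseline_quantity, ai_quantity)

print(f"Baseline spend:        ${baseline:,.0f}")
print(f"Spend at 10x usage:    ${with_ai:,.0f}")
print(f"Budget-neutral price:  ${neutral_price:,.2f}")
```

At a fixed price, spend scales one-for-one with usage, so a 10x jump in quantity only stays budget-neutral if the per-unit price falls to a tenth of its original level.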

The question isn’t how to avoid this. It’s “what do we get for it?” 

  • Half of all U.S. health expenditures go to 5% of the population, and AI that helps avoid hospitalizations or acute events can generate huge savings from a few patients.
  • Healthier people are also more productive. If AI can help just 1% of the 160M workers in the U.S. work an additional year because they’re healthy, that’s worth $260B in GDP.
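The productivity figure is easy to sanity check. The sketch below only rearranges the three numbers quoted above (160M workers, 1%, $260B) to back out the output per additional worker-year that the claim implies; the variable names are ours, and the per-worker figure is derived here rather than stated in the source.

```python
# Figures quoted in the summary above.
us_workers = 160_000_000
share_helped_pct = 1            # 1% of workers
gdp_gain_usd = 260e9            # $260B

# Integer arithmetic avoids floating-point noise in the headcount.
extra_worker_years = us_workers * share_helped_pct // 100

# Output per additional worker-year implied by the claim.
implied_output_per_worker = gdp_gain_usd / extra_worker_years

print(f"Additional worker-years: {extra_worker_years:,}")
print(f"Implied output per worker-year: ${implied_output_per_worker:,.0f}")
```

In other words, the $260B figure assumes each of the 1.6M additional worker-years contributes about $162,500 in output.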

How do you price AI for abundant consumption? In a world with truly proactive AI-driven care, delivering more care earlier is what actually bends the cost curve. Pricing shouldn’t punish usage.

a16z looks to other industries as good examples for healthcare:

  • Telecom used to charge for voice and data by the minute because network capacity was scarce, but pricing shifted to unlimited plans as infrastructure improved. Usage went up significantly, but the total market value grew alongside consumption.
  • Music followed the same arc. iTunes sold songs one at a time. Spotify sold access instead. People started listening to more songs, and consumer surplus expanded.

The Takeaway

As AI expands care capacity and access, consumption naturally increases. Affordable access leads to explosions in usage, and business models shift to subscriptions over per-unit pricing. Other industries have made the transition before, and a16z thinks it might be healthcare’s turn.

Amazon Connect Health Sends AI to the Back Office

If the competition for the back office was already hot, it’s a certified wildfire after last week’s debut of Amazon Connect Health.

Amazon is pitching Amazon Connect Health as a purpose-built agentic AI solution for the administrative work that gets in the way of care. That’s definitely not fun to read for all the companies that had the same tagline on their booth at ViVE.

It comes with five capabilities straight out of the box: 

  • Patient verification
  • Appointment scheduling 
  • Pre-visit summaries
  • Ambient documentation
  • Medical coding 

What’s the core use case? AWS Director of Healthcare AI Naji Shafi says it’s the entire patient journey.

  • When a patient calls to book an appointment, Amazon Connect Health answers immediately, confirms their identity, checks their coverage, and lines up the visit while they’re still on the line.
  • Before the visit, it reviews their complete medical history across care settings, then surfaces pre-visit insights like active conditions or trends that might be relevant to closing care gaps.
  • During the visit, it drafts clinical notes for provider review in real-time, with every detail linked back to the moment in the conversation where it was discussed.
  • After the visit, it generates patient-friendly summaries and the medical codes needed for billing, allowing the visit to be payor-ready and submitted within minutes.

But wait, there’s more. Amazon Connect Health integrates natively with Epic, and connects to 100+ EHRs and 35+ HIEs through data integration partners like Redox.

  • It’s also built entirely on AWS HealthLake, the cloud giant’s FHIR data repository that’s now getting new agentic capabilities to help convert records into standard formats.

Early users love it. Amazon One Medical was the perfect sandbox for polishing Amazon Connect Health in clinical settings before opening it to outside partners. It shows in the results.

  • UC San Diego Health is saving a minute per call and diverting 630 hours a week from patient verification to direct support, and it has slashed call abandonment by 30%.
  • Netsmart’s EHR supports more than 1,300 community provider orgs, and it saw ambient documentation adoption skyrocket 275% – and better staff retention as a result.

The Takeaway

There were already tons of agentic AI solutions competing to automate healthcare’s administrative waste, and now there’s one that’s bankrolled by the biggest bookstore in human history. It’s a crowded space, but $1 trillion per year is also enough bloat to go around.

PHTI Breaks Down Barriers to Clinical AI

PHTI’s new Clinical AI report delivered exactly what we’ve come to expect from their research: top tier industry analysis through the lens of actual stakeholders.

They assembled the A-Team for this one. The report was built from an in-person workshop that PHTI convened with senior industry leaders – from health systems and health plans to tech firms and federal agencies – to explore what’s needed to safely scale clinical AI.

  • The workshop underscored the policy, reimbursement, and evidence gaps holding back adoption, with several key themes emerging from the discussion around their example use cases (hypertension management and mental health chatbots).

Theme 1: Evidence standards should compare AI to current standards of care and scale with risk.

  • That means comparing AI to the care that patients actually receive today rather than idealized care, then having different standards that align with the clinical risk of using the tool.
  • Highlight: Evidence should assess whether the full workflow (including multiple models, devices, and human oversight) improves outcomes, not merely model performance.

Theme 2: Performance benchmarks should be based on clinical outcomes, and safety standards should adapt as the evidence grows.

  • Ambiguity around what constitutes “good” performance is a persistent barrier. Metrics need to be anchored to specific clinical outcomes instead of vague process measures.
  • Highlight: Across both use cases, participants emphasized the need not only to set benchmarks but to set minimum safety floors, which could adjust dynamically over time on the basis of observed outcomes, changing patient risk profiles, and emerging evidence.

Theme 3: New technologies may be initially tested in lower-risk populations, but should scale quickly to high-risk populations to maximize impact.

  • Low-risk patients are tempting on-ramps, but AI’s greatest benefits come from reaching the high-need patients, and reaching them carries higher evidence expectations and more clinical risk.
  • Highlight: For mental health, engagement and retention are huge barriers to treatment. Participants cautioned that overly restrictive AI deployments risk limiting access and instead emphasized the need for appropriate care routing following LLM engagement.

The Takeaway

Even the most effective clinical AI tools still have plenty of questions to address before adoption can scale, and PHTI just crowdsourced some promising answers straight from the boots-on-the-ground in the healthcare trenches.

LLMs Still Struggle With Medical Misinformation 

The Lancet Digital Health just published one of the largest-ever stress tests on medical misinformation in LLMs, and it looks like most models still struggle to separate fact from fiction.

Here’s the setup. Researchers probed 20 LLMs with over 3M prompts containing medical information from three different sources: social media posts, simulated clinical vignettes, or real hospital discharge notes with a single fabricated recommendation inserted.

  • Each prompt was presented in multiple versions, once with neutral wording to establish a baseline, then with a series of variations that were emotionally charged or leading.
  • Ten logical fallacies were also used to test how framing influences model behavior, such as appeals to authority (a physician said…) or popularity (everyone agrees that…).

LLMs love fake news. The susceptibility was shockingly high across all models, with the medical misinformation accepted in 32% of the neutral base prompts.

  • That jumped to 46% when the misinformation was embedded in formal discharge notes, but at least the models were more skeptical of the social media content (9%).

Other findings were more counter-intuitive. Eight of the 10 logical fallacies ended up reducing the misinformation acceptance rate rather than increasing it like the authors expected.

  • Only appeals to authority (+2.9 percentage points above the base prompts) and slippery slope prompts (+2.2pp) increased susceptibility, a relatively small impact considering appeals to popularity slashed it by nearly 20pp.
  • Larger models were generally safer, although the language and phrasing had a far greater influence than the parameter count alone. 
  • It was also surprising to see that the medical models performed worse than the general purpose models, with many having weaker lie detectors despite the specialization.
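Percentage points and percent changes are easy to mix up, so here is a minimal sketch of the difference using the figures quoted above. It assumes the reported deltas apply to the 32% overall base rate, which is our reading of the summary rather than something confirmed against the paper, and the helper names are ours.

```python
BASE_ACCEPTANCE_PCT = 32.0  # acceptance rate on neutral base prompts, as quoted above

# Changes in percentage points (pp) relative to the base prompts, as quoted above.
deltas_pp = {
    "appeal to authority": 2.9,
    "slippery slope": 2.2,
}


def shifted_rate(base_pct: float, delta_pp: float) -> float:
    """Acceptance rate after adding a percentage-point change to the base rate."""
    return base_pct + delta_pp


def relative_change_pct(base_pct: float, delta_pp: float) -> float:
    """The same change expressed as a percent of the base rate."""
    return delta_pp / base_pct * 100


results = {
    name: (shifted_rate(BASE_ACCEPTANCE_PCT, d), relative_change_pct(BASE_ACCEPTANCE_PCT, d))
    for name, d in deltas_pp.items()
}

for name, (rate, rel) in results.items():
    print(f"{name}: {rate:.1f}% acceptance ({rel:+.1f}% relative to base)")
```

Under that assumption, a +2.9pp shift moves acceptance from 32% to roughly 34.9%, which is about a 9% relative increase, not a 2.9% one.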

Improving LLM safety is about more than making bigger models. It’s about knowing how information gets presented by actual humans, and having guardrails in place that hold up even when that information is wrong.

The Takeaway

Benchmark performance isn’t real-world performance, and this study provides another reminder that a model’s ability to separate fact from fiction is often more important than its test scores.

The Patient You Lost Before They Ever Walked In

Thousands of patients are referred for procedures but vanish into the void because no one called them back within 48 hours.

By Shani Fargun, VP Healthcare at StackAI
Sponsored by StackAI

While the headlines at major cardiology conferences focus on AI that can read angiograms or predict arrhythmias, a quieter, unsexy revolution is happening in the back office, and it might be the key to actually using those advanced clinical tools.

The biggest bottleneck in modern cardiology is administrative friction. It’s the death by 1,000 faxes that occurs when a patient is referred for a TAVR, but the pre-op workup is trapped in a PDF from an external hospital. It’s the prior authorization that sits in a queue for weeks because a specific keyword was missing from the submission.

  • According to the AMA, 94% of physicians report that these administrative hurdles lead to delays in accessing necessary care.

Healthcare has a data problem. The industry runs on unstructured data. Referral letters, handwritten call notes, faxed labs, and denial letters make up the bulk of cardiac operations.

  • Nearly 80% of all healthcare data is unstructured and inaccessible to traditional automation. This forces highly trained clinical staff to spend hours acting as data entry clerks rather than treating patients.

Agentic AI is the solution. Agentic AI isn’t a chatbot or a diagnostic model; it’s a digital worker.

  • Unlike traditional software that waits for a human to input data, agentic AI can autonomously perform tasks across different systems.

How can agentic workflows change modern practices?

  • Patient Scheduling & Follow-Up – Agents autonomously handle the last mile of care coordination, reaching out to patients to schedule diagnostic testing, confirming procedure dates, and answering routine logistical questions without burdening clinical staff. This directly combats referral leakage, which costs health systems an estimated $971,000 per physician annually.
  • Automated Prior Auth – Agents cross-reference patient charts against payer-specific guidelines to draft authorization requests that minimize technical denials. Download the free whitepaper of use cases for healthcare here.
  • Referral Velocity – Agents ingest incoming faxes and emails, extract clinical criteria, and draft the patient chart for review, reducing time-to-appointment from weeks to days.

The Takeaway

The future of healthcare starts with better flows. By automating the administrative burden, we allow interventionalists to focus on what they do best: treating patients.

Request a demo to see customized use cases for your organization here.

Epic Shakes Up Scribe Market With AI Charting

The wait is over. Epic’s scribe has arrived, and it’s packing a lot more than ambient notes.

“AI Charting” goes beyond transcriptions. The fully built-in feature not only listens during patient visits and drafts notes, it also queues up orders based on the conversation.

  • The initial release allows clinicians to personalize the note structure using voice commands (e.g., asking to format the history of present illness as a bulleted list).
  • Epic is positioning AI Charting as the killer app for its Art clinical copilot, which also has a pre-visit Insights tool that’s apparently already being used 16M times per month.

Distribution is king. Over 40% of U.S. hospitals are on Epic, and an AJMC study from just last week showed that two-thirds of those hospitals have already adopted ambient AI.

  • AI Charting is breaking onto the scene through one of healthcare’s biggest distribution channels, and Epic has a ton of levers it can pull with pricing and bundling to start stealing share (DAX Copilot, Abridge, and ThinkAndor accounted for ~80% of Epic hospitals in the recent study).
  • Rather than charging a per-user-per-month fee like most ambient AI platforms, STAT reports that Epic plans to have a separate license for AI Charting, with the price varying by org size and utilization to get the tool in as many hands as possible.

It’s time to differentiate. The race is on for established players to prove they can deliver value that Epic’s integrated approach can’t match.

  • That means tackling problems that are too messy for Epic to touch (Abridge bringing real-time prior auths to the point of conversation), or too specialized for it to get right with so many other plates spinning (Nabla raising the bar for AI safety with world models).
  • Epic is working closely with Microsoft to get new features online quickly, but nailing multiple specialties in countless languages could still prove to be a job that’s better suited for a company with a dedicated focus.
  • Epic might own the “operating system” almost as much as Microsoft owns Windows, but just because MS Paint exists doesn’t mean the world doesn’t need Adobe Photoshop.

The Takeaway

Ambient scribes proved how fast health systems would layer on their own AI if Epic couldn’t keep up, and we’ll now have to wait and see if the cost and experience of Epic’s scribe is enough to compete with the flock of ambient AI innovators dedicated to this problem.

Bessemer Venture Partners State of Health AI

Bessemer Venture Partners’ always-stellar State of Health AI report did a great job explaining why we (probably) aren’t in a bubble even though the health AI rocket has hit escape velocity.

AI is more than hype. BVP points to signals from the private markets to make its case. 

M&A activity is surging. Global health tech M&A reached 400 deals in 2025 (up from 350 in 2024), but the strategic rationale matters more than the volume. Healthcare orgs and investors recognize that AI simultaneously drives revenue growth and margin improvement.

  • Prime example: the Smarter Technologies roll up was designed to leverage Thoughtful and SmarterDx’s growth engine and clinical AI platform to drive margin expansion across the Access Healthcare RCM services conglomerate.

VC funding is nearly back to pandemic levels. BVP counted 527 venture deals in 2025 (~$14B total), with the average round size climbing 42% to $29M.

  • AI startups captured 55% of that, up from 37% in 2024. Even more importantly, for every $1 invested in AI companies overall, $0.22 was deployed to healthcare AI startups, outpacing the fair share of 18% of GDP that healthcare spending represents in the U.S.

The question now is, are we in a bubble? BVP has a nuanced answer for why health AI is in a better spot than tech was during the dot-com bubble.

  • First, AI’s technological shift has spurred the invention of new business models, with the emergence of “AI-services-as-software” companies delivering service-level outcomes (human-quality work) with software-level margins (70%+ gross margins).
  • Second, buyers are now pulling instead of being pushed. While EHRs took 15 years to scale, AI scribes have pulled it off in three. Demonstrable ROI and ease of implementation were key here.

Health AI has an X Factor. New health AI “supernova” startups are bending traditional growth curves entirely. BVP attributes these supernovas’ unprecedented growth to four X Factors.

  • Continuous hyper-growth velocity (not just growth projections)
  • Revenue durability through defensibility
  • Productivity gains that translate to better margins and full-time employee metrics at scale
  • Point solution to platform expansion

Maybe sane valuations, maybe VC mental gymnastics. BVP argues that a supernova with $30M ARR and a $1B valuation isn’t overvalued; it has fundamentally different growth dynamics.

  • When you’re growing 6x instead of 2x, you reach $100M ARR in 18 months instead of 36+ months. That compression in time-to-scale commands a premium, and BVP says a 7x revenue multiple for supernovas is justified versus 2-3x for a strong SaaS company.

The Takeaway

Health AI is going supernova, and the explosion might actually be big enough to let the leaders grow into their astronomical valuations.

AI Spots Early Cognitive Decline in Clinical Notes

Early disease detection is entering the AI era, and a new study in npj Digital Medicine shows that autonomous agents can now flag cognitive decline using nothing but clinical notes.

Cognitive decline is difficult to detect. It remains significantly underdiagnosed in routine care, and traditional screening usually requires a dedicated clinician and tests that can take hours. 

  • At the same time, early detection is becoming increasingly important, especially with the recent approval of Alzheimer’s therapies that are most effective when administered early. 

Mass General Brigham might have an answer. Clinical notes contain whispers of cognitive decline that busy clinicians can’t always hear. MGB built a system that listens at scale.

  • These whispers include everything from linguistic shifts and sentence pauses to disorganized narratives and family member concerns. 
  • MGB developed an AI system that scans for these signals in routine clinical documentation, leveraging five specialized agents that critique each other and refine their reasoning.

It worked like a charm. The MGB researchers set their agents loose on over 3,300 clinical notes from 200 anonymized patients, then had human reviewers take their own look.

  • The agents detected cognitive impairment with 91% sensitivity, nearly matching expert-level accuracy – without any human intervention needed after deployment.
  • When the AI and human reviewers disagreed, an independent expert validated the AI’s reasoning 58% of the time – meaning the system was often making sound clinical judgments that initial human review had missed.
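For anyone rusty on the metric, sensitivity is the share of truly impaired patients that the system flags. The sketch below shows the calculation with made-up counts chosen only to reproduce a 91% figure; the study's actual confusion matrix is not given in this summary, so the counts are purely hypothetical.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Sensitivity (recall) = TP / (TP + FN): the share of true cases that get flagged."""
    total_cases = true_positives + false_negatives
    if total_cases == 0:
        raise ValueError("sensitivity is undefined when there are no positive cases")
    return true_positives / total_cases


# Hypothetical counts for illustration only (not taken from the study).
flagged_cases = 91   # impaired patients the system caught
missed_cases = 9     # impaired patients the system missed

example = sensitivity(flagged_cases, missed_cases)
print(f"Sensitivity: {example:.0%}")
```

Note that sensitivity says nothing about false alarms, so it is usually read alongside specificity when judging a screening tool.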

The cherry on top? The MGB team open-sourced Pythia alongside the study, enabling any provider org to deploy autonomous prompt optimization for their own AI screening applications.

The Takeaway

LLMs have opened the door to proactive screening at scale, and MGB just provided an excellent proof of concept using AI agents that turn everyday documentation into a chance to catch cognitive decline during the optimal treatment window.
