OpenAI Dives Into Healthcare With HealthBench

OpenAI is officially setting its sights on healthcare with the launch of HealthBench, a new benchmark for evaluating AI performance in realistic medical scenarios.

HealthBench marks the first time the ChatGPT developer has taken a direct step into the industry without a partner to hold its hand.

  • Developed with 262 physicians from 60 countries, HealthBench includes 5,000 simulated health conversations, each with a custom rubric to grade the responses (one is sketched below).
  • The conversations “were created to be realistic and similar to real-world use of LLMs,” meaning they’re multi-turn and multilingual, while spanning a range of medical specialties and themes like handling uncertainty or global health.

Here’s how current frontier models stacked up in the HealthBench test.

  • OpenAI’s o3 was the best performing model with a score of 60%
  • xAI’s Grok 3 ranked second with a score of 54%
  • Google’s Gemini 2.5 Pro followed close behind at 52%

All three leading models outperformed physicians who weren’t equipped with AI, although physicians outperformed the newer models when they had access to the AI output.

  • The paper also reviewed other LLMs like Llama and Claude, but unsurprisingly none of them scored higher than OpenAI’s model on OpenAI’s own test.

Even the best models came up short in a few common places, AKA areas that developers should focus on to improve performance.

  • Current AI models would rather hallucinate than withhold an answer they aren’t confident in, obviously not a good trait to bring into a clinical setting.
  • None of the leading LLMs were great at asking for additional context or more information when the input was vague.
  • When AI misses, it misses badly, as seen in the sharp quality drop-off in the worst 10% of responses.

The Takeaway

Outside of giving us yet another datapoint that AI is catching up to human physicians, HealthBench provides one of the best standardized ways to compare model performance in (simulated) clinical practice, and that’s just what the innovation doctor ordered.

More Reasoning, More Hallucinations for LLMs

Better reasoning apparently doesn’t prevent LLMs from spewing out false facts.  

Independent testing from AI firm Vectara showed that the latest advanced reasoning models from OpenAI and DeepSeek hallucinate even more than previous models.

  • OpenAI’s o3 reasoning model scored a 6.8% hallucination rate on Vectara’s test, which asks the AI to summarize various news articles.
  • DeepSeek’s R1 fared even worse with a 14.3% hallucination rate, an especially poor performance considering that its older non-reasoning DeepSeek-V2.5 model clocked in at 2.4%.
  • On OpenAI’s more difficult SimpleQA test, o3 and o4-mini hallucinated between 51% and 79% of the time, versus just 37% for its non-reasoning GPT-4.5 model.

OpenAI positions o3 as its most powerful model because it’s a “reasoning” model that takes more time to “think” and work out its answers step-by-step.

  • This process produces better answers for many use cases, but these reasoning models can also hallucinate at each step of their “thinking,” giving them even more chances for incorrect responses (a compounding effect sketched below).
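A toy model makes the intuition concrete: if each step of a chain independently has even a small chance of introducing an error, the probability that at least one step goes wrong climbs quickly with chain length (the 3% per-step rate below is purely illustrative, not a measured figure):

```python
# Toy model: chance a reasoning chain contains at least one bad step,
# assuming each step errs independently at rate p (3% is illustrative).
def chain_error_prob(p: float, n_steps: int) -> float:
    return 1 - (1 - p) ** n_steps

for n in (1, 5, 10, 20):
    print(f"{n:>2} steps -> {chain_error_prob(0.03, n):.1%} chance of at least one error")
# 1 -> 3.0%, 5 -> 14.1%, 10 -> 26.3%, 20 -> 45.6%
```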

The Takeaway

Even though the general purpose models studied weren’t fine-tuned for healthcare, the results raise concerns about their safety in clinical settings – especially given how many physicians report using them in day-to-day practice.


AI Can Help Doctors Change Their Minds

A recent study out of Stanford explored whether doctors would revise their medical decisions in light of new AI-generated information, finding that docs are more than willing to change their minds despite being just as vulnerable to cognitive biases as the rest of us.

Here’s the setup, as published in Nature Communications Medicine:

  • 50 physicians were randomized to watch a short video of either a white male or black female patient describing their chest pain with an identical script.
  • The physicians made triage, diagnosis, and treatment decisions using any non-AI resource.
  • The physicians were then given access to GPT-4 (which they were told was an AI system that had not yet been validated) and allowed to change their decisions.

The initial scores left plenty of room for improvement.

  • The docs achieved just 47% accuracy in the white male patient group.
  • The docs achieved a slightly better 63% accuracy in the black female patient group.

The physicians were surprisingly willing to change their minds based on the AI advice.

  • Accuracy scores with AI improved from 47% to 65% in the white male group.
  • Accuracy scores with AI improved from 63% to 80% in the black female group.

Not only were the physicians open to modifying their decisions with AI input, but doing so made them more accurate without introducing or exacerbating demographic biases.

  • Both groups showed nearly identical improvements (18 and 17 percentage points), suggesting that AI can augment physician decision-making while maintaining equitable care.
  • It’s worth noting that the docs used AI as more than a search engine, asking it to bring in new evidence, compare treatments, and even challenge their own beliefs.

The Takeaway

Although having the doctors go first means that AI didn’t save them any time in this study – and actually increased time per patient – it showed that flipping the paradigm from “doctors checking AI’s work” to “AI helping doctors check their own work” has the potential to improve clinical decisions without amplifying biases.

The Healthcare AI Adoption Index

Bessemer Venture Partners’ market reports are always some of the best in the business, but its recent Healthcare AI Adoption Index might just be its finest work yet.

The Healthcare AI Adoption Index is based on survey data from 400+ execs across Payors, Providers, and Pharma – breaking down how buyers are approaching GenAI applications, what jobs-to-be-done they’re prioritizing, and where their projects sit on the adoption curve.

Here’s a look at what they found:

  • AI is high on the agenda across the board, with AI budgets outpacing IT spend in each of the three segments. Over half (54%) are seeing ROI within the first 12 months.
  • Only a third of AI pilots end up reaching production, held back by everything from security and data readiness to integration costs and limited in-house expertise.
  • Despite all the trendsetters we cover on a weekly basis, only 15% of active AI projects are being driven by startups. The rest are being built internally or led by the usual suspects like major EHRs and Big Tech.
  • That said, 48% of executives say they prefer working with startups over incumbents, and Bessemer encourages founders to co-develop solutions with their customers and lean in on partnerships that provide access to distribution, proprietary datasets, and credibility.

The highlight of the report was Bessemer’s analysis of the 59 jobs-to-be-done as potential use cases for AI. 

  • Of the 22 jobs-to-be-done for Payors (claims, network, member, pricing), 19 jobs for Pharma (preclinical, clinical, marketing, sales), and 18 jobs for Providers (care delivery, RCM) – 45% are still in the ideation or proof of concept phase.
  • Providers are ahead in POC experimentation, while most Payor and Pharma use cases remain in the ideation phase. Here’s a beautiful look at where different use cases stand.

Bessemer topped off its analysis with the debut of its AI Dx Index, which factors in market size, urgency, and current adoption to help startups map and prioritize AI use cases. One of the best graphics so far this year.

The Takeaway

Healthcare’s AI-powered paradigm shift is kicking into overdrive, and Bessemer just delivered one of the most comprehensive views of where the puck is going that we’ve seen to date.

K Health’s AI Clinical Recommendations Rival Doctors in Real-World Setting

Real-world comparisons of AI recommendations and doctors’ clinical decisions have been few and far between, but a new study in the Annals of Internal Medicine gave us a great look at how performance stacks up with actual patients.

The early verdict? AI came out on top, but that doesn’t mean doctors should pack their bags quite yet.

Researchers from Cedars-Sinai and Tel Aviv University compared recommendations made by K Health’s AI Physician Mode to the final decisions made by physicians for 461 virtual urgent care visits. Here’s what they found:

  • In 68% of cases, the AI and physician recommendations were rated as equal
  • AI was rated better in 21% of cases, versus just 11% for physicians
  • AI recommendations were rated “optimal” in 77% of cases, versus 67% for physicians

Although AI takes the cake with the top line numbers, unpacking the data reveals some not-too-surprising strengths and weaknesses. AI was primarily rated better when physicians:

  • Missed important lab tests (22.8%)
  • Didn’t follow clinical guidelines (16.3%)
  • Failed to refer patients to specialists or the ED if needed (15.2%)
  • Overlooked risk factors and red flags (4.4%)

Physicians beat out AI when the human elements of care delivery came into play, such as adapting to new information or making nuanced decisions. Physicians were rated better when:

  • AI made unnecessary ED referrals (8.0%)
  • There was evolving or inconsistent information during consultations (6.2%)
  • They made necessary referrals that the AI missed (5.9%)
  • They correctly adjusted diagnoses based on visual examinations (4.4%)

While the study focused on the exact types of common conditions that AI excels at diagnosing (respiratory, urinary, vaginal, eye, and dental), it’s still impressive to see the outperformance in the messy trenches of a real clinical setting – a far cry from the static medical exams that have been the go-to for similar evaluations. 

The Takeaway

For AI to truly transform healthcare, it’ll need to do a lot more than automate administrative work and back office operations. This study demonstrates AI’s potential to enhance decision-making in actual medical practice, and points toward a future where delivering high-quality patient care becomes genuinely scalable.

PHTI Delivers Mixed Reviews on Ambient Scribes

The Peterson Health Technology Institute’s latest technology review is here, and it had a decidedly mixed report card for the ambient AI scribes sweeping across the industry. 

PHTI’s total count of ambient scribe vendors stands at over 60, but the bulk of its report focuses on the early experiences and lessons learned from the top 10 scribes across leading health systems.

According to PHTI’s conversations with health system execs, the primary driver of ambient scribe adoption has been addressing clinician burnout – and AI’s promise is clear on that front.

  • Mass General Brigham reported a 40% reduction in burnout during a six-week pilot.
  • MultiCare reported a 63% reduction in burnout and a 64% improvement in work-life balance.
  • Another study from the Permanente Medical Group found that 81% of patients felt their physician spent less time looking at their computer when using an ambient scribe.

Despite these drastic improvements, PHTI concludes that the financial returns and efficiency of ambient scribes remain unclear.

  • On one hand, enhanced documentation quality “could lead to higher reimbursements, potentially offsetting expenses.”
  • On the other hand, the cumulative costs “may be greater than any savings achieved through improved efficiency, reduced administrative burden, or reduced clinician attrition.”

It’s a bold conclusion considering the cost of losing a single provider, let alone the downstream effects of a burned-out workforce.

PHTI’s advice to health systems? Define the outcomes you’re looking for and then measure ambient AI’s performance and financial impacts against those goals. Bit of a no-brainer, but sound advice nonetheless. 

The Takeaway

Ambient scribes are seeing the fastest adoption of any recent healthcare technology that wasn’t accompanied by a regulatory mandate, and that’s mostly because of magic that’s hard to capture in a spreadsheet. That said, health systems will eventually need to justify these solutions beyond their impact on the clinical experience, and PHTI’s report brings a solid framework and standardized methodologies for bridging that gap.

AI Misses the Mark on Detecting Critical Conditions

Most health systems have already begun turning to AI to predict whether patient health conditions will deteriorate, but a new study in Nature Communications Medicine suggests that current models aren’t cut out for the task.

Virginia Tech researchers looked at several popular machine learning models cited in medical literature for predicting patient deterioration, then fed them datasets about the health of patients in ICUs or with cancer.

  • They then created synthesized test cases that altered patient metrics from the initial dataset, checking whether the models’ predicted health issues and risk scores responded the way medical knowledge says they should (see the sketch below).
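The method resembles what software engineers would call perturbation testing: nudge an input toward a clinically dangerous value and verify that the model’s output moves in the right direction. Here’s a bare-bones sketch of the idea, using a synthetic stand-in model and hypothetical features rather than the study’s actual setup:

```python
# Bare-bones perturbation test for a deterioration model. The synthetic
# data, feature names, and fitted model are hypothetical stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))  # columns: heart_rate_z, spo2_z, lactate_z
y = ((X @ np.array([0.8, -1.2, 1.5]) + rng.normal(size=500)) > 0).astype(int)
model = LogisticRegression().fit(X, y)

def risk(patient: np.ndarray) -> float:
    return model.predict_proba(patient.reshape(1, -1))[0, 1]

baseline = np.zeros(3)          # unremarkable vitals
worsened = baseline.copy()
worsened[1] -= 3.0              # oxygen saturation drops sharply

# Medical knowledge says risk must rise; a model that fails this check
# can't be trusted to flag a deteriorating patient.
assert risk(worsened) > risk(baseline), "model missed an obvious deterioration"
print(f"risk: {risk(baseline):.2f} -> {risk(worsened):.2f}")
```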

AI missed the mark. For in-hospital mortality prediction, the models tested using the synthesized cases failed to recognize a staggering 66% of relevant patient injuries.

  • In some instances, the models failed to generate adequate mortality risk scores for every single test case.
  • That’s clearly not great news, especially considering that algorithms that can’t recognize critical patient conditions can’t alert doctors when urgent action is needed.

The study authors point out that it’s extremely important for technology being used in patient care decisions to incorporate medical knowledge, and that “purely data-driven training alone is not sufficient.”

  • Not only did the study unearth “alarming deficiencies” in models being used for in-hospital mortality predictions, but it also turned up similar concerns with models predicting the prognosis of breast and lung cancer over five-year periods.
  • The authors conclude that a significant gap exists between raw data and the complexities of medical reality, so models trained solely on patient data are “grossly insufficient and have many dangerous blind spots.”

The Takeaway

The promise of AI remains just as immense as ever, but studies like this provide constant reminders that we need a diligent approach to adoption – not just for the technology itself but for the lives of the patients it touches. Ensuring that medical knowledge gets incorporated into clinical AI models also seems like a theme that we’re about to start hearing more often.

Stress Testing Ambient AI Scribes

Providers are lining up to see if ambient AI can live up to its promise of decreasing burnout while improving the patient experience… and researchers are starting to wonder the same thing.

A new study in JAMA Network Open investigated whether ambient AI scribes actually decrease clinical note burden, following 46 clinicians at the University of Pennsylvania Health System as they used Nuance’s DAX Copilot AI ambient scribe from July to August 2024.

  • Researchers combined EHR data with a clinician survey to determine both quantitatively and qualitatively whether ambient scribes actually make a positive impact.

Here’s what they found. Over the course of the study, ambient scribe use was associated with:

  • 20.4% less time in notes per appointment (from 10.3 to 8.2 minutes)
  • 9.3% greater same-day appointment closure (from 66.2% to 72.4%)
  • 30.0% less after-hours work time per workday (from 50.6 to 35.4 minutes)
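Worth noting that these are relative changes rather than percentage-point differences, which you can reproduce from the raw figures (the same-day closure gain works out to roughly 9.4% from the rounded inputs):

```python
# The reported percentages are relative changes, reproducible from the raw figures.
def rel_change(before: float, after: float) -> float:
    return (after - before) / before

print(f"time in notes:    {rel_change(10.3, 8.2):+.1%}")   # -20.4%
print(f"same-day closure: {rel_change(66.2, 72.4):+.1%}")  # +9.4% on rounded inputs
print(f"after-hours work: {rel_change(50.6, 35.4):+.1%}")  # -30.0%
```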

It’s tough to argue with the data. Ambient scribing definitely moves the needle on several important metrics, and even the less clear-cut stats still had a positive spin to them.

  • Note length was 20.6% greater with scribing (from 203k to 244k characters/wk)
  • However, the percentage of documentation that was typed by clinicians was 29.6% lower compared to baseline (from 11.2% to 7.9%)

The qualitative feedback told a different story. Even though clinicians reported feeling more engaged during patient conversations, “the need for substantial editing and proofreading of the AI-generated notes, which sometimes offset the time saved” was a recurring theme in the open-ended comments.

Ambient AI received a net promoter score of 0 on a scale of -100 to 100, meaning the clinicians were exactly as likely to recommend it as not.

  • 13 clinicians would recommend ambient AI to others, 13 wouldn’t recommend it, and 11 didn’t feel strongly either way.
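For reference, NPS is simply the share of promoters minus the share of detractors (passives only pad the denominator), which is how a 13–13–11 split nets out to exactly zero:

```python
# Net promoter score: % promoters minus % detractors, on a -100..100 scale.
def nps(promoters: int, passives: int, detractors: int) -> float:
    total = promoters + passives + detractors
    return 100 * (promoters - detractors) / total

print(nps(13, 11, 13))  # 0.0
```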

The mixed reviews could mean that the ambient scribe performed better or worse for different users, but it could also mean that some clinicians were more diligent about checking the output.

The Takeaway

The evidence in favor of ambient AI scribes continues to pile up – even if the pajama-time reductions in this study didn’t live up to the promise on the box. Big technology shifts also come with adjustment periods, and this invited commentary did a great job highlighting the “real risk of automation bias” that comes with ambient AI, as well as the liability risk of missing its errors.

AI Enthusiasm Heats Up With Doctors

The unstoppable march of AI only seems to be gaining momentum, with an American Medical Association survey noting greater enthusiasm – and less apprehension – among physicians. 

The AMA’s Augmented Intelligence Research survey of 1,183 physicians found that the share whose enthusiasm for health AI outweighs their concerns rose to 35% in 2024, up from 30% in 2023.

  • The lion’s share of doctors recognize AI’s benefits, with 68% reporting at least some advantage in patient care (up from 63% in 2023).
  • In both years, about 40% of doctors were equally excited and concerned about health AI, with almost no change between surveys.

The positive sentiment could stem from more physicians using the tech in practice, with AI use nearly doubling from 38% in 2023 to 66% in 2024.

  • The most common uses now include medical research, clinical documentation, and drafting care plans or discharge summaries.

The dramatic drop in non-users (62% to 33%) over the course of a year is impressive for any new health tech, but doctors in the latest survey called out several needs that have to be addressed for adoption to continue.

  • 88% wanted a designated feedback channel
  • 87% wanted data privacy assurances
  • 84% wanted EHR integration

While physicians are still concerned about AI’s potential to compromise data privacy or offer incorrect recommendations (and the liability risks that come with them), they’re also optimistic about its ability to put a dent in burnout.

  • The biggest area of opportunity for AI according to 57% of physicians was “addressing administrative burden through automation,” reclaiming the top spot it reached in 2023.
  • That said, nearly half of physicians (47%) ranked increased AI oversight as the number one regulatory action needed to increase trust in AI enough to drive further adoption.

The Takeaway

It’s encouraging to see the shifting sentiment around health AI, especially as more doctors embrace its potential to cut down on burnout. Although the survey pinpoints better oversight as the key to maximizing trust, AI innovation is moving so quickly that it wouldn’t be surprising if not-too-distant breakthroughs were magical enough to inspire more confidence on their own.

First Snapshot of AI Oversight at U.S. Hospitals

A beautiful paper in Health Affairs brought us the first snapshot of AI oversight at U.S. hospitals, as well as a glimpse of the blind spots that are already adding up.

Data from 2,425 hospitals that participated in the 2023 AHA Annual Survey shed light on the differences in AI adoption and evaluation capacity at hospitals on both sides of a growing divide.

Two-thirds of hospitals reported using AI predictive models, a figure that’s likely only gone up over the last year. These models were most commonly used to:

  • predict inpatient health trajectories (92%)
  • identify high-risk outpatients (79%)
  • facilitate scheduling (51%)
  • perform a long tail of various administrative tasks

Bias blindness ran rampant. Although 61% of the AI-user hospitals evaluated accuracy using data from their own system (local evaluation), only 44% performed similar evaluations for bias.

  • Those are some concerningly low percentages, considering that models trained on external datasets might not be effective in different settings, and since AI bias is a surefire way to exacerbate health inequities.
  • Hospitals that developed their own models, had high operating margins, and belonged to a health system were all more likely to conduct local evaluations. 
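A local bias evaluation doesn’t have to be exotic; at its simplest, it means re-checking a model’s discrimination on your own patients, broken out by subgroup. Here’s a bare-bones sketch on synthetic data (the groups, scores, and outcomes are all hypothetical):

```python
# Bare-bones local bias check: compare the model's AUC across subgroups
# using the hospital's own data. Everything here is synthetic/hypothetical.
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "group": rng.choice(["A", "B"], size=400),
    "outcome": rng.integers(0, 2, size=400),
})
# Simulate a model whose scores track outcomes well for group A, poorly for B.
noise = np.where(df["group"] == "A", 0.2, 1.5)
df["risk_score"] = df["outcome"] + rng.normal(0, noise)

for name, g in df.groupby("group"):
    print(name, f"AUC = {roc_auc_score(g['outcome'], g['risk_score']):.2f}")
# A large gap between subgroups is exactly the kind of red flag that
# only shows up when a hospital evaluates on its own population.
```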

There’s a digital divide between hospitals with the resources to build models tailored to their own patients and those getting these solutions “off the shelf,” which increases the risk that the models were trained on data from patients who look very different from their own.

  • Only 54% of the AI hospitals designed their own models, while a larger share took the path of least resistance with algorithms supplied by their EHR developer (79%).
  • Combine that with the fact that most hospitals aren’t conducting local evaluations of bias, and there’s a major lack of systematic protection preventing these models from underrepresenting certain patients or adding unfair barriers to care.

The authors conclude that policymakers should “ensure the use of accurate and unbiased AI for patients regardless of where they receive care… including interventions designed to connect underresourced hospitals to evaluative capacity.”

The Takeaway

Without the local evaluation of AI models, there’s a glaring blind spot in the oversight of algorithmic bias, and this study gives compelling evidence that more needs to be done to fill that void.
