AI Learns the Natural History of Human Disease

Clinical decision-making relies on understanding patients’ past health to improve their future health, an impossible task without first understanding how diseases progress over time.

That’s where a new study in Nature suggests AI is ready to help.

It starts with generative pretrained transformers. Researchers built a GPT, dubbed Delphi-2M, to predict the “progression and competing nature of human diseases.” 

  • Delphi-2M was trained on 400k UK Biobank participants (a cohort that skews healthier than the general population), and then externally validated on 1.9M Danish patients.
  • The training was designed to predict a patient’s next diagnosis and the time to it, using only data readily available within the EHR: past medical history, age, sex, BMI, and alcohol/smoking status.
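
To make that setup concrete, here's a minimal Python sketch of how a patient's record can be framed as a token sequence for a GPT-style model to consume. The event names, vocabulary, and encoding here are entirely illustrative, not the paper's actual schema.

```python
# Hypothetical sketch of the data framing behind a model like Delphi-2M:
# the medical history becomes an ordered sequence of (age, event) pairs.
# Event names and vocabulary are illustrative, not the paper's schema.

history = [
    (34.2, "SEX_FEMALE"),
    (34.2, "BMI_OVERWEIGHT"),
    (41.7, "DX_TYPE2_DIABETES"),
    (48.1, "DX_DIABETIC_RETINOPATHY"),
]

# Build a toy vocabulary over the events we might see.
vocab = {tok: i for i, tok in enumerate(
    sorted({event for _, event in history} | {"DX_NEUROPATHY"})
)}

def encode(history):
    """Turn (age, event) pairs into parallel token-id and age sequences.

    A transformer trained on sequences like these learns to predict both
    the next token (the next diagnosis) and the time gap until it occurs.
    """
    tokens = [vocab[event] for _, event in history]
    ages = [age for age, _ in history]
    return tokens, ages

tokens, ages = encode(history)
```

Comorbidity chains like diabetes leading to eye disease leading to nerve damage fall out of next-token prediction over exactly this kind of sequence.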

How’d it do? The results speak for themselves:

  • Delphi-2M was able to forecast the incidence of over 1,000 diseases with comparable accuracy to existing models that are fine-tuned to predict single diseases.
  • Death could be predicted with eerily impressive accuracy (AUC: 0.97), and the survival curves that it simulated lined up almost perfectly with national mortality statistics.
  • Comorbidities emerged naturally from the training, and Delphi-2M was able to understand the progression from type 2 diabetes to eye disease to nerve damage.
  • Delphi-2M’s ability to predict heart attack and stroke matched established scores like QRisk, and it even outperformed leading biomarker-based AI models.

Better forecasts inform better policies. If policymakers can consult the Oracle of Delphi to see how many people will develop a disease over the next decade, the authors conclude that they’ll also be able to implement better regulations to prepare. 

  • Not a bad theory, assuming models trained on historical data can make forecasts that hold up to evolving treatments and populations (and that politicians act in the best interest of the people).

The Takeaway

AI is reaching the point where it can predict thousands of diseases as well as the best narrowly focused models, and that could have big implications for everything from early screening to policymaking.

Wolters Kluwer Jumps in the GenAI Ring With UpToDate Expert AI

Right when you think Wolters Kluwer might just let everyone else have all the AI fun, it debuted UpToDate Expert AI to give the world’s most widely used clinical decision support tool a much-needed AI overhaul.

Wolters Kluwer took its time with the launch. The incumbent CDS juggernaut is used by 3M doctors worldwide, so it had plenty of users to disappoint with a hasty rollout.

  • That said, nimble competition has been gaining ground about as fast as it takes to download OpenEvidence from the App Store.
  • The good news is that WK made the most of the extra development time.

Here’s what sets UpToDate Expert AI apart. Unlike general-purpose chatbots, the AI-enhanced version of UpToDate is built exclusively on WK’s peer-reviewed content library.

  • It draws on 30+ years of evidence-based research authored by 7,600 experts, rather than the open web or selective journals.
  • That allows it to quickly answer complex clinical questions, while surfacing all of its sources, assumptions, and step-by-step reasoning directly in the response. Probably safe to assume that also helps with hallucinations.
  • Those answers still manage to be easy to scan at the bedside and will look extremely familiar to any doctor that’s ever read an UpToDate article (or one that’s been reading them for a decade).

The extra time in the oven means that more features are baked in. Wolters Kluwer knows its audience, and UpToDate Expert AI’s biggest leg up on the competition is its fine-tuning for health systems.

  • Enterprise-grade governance, compliance, and workflow integration are all standard out-of-the-box, giving UpToDate Expert AI an advantage for a system-wide implementation over OpenEvidence or Doximity.

The Takeaway

It turns out that the 800-pound clinical support gorilla wasn’t going to let the newcomers eat its lunch forever, and UpToDate Expert AI gives health systems plenty of reasons to keep rolling with Wolters Kluwer.

Co-Creating Confidence: Inside Amigo’s Approach to Building Trustworthy AI Agents

AI moves fast, but trust moves slow. That’s why Digital Health Wire is launching a new series to spotlight the companies taking AI from promise to practice.

First up: Amigo.

No matter how many medical licensing exams and curated case vignettes the latest models conquer, they’ll still need to make it through the proving ground of real clinical practice to get doctors on board.

The biggest challenge for AI in healthcare isn’t building agents that can handle a task, it’s building agents that clinicians can trust to handle those tasks safely – every time, guaranteed.

There’s a massive gap between textbook performance and real-world reliability, and Amigo is giving providers the infrastructure to bridge that gap.

Earning trust takes more than technology. Amigo’s process is just as important as its platform for enabling healthcare orgs to safely design, test, and monitor agents that they can genuinely depend on for their unique clinical and administrative workflows.

Amigo’s approach to building trust stands on four core pillars:

  • Controllability – Clinical teams can define and adjust agent behavior.
  • Performance Validation – High-fidelity patient simulations stress-test readiness.
  • Real-time Observability – There’s full transparency into decision-making.
  • Continuous Alignment – Agents adapt to changing protocols and priorities.

“Good enough” isn’t enough in healthcare. Most industries can get away with using the 80/20 rule to fine-tune their products. If they can improve the experience for 80% of their users, it justifies any shortcomings for the other 20%. Traditional benchmarks might work for customer service, but not when that 20% includes life or death situations.

  • When AI developers chase benchmark scores but ignore outcomes, they miss the actual point of care delivery: making patients healthier. A perfect medical licensing exam is great, but it’s not the same thing as a perfect clinician – or a trustworthy AI agent.
  • Strong benchmark scores can also lure providers into a false sense of security, and it’s tough to notice when performance starts to drift if nobody is on the lookout.

Drift is inevitable, and the current is strong. Even if an AI agent works on day one, there will always be a tendency for performance to slip over time. Clinical guidelines change. New drugs enter the market. Populations evolve. 

Amigo safeguards against this drift with a three-layer framework:

  • The Problem Model – Customers define their specific needs and the “operable neighborhood,” which is basically the set of scenarios that the agent can help with.
  • The Judge – Customers establish their own success criteria, as well as the verification measures to keep track of them. That includes both safety metrics like accuracy and handoff reliability, plus experience metrics like empathy and response time.
  • The Agent – Amigo spins up an agent that can safely tackle the problem at hand, then continuously monitors it against the “success scorecard” to minimize drift and intervene well before it impacts patient care.
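
As an illustration only, with hypothetical metric names, thresholds, and function names rather than Amigo's actual API, a Judge-style scorecard check might look like:

```python
# Illustrative only: a customer-defined scorecard sets floors for each
# metric, and continuous monitoring flags anything that drifts below them.
# Names and numbers here are invented, not Amigo's actual API.

scorecard = {
    "clinical_accuracy": 0.98,    # safety metric
    "handoff_reliability": 0.99,  # safety metric
    "empathy_score": 0.90,        # experience metric
}

def check_drift(measured, scorecard):
    """Return every metric that has drifted below its agreed-upon floor."""
    return [m for m, floor in scorecard.items() if measured.get(m, 0.0) < floor]

# A week of monitoring shows the experience metric slipping:
measured = {"clinical_accuracy": 0.985, "handoff_reliability": 0.995, "empathy_score": 0.87}
flagged = check_drift(measured, scorecard)  # ["empathy_score"] -> intervene
```

The point of the pattern is that the intervention threshold belongs to the customer, not the vendor, so drift gets caught before it ever reaches a patient.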

How can performance be guaranteed? Simulating success ahead of time. Amigo swaps generic benchmarks for millions of simulated patient conversations to make sure each of its agents is 100% operationally ready before it's actually deployed.

  • The simulations reflect the real-world scenarios and demographics of each customer’s unique patient population. The goal is to stress-test the agents to their breaking point in a controlled environment, then refine them until they perform reliably under pressure.
  • Amigo intentionally oversamples rare scenarios – like patients with unusual drug interactions – to ensure edge cases don’t slip through. This not only helps keep the agents consistent at scale but also means that they frequently perform better in real practice.
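
A hedged sketch of what that oversampling could look like in practice (scenario names, frequencies, and the floor value are invented for illustration):

```python
# A hedged sketch of oversampling: boost rare scenarios above a floor so
# edge cases show up far more often in testing than they do in the wild.
# Scenario names and frequencies are invented for illustration.

scenarios = {
    "routine_refill": 0.80,      # real-world frequency
    "common_interaction": 0.19,
    "rare_interaction": 0.01,    # the edge case that can't slip through
}

def oversampled_weights(freqs, floor=0.20):
    """Raise any scenario below `floor` up to it, then renormalize to 1."""
    boosted = {k: max(v, floor) for k, v in freqs.items()}
    total = sum(boosted.values())
    return {k: v / total for k, v in boosted.items()}

weights = oversampled_weights(scenarios)
# rare_interaction jumps from 1% of real traffic to ~17% of simulations.
```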

It’s a proven blueprint. Amigo’s strategy for building trust in AI resembles the playbook used in another area with similarly high stakes, high variance, and high skepticism: self-driving cars.

  • Waymo defines the well-charted terrain where its autonomous vehicles (AVs) are designed to operate safely. Amigo maps specific clinical neighborhoods.
  • Waymo simulates edge cases that might take years to encounter in the field before its AVs see any actual street time. Amigo puts its agents to the same test.
  • Waymo’s initial rollout includes safety drivers that can take control when needed. Amigo works with clinicians to refine the accuracy of the Judge.
  • Waymo removes safety drivers as its AVs prove themselves on real trips. Amigo reduces human oversight once clinicians are confident the Judge is calibrated correctly.
  • Waymo moves to similar neighborhoods only after success is consistently demonstrated. Amigo can expand to adjacent use cases where its agents can inherit validated behaviors and guardrails.

Adoption follows confidence. When clinicians co-create the solution to their problems, they’re more comfortable putting it in front of patients. 

  • That confidence usually means leveraging Amigo to automate the workflows that have been weighing them down the most, such as around-the-clock support and care navigation.
  • The agents go beyond providing advice. They can perform actions like ordering tests, updating the EHR, and looping in care teams for complex workflows like triage and medication management.

AI still has a lot to prove. Medicine is complicated, edge cases are everywhere, and lawsuits ain’t cheap. Getting doctors to toss an agent the keys to complex workflows is a tall order, but that’s exactly why Amigo designed its entire platform around getting that buy-in with verifiable evidence every step of the way.

The Takeaway

Clinical AI has the potential to transform healthcare. Fine-tuned AI agents can help eliminate medical errors, keep patients engaged with their care, and allow providers to start carving out competitive moats through their own clinical differentiation.

Doctors aren’t going to arrive at that future by taking a leap of faith. Trust is gained slowly, and can shatter instantly. AI agents will have to earn credibility one workflow at a time, and could lose it all with a single misstep. 

That said, it’s a future worth striving for, and Amigo’s safety-first approach to building trustworthy AI agents is one of the best roadmaps we’ve seen for how to get there.

Nothing gets the magic across better than Amigo’s live walkthrough. Make sure to check out the agents in action by booking a demo on their website.

Innovaccer Acquires Story Health for Agentic Care Augmentation

Innovaccer kicked off a shopping spree instead of chasing an IPO, and virtual specialty care platform Story Health just became the latest startup to get crossed off the acquisition list.

Innovaccer’s been busy. It spent years building the technical infrastructure to make healthcare actually work, and it’s now acquiring the pieces to show what’s possible with that foundation.

  • That includes picking up Humbi AI (actuarial intelligence), Cured (healthcare marketing/CRM), and Pharmacy Quality Solutions (pharma-payor performance tech).
  • It also means equipping more healthcare orgs with its new solutions like Gravity (connects nearly every data input into a single source of truth to scale AI adoption) and Comet (an AI-powered access center with a name so good that Epic had to steal it).

Here come the agents. Story’s cardiovascular health platform is designed to shift care from episodic visits to continuous management that can move the needle on value-based outcomes. 

  • The platform combines AI-driven clinical pathways, advanced medication workflows, and human-led coaching to deliver industry-leading results across heart failure and other chronic conditions. 
  • Innovaccer will be using Story as its first scaffolding to “pioneer agentic care augmentation,” where EHR-integrated AI agents will help specialty care teams with non-clinical tasks and engage patients between visits. 

There’s more on the way. Innovaccer recently revealed that it has “two to three additional acquisitions planned in the coming months,” and that hospital administration and revenue cycle management are both major focus areas.

  • Although Hinge and Omada helped crack open the digital health IPO window, Innovaccer’s business is quickly evolving, and it still has the freedom to make longer-term plays in the private markets.
  • Answering to public shareholders wouldn’t exactly offer Innovaccer any more freedom, and it’s using its unrestricted range of motion to take advantage of private markets that “have never had the kind of depth they have today.”

The Takeaway

We love to see a good crossover story. Innovaccer didn’t just acquire Story to improve outcomes for its patients, it acquired it to scale those outcomes to patients everywhere – and we shouldn’t have to wait long to see another chapter that takes the same playbook to a new specialty.

Doctors Who Use AI Are Viewed Worse by Peers

The research headline of the week belongs to a study out of Johns Hopkins University that found “doctors who use AI are viewed negatively by their peers.”

Clickbait from afar, but far from clickbait. The investigation in npj Digital Medicine surfaced interesting takeaways after randomizing 276 practicing clinicians to evaluate one of three vignettes depicting a physician: using no GenAI (the control), using GenAI as a primary decision-making tool, or using GenAI as a verification tool.

  • Participants rated the clinical skill of the physician using GenAI as a primary decision-making tool as significantly lower than the physician who didn’t use it (3.79 vs. 5.93 control on a 7-point scale). 
  • Framing GenAI as a “second opinion” or verification tool improved the negative perception of clinical skill, but didn’t fully eliminate it (4.99 vs. 5.93 control). 
  • Ironically, while an overreliance on GenAI was viewed as a weakness, the clinicians also recognized AI as beneficial for enhancing medical decision-making. Riddle us that.

Patients seem to agree. A separate study in JAMA Network Open took a look at the patient perspective by randomizing 1.3k adults into four groups that were shown fake ads for family doctors, with one key difference: no mention of AI use (the control), or a reference to the doctors using AI for administrative, diagnostic, or therapeutic purposes (Supplement 1 has all the ads).  

For every AI use case, the doctors were perceived significantly worse on a 5-point scale:

  • less competent – control: 3.85; admin AI: 3.71; diagnostic AI: 3.66; therapeutic AI: 3.58
  • less trustworthy – control: 3.88; admin AI: 3.66; diagnostic AI: 3.62; therapeutic AI: 3.61
  • less empathic – control: 4.00; admin AI: 3.80; diagnostic AI: 3.82; therapeutic AI: 3.72

Where’s that leave us? Despite pressure on clinicians to be early AI adopters, using it clearly comes with skepticism from both peers and patients. In other words, AI adoption is getting throttled by not only technological barriers, but also some less-discussed social barriers.

The Takeaway

Medical AI moves at the speed of trust, and these studies highlight the social stigmas that still need to be overcome for patient care to improve as fast as the underlying tech.

MIT Report Crosses the GenAI Divide

It only takes one look at the key findings from MIT’s GenAI Divide report to see why it made such a big splash this week: 95% of GenAI deployments fail.

MIT knows how to grab headlines. The paper – based on interviews with 150 enterprise execs, a survey of 350 employees, and an analysis of 300 GenAI deployments – highlights a clear chasm between the successful projects and the painful lessons.

  • After $30B+ of GenAI spend across all industries, only 5% of organizations have seen a measurable impact to their top lines. Adoption is high, but transformation is rare. 
  • While general-purpose models like ChatGPT have improved individual productivity, that hasn’t translated to enterprise outcomes. Most “enterprise-grade” systems are stalling in pilots, and only a small fraction actually make it to production.

Why are GenAI pilots failing? The report suggests that it’s not the quality of the models, but the learning gap for both the tools and the organizations that’s causing pilots to fail.

  • Most enterprise tools don’t remember, don’t adapt, and don’t fit into real workflows. This creates “an AI shadow economy” where 90% of employees regularly use general models, yet reject enterprise tools that can’t carry context across sessions.
  • Employees ranked output quality and UX issues among the biggest barriers, which both directly trace back to missing memory and workflow integration.

What’s driving successful deployments? There was a consistent pattern among organizations successfully crossing the GenAI Divide: top buyers treated AI startups less like software vendors and more like business service providers. These orgs:

  • Demanded deep customization aligned to internal processes and data
  • Benchmarked tools on operational outcomes, not model benchmarks
  • Partnered through early-stage failures, treating deployment as co-evolution
  • Sourced AI initiatives from frontline managers, not central labs

There’s always a catch. Most of the pushback on the report was due to its definition of “failure,” which was not having a measurable P&L impact within six months. That definition would make “failures” out of everything from the internet to cloud computing, and underscores why enterprise transformation is measured in years, not months.

The Takeaway

The GenAI growing pains might be worse than expected, but that’s helped startups realize that they need to ditch the SaaS playbook for a new set of rules. In the GenAI era, deployment is a starting line, not a finish line.

Healthcare’s Sci-Fi Future at Epic UGM

Where there’s smoke, there’s fire, and Epic just lit up its sci-fi themed User Group Meeting with enough futuristic new solutions to prove last week’s rumors true – and then some.

The future is now. This year’s event gave us a look at over 160 AI projects currently under development at Epic, including a three-product family set to immediately shake up the industry.

ART is a provider copilot for charting, pre-visit summaries, queuing up orders, and yes – ambient scribing.

  • ART will reportedly be able to provide real-time suggestions during visits, and its highly-anticipated scribe still came as a surprise after Epic revealed that it will be powered by Microsoft when it arrives in early 2026. More on that later.

Emmie is a patient-facing advocate within MyChart that can help with everything from scheduling and reminders to education and navigation.

  • Epic is positioning Emmie as the best place for patients to ask health questions and get answers that are actually grounded in their personal medical history.

Penny is an administrative assistant targeted at revenue cycle management, generating appeal letters, and supporting back-office tasks.

  • There isn’t as much information out there on this one, but Epic doesn’t appear to be shying away from claims and payor workflows.

The EHR is dead, long live the CHR. Judy grabbed even more headlines by announcing that she’s retiring the term “EHR” in favor of “Comprehensive Health Record,” which seems fitting considering the other major announcements that joined the Big Three.

  • Cosmos AI will provide diagnosis and treatment support, as well as discharge planning.
  • MyChart Central will give patients a single login across all sites of care.
  • Flower Pot will expand access to lightweight Epic implementations for smaller practices.

The scribe is real. Now what? Epic’s decision to team up with Microsoft on documentation was pretty unexpected given its 46-year track record of building everything in-house, confirming that the CHR giant would rather bend its core rules than lose market share.

  • Scribes proved how fast health systems would layer on their own AI if Epic couldn’t keep up, and we’ll now have to wait and see if the cost and experience of Epic’s scribe is enough to compete with the flock of ambient AI innovators dedicated to this problem.
  • Epic might own the “operating system,” almost as much as Microsoft owns Windows, but just because MS Paint exists doesn’t mean the world doesn’t need Adobe Photoshop.

The Takeaway

Some call it consolidation. Others call it innovation. Either way, this year’s UGM will probably go down as a key step along Epic’s march toward intergalactic domination. 

Is AI Robbing Physicians of Their Skill? 

A study in The Lancet threw some refreshingly cold water on the AI hype train after finding that healthcare’s shiny new models might be de-skilling physicians.

Here’s the setup. Researchers tracked four Polish health centers that gave their gastroenterologists AI to help spot polyps during colonoscopies before yanking it away after three months.

  • Long story short, the doctors’ ability to detect polyps plummeted six percentage points below baseline following the AI rugpull.
  • Unassisted polyp detection rates fell from 28.4% before the AI teaser to 22.4% after, raising concerns that relying on AI might rob physicians of hard-won skills. 

Sounds familiar. The findings echo a recent MIT preprint showing that people who used AI to write essays used less of their brains and had worse recall of their writing than those who mustered up the words on their own.

  • That’s probably not a shocker to anyone that’s used ChatGPT for more than five minutes, but it’s easy to see that it might spell trouble when applied to medicine.
  • If gastroenterologists start leaning on AI to detect polyps, what happens if they lose their ability to detect them without it?

Right idea, wrong question. People were better at mental math before they had calculators, but that doesn’t mean society would be better off without them. The question we have to ask ourselves is, which skills are we willing to lose?

  • Gastroenterologist Dr. Spencer Dorn nails it: AI doesn’t just risk de-skilling doctors in polyp detection, it risks diminishing their overall critical thinking skills.
  • “My real concern is not the technical skills we can afford to lose, but the foundational ones we can’t: critical thinking, sound judgment, and compassionate care. These aren’t just important to preserve – they’re irreplaceable.”

The Takeaway

If doctors keep outsourcing their thinking to AI, it could be a one-way ticket to a world where Dr. GPT is the only one patients can turn to. Seems dystopian, but is it really that bad if it also means better outcomes for those patients?

AI Spotlight on Epic, Abridge, and Oracle 

Epic, Abridge, and Oracle just gave us a year’s worth of blockbuster AI announcements in three days, and at least one of them was more than speculation and old news.

‘Twas the week before UGM, and the rumor mill was overheating with reports that Epic might finally launch its own EHR-native scribe at its upcoming User Group Meeting.

  • Over 40% of U.S. hospitals are already on Epic, which means its scribe would have access to one of the biggest distribution channels in healthcare even if its UX and performance aren’t best-in-breed (which they won’t be).
  • That means about 100 ambient AI startups are about to find out why scribing is a feature – not a product – and the race will be on to differentiate through other capabilities like RCM and specialty-specific tuning.

Abridge doesn’t plan on being commoditized. Less than 24 hours after Epic’s scribe leaked, Abridge unveiled the exact type of solution that’ll define who survives the incumbent squeeze: real-time prior authorization at the point of conversation.

  • Abridge is co-developing the new solution alongside Highmark Health, a Pittsburgh-based payvidor that operates both a multistate payor division and the 14-hospital system Allegheny Health Network.
  • Integrating Abridge’s ambient AI platform across Highmark’s entire organization will allow patients to get approval for necessary treatments before they even leave the office, a perfect example of how “scribes” can be truly transformative beyond just transcripts.

Oracle couldn’t let Epic and Abridge have all the fun. It decided to “usher in a new era of AI-driven health records”… by reintroducing us to the same AI EHR it unveiled last October.

  • Although mostly a PR stunt to grab headlines ahead of UGM, the new EHR includes several features that underscore where the AI puck is heading, including a native scribe, voice-first navigation, and agents to support clinical workflows.
  • These features are also a good list of use cases where startups might not have a lot of juice left to squeeze after EHRs start bringing them in-house (and prior auths just so happen to be the last thing Oracle wants to get its hands dirty with).

The Takeaway

Native scribing is (very likely) on its way to Epic, Abridge is giving patients the gift of time with instant prior auths, and Oracle is banking on voice for the future of EHR navigation. What a week for digital health.

Doximity Ramps Up AI With Pathway Acquisition

Doximity is setting out to prove that it’s more than “LinkedIn for doctors” after snapping up clinical reference AI startup Pathway for $63M. 

Clinical workflows are the new social media… or at least that’s the plot of Doximity’s growth story.

  • Act 1: Doximity’s newsfeed and networking features set the stage for pharma advertising by attracting physicians to the platform.
  • Act 2: Complementary workflow tools like scheduling, telehealth, and Doximity Dialer gave physicians a reason to stick around longer than their news sweep.
  • Act 3: The AI suite took engagement a step further with Doximity GPT and Doximity Scribe, which helped drive quarterly active users to a record 1M physicians in Q1.

Enter Pathway. The Montreal-based startup’s AI helps physicians answer questions at the bedside using information from Pathway Corpus, “one of the largest structured datasets in medicine” that spans nearly every guideline, journal, and landmark trial.

  • Pathway’s cross-linked structure reportedly allows it to understand complex drug interactions and score the strength of medical evidence, such as weighing validated clinical trials more than case studies.
  • The acquisition will bring that same “robustness” to the back-end of Doximity GPT, and the integration is already live for thousands of physician beta testers.

If you can’t beat ‘em, buy ‘em. It’s tough for physicians to see your pharma ads if they’re not using your platform, so Doximity is acquiring its own workflow solutions to keep users from venturing off to use competing products from OpenEvidence or Wolters Kluwer. 

  • Clinicians have also apparently been using Doximity GPT outside of office hours more than Doximity’s other tools, which helps with serving ads around the clock.
  • Doximity’s AI suite and workflow modules already account for over 20% of its ad revenue, and it now expects that share to overtake its newsfeed in the next few years.

The Takeaway

Doximity is looking to make AI the star of its next act, and if OpenEvidence doesn’t want to share its script, then Pathway will have to steal the show.
