MIT Report Crosses the GenAI Divide

It only takes one look at the key findings from MIT’s GenAI Divide report to see why it made such a big splash this week: 95% of GenAI deployments fail.

MIT knows how to grab headlines. The paper – based on interviews with 150 enterprise execs, a survey of 350 employees, and an analysis of 300 GenAI deployments – highlights a clear chasm between the successful projects and the painful lessons.

  • After $30B+ of GenAI spend across all industries, only 5% of organizations have seen a measurable impact on their top lines. Adoption is high, but transformation is rare.
  • While general-purpose models like ChatGPT have improved individual productivity, that hasn’t translated to enterprise outcomes. Most “enterprise-grade” systems are stalling in pilots, and only a small fraction actually make it to production.

Why are GenAI pilots failing? The report suggests that it’s not the quality of the models, but the learning gap for both the tools and the organizations that’s causing pilots to fail.

  • Most enterprise tools don’t remember, don’t adapt, and don’t fit into real workflows. This creates “an AI shadow economy” where 90% of employees regularly use general models, yet reject enterprise tools that can’t carry context across sessions.
  • Employees ranked output quality and UX issues among the biggest barriers, both of which trace directly back to missing memory and workflow integration.

What’s driving successful deployments? There was a consistent pattern among organizations successfully crossing the GenAI Divide: top buyers treated AI startups less like software vendors and more like business service providers. These orgs:

  • Demanded deep customization aligned to internal processes and data
  • Benchmarked tools on operational outcomes, not model benchmarks
  • Partnered through early-stage failures, treating deployment as co-evolution
  • Sourced AI initiatives from frontline managers, not central labs

There’s always a catch. Most of the pushback on the report was due to its definition of “failure,” which was not having a measurable P&L impact within six months. That definition would make “failures” out of everything from the internet to cloud computing, and underscores why enterprise transformation is measured in years, not months.

The Takeaway

The GenAI growing pains might be worse than expected, but that’s helped startups realize that they need to ditch the SaaS playbook for a new set of rules. In the GenAI era, deployment is a starting line, not a finish line.

Healthcare’s Sci-Fi Future at Epic UGM

Where there’s smoke, there’s fire, and Epic just lit up its sci-fi themed User Group Meeting with enough futuristic new solutions to prove last week’s rumors true – and then some.

The future is now. This year’s event gave us a look at over 160 AI projects currently under development at Epic, including a three-product family set to immediately shake up the industry.

ART is a provider copilot for charting, pre-visit summaries, queuing up orders, and yes – ambient scribing.

  • ART will reportedly be able to provide real-time suggestions during visits, and its highly anticipated scribe still came as a surprise after Epic revealed that it will be powered by Microsoft when it arrives in early 2026. More on that later.

Emmie is a patient-facing advocate within MyChart that can help with everything from scheduling and reminders to education and navigation.

  • Epic is positioning Emmie as the best place for patients to ask health questions and get answers that are actually grounded in their personal medical history.

Penny is an administrative assistant targeted at revenue cycle management, generating appeal letters, and supporting back-office tasks.

  • There isn’t as much information out there on this one, but Epic doesn’t appear to be shying away from claims and payor workflows.

The EHR is dead, long live the CHR. Epic CEO Judy Faulkner grabbed even more headlines by announcing that she’s retiring the term “EHR” in favor of “Comprehensive Health Record,” which seems fitting considering the other major announcements that joined the Big Three.

  • Cosmos AI will provide diagnosis and treatment support, as well as discharge planning.
  • MyChart Central will give patients a single login across all sites of care.
  • Flower Pot will expand access to lightweight Epic implementations for smaller practices.

The scribe is real. Now what? Epic’s decision to team up with Microsoft on documentation was pretty unexpected given its 46-year track record of building everything in-house, confirming that the CHR giant would rather bend its core rules than lose market share.

  • Scribes proved how fast health systems would layer on their own AI if Epic couldn’t keep up, and we’ll now have to wait and see if the cost and experience of Epic’s scribe is enough to compete with the flock of ambient AI innovators dedicated to this problem.
  • Epic might own the “operating system” almost as much as Microsoft owns Windows, but just because MS Paint exists doesn’t mean the world doesn’t need Adobe Photoshop.

The Takeaway

Some call it consolidation. Others call it innovation. Either way, this year’s UGM will probably go down as a key step along Epic’s march toward intergalactic domination. 

Is AI Robbing Physicians of Their Skill? 

A study in The Lancet threw some refreshingly cold water on the AI hype train after finding that healthcare’s shiny new models might be de-skilling physicians.

Here’s the setup. Researchers tracked four Polish health centers that gave their gastroenterologists AI to help spot polyps during colonoscopies before yanking it away after three months.

  • Long story short, the doctors’ ability to detect polyps fell 6 percentage points below baseline following the AI rugpull.
  • Unassisted polyp detection rates fell from 28.4% before the AI teaser to 22.4% after, raising concerns that relying on AI might rob physicians of hard-won skills. 
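
The two detection rates quoted above support two different readings of the same drop: an absolute decline (in percentage points) and a relative decline (as a share of the starting rate). A minimal sketch of that arithmetic, using only the 28.4% and 22.4% figures from the study summary; the function name is made up for illustration:

```python
def detection_drop(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return (absolute drop in percentage points, relative drop in percent)."""
    absolute = before_pct - after_pct
    relative = absolute / before_pct * 100
    return absolute, relative


# Unassisted detection rates before and after the AI exposure period
absolute, relative = detection_drop(28.4, 22.4)
print(f"{absolute:.1f} percentage points, {relative:.1f}% relative decline")
```

The absolute figure is the one that matches the "6" in the headline result; the relative figure shows how large that is compared with where the doctors started.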

Sounds familiar. The findings echo a recent MIT preprint that showed that people who used AI to write essays used less of their brains and had worse recall of their writing than those who mustered up the words on their own.

  • That’s probably not a shocker to anyone who’s used ChatGPT for more than five minutes, but it’s easy to see that it might spell trouble when applied to medicine.
  • If gastroenterologists start leaning on AI to detect polyps, what happens if they lose their ability to detect them without it?

Right idea, wrong question. People were better at mental math before they had calculators, but that doesn’t mean society would be better off without them. The question we have to ask ourselves is, which skills are we willing to lose?

  • Gastroenterologist Dr. Spencer Dorn nails it: AI doesn’t just risk de-skilling doctors in polyp detection, it risks diminishing their overall critical thinking skills.
  • “My real concern is not the technical skills we can afford to lose, but the foundational ones we can’t: critical thinking, sound judgment, and compassionate care. These aren’t just important to preserve – they’re irreplaceable.”

The Takeaway

If doctors keep outsourcing their thinking to AI, it could be a one-way ticket to a world where Dr. GPT is the only one patients can turn to. Seems dystopian, but is it really that bad if it also means better outcomes for those patients?

AI Spotlight on Epic, Abridge, and Oracle 

Epic, Abridge, and Oracle just gave us a year’s worth of blockbuster AI announcements in three days, and at least one of them was more than speculation and old news.

‘Twas the week before UGM, and the rumor-mill has been overheating with reports that Epic might finally launch its own EHR-native scribe at its upcoming User Group Meeting.

  • Over 40% of U.S. hospitals are already on Epic, which means its scribe would have access to one of the biggest distribution channels in healthcare even if its UX and performance aren’t best-in-breed (which they won’t be).
  • That means about 100 ambient AI startups could be about to find out why scribing is a feature – not a product – and the race will be on to differentiate through other capabilities like RCM and specialty-specific tuning.

Abridge doesn’t plan on being commoditized. Less than 24 hours after Epic’s scribe leaked, Abridge unveiled the exact type of solution that’ll define who survives the incumbent squeeze: real-time prior authorization at the point of conversation.

  • Abridge is co-developing the new solution alongside Highmark Health, a Pittsburgh-based payvidor that operates both a multistate payor division and the 14-hospital system Allegheny Health Network.
  • Integrating Abridge’s ambient AI platform across Highmark’s entire organization will allow patients to get approval for necessary treatments before they even leave the office, a perfect example of how “scribes” can be truly transformative beyond just transcripts.

Oracle couldn’t let Epic and Abridge have all the fun. It decided to “usher in a new era of AI-driven health records”… by reintroducing us to the same AI EHR it unveiled last October.

  • Although mostly a PR stunt to grab headlines ahead of UGM, the new EHR includes several features that underscore where the AI puck is heading, including a native scribe, voice-first navigation, and agents to support clinical workflows.
  • These features are also a good list of use cases where startups might not have a lot of juice left to squeeze after EHRs start bringing them in-house (and prior auths just so happen to be the last thing Oracle wants to get its hands dirty with).

The Takeaway

Native scribing is (very likely) on its way to Epic, Abridge is giving patients the gift of time with instant prior auths, and Oracle is banking on voice for the future of EHR navigation. What a week for digital health.

Doximity Ramps Up AI With Pathway Acquisition

Doximity is setting out to prove that it’s more than “LinkedIn for doctors” after snapping up clinical reference AI startup Pathway for $63M. 

Clinical workflows are the new social media… or at least that’s the plot of Doximity’s growth story.

  • Act 1: Doximity’s newsfeed and networking features set the stage for pharma advertising by attracting physicians to the platform.
  • Act 2: Complementary workflow tools like scheduling, telehealth, and Doximity Dialer gave physicians a reason to stick around longer than their news sweep.
  • Act 3: The AI suite took engagement a step further with Doximity GPT and Doximity Scribe, which helped drive quarterly active users to a record 1M physicians in Q1.

Enter Pathway. The Montreal-based startup’s AI helps physicians answer questions at the bedside using information from Pathway Corpus, “one of the largest structured datasets in medicine” that spans nearly every guideline, journal, and landmark trial.

  • Pathway’s cross-linked structure reportedly allows it to understand complex drug interactions and score the strength of medical evidence, such as weighing validated clinical trials more than case studies.
  • The acquisition will bring that same “robustness” to the back-end of Doximity GPT, and the integration is already live for thousands of physician beta testers.

If you can’t beat ‘em, buy ‘em. It’s tough for physicians to see your pharma ads if they’re not using your platform, so Doximity is acquiring its own workflow solutions to keep users from venturing off to use competing products from OpenEvidence or Wolters Kluwer. 

  • Clinicians have also apparently been using Doximity GPT outside of office hours more than Doximity’s other tools, which helps with serving ads around the clock.
  • Doximity’s AI suite and workflow modules already account for over 20% of its ad revenue, and it now expects that share to overtake its newsfeed in the next few years.

The Takeaway

Doximity is looking to make AI the star of its next act, and if OpenEvidence doesn’t want to share its script, then Pathway will have to steal the show.

The Generalist-Specialist Paradox of Medical AI

Technological advances have ushered in an era where many AI models outperform specialists on specific tasks, but AI still lags far behind experts in less controlled settings.

That’s the Generalist-Specialist Paradox of Medical AI laid out in a recent NEJM AI editorial, which paints a picture of a world where AI might soon start redrawing the boundaries of medical specialties as they exist today.

  • AI is already delivering great results on well-defined tasks like interpreting EEGs or CT scans, but it’s still consistently struggling on generalist tasks with less clear boundaries.
  • If that trend continues, the article argues that tasks that used to be in the hands of specialists will be at the fingertips of primary care (just as tasks that used to belong to primary care will now belong to patients).

LLMs don’t care what specialty a case belongs to. They can ingest the full clinical context across visit notes, labs, and imaging to come up with the most probable diagnosis.

  • Breyer Capital Partner Dr. Morgan Cheatham recently made the case that this feature of AI could lead to the collapse of traditional medical specialties as we know them.
  • “Some domains will converge. Others will splinter into new subspecialties defined not by organ systems, but by data fluency, workflow design, or model supervision.”

Not so fast. There’s no doubt that AI will reshape roles, but that doesn’t mean that specialists are about to start offloading everything onto generalists.

  • High-quality care requires more than following AI-friendly guidelines, and specialists incorporate judgment earned through years of experience to deliver effective treatments. LLMs are also still a ways away from replacing anyone’s hip.
  • Primary care providers also aren’t exactly sitting around looking for extra work, and it’s far-fetched to think that they can start taking on specialty care for their ever-growing patient panels.

The Takeaway

AI might be great at well-defined tasks like many seen in specialty care, but we’re still a ways away from having primary care physicians replacing cardiologists.

OpenAI Delivers Largest-Ever Study of Clinical AI

Hot on the heels of launching its HealthBench medical AI benchmark, OpenAI just delivered results from the largest-ever study of clinical AI in actual practice – and let’s just say the future’s looking bright.

40,000 visits, 106 clinicians, 15 clinics. OpenAI went big to get real-world data, equipping Kenya-based primary and urgent care provider Penda Health with AI Consult (GPT-4o) clinical decision support within its EHR.

  • The study split 106 Penda clinicians into two even groups (half with AI Consult, half without), then tracked outcomes over a three-month period.

When AI Consult detected a potential error in history, diagnosis, or treatment, it triggered a simple Traffic Light alert.

  • Green – No concerns, no action needed
  • Yellow – Moderate concerns, optional clinician review 
  • Red – Safety-critical concerns, mandatory clinician review
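
The three alert levels above can be sketched as a small lookup. This is an illustration only, not how AI Consult is actually built: the concern-level labels and the "most severe alert wins" rule are assumptions, and only the three colors and their actions come from the study summary.

```python
from enum import Enum


class Alert(Enum):
    GREEN = "no concerns, no action needed"
    YELLOW = "moderate concerns, optional clinician review"
    RED = "safety-critical concerns, mandatory clinician review"


# Hypothetical concern levels; the study summary only names the three colors.
LEVEL_TO_ALERT = {
    "none": Alert.GREEN,
    "moderate": Alert.YELLOW,
    "safety_critical": Alert.RED,
}


def triage(concern_levels: list[str]) -> Alert:
    """Return the most severe alert across a visit's checks (assumed rule)."""
    order = [Alert.GREEN, Alert.YELLOW, Alert.RED]
    alerts = [LEVEL_TO_ALERT[level] for level in concern_levels]
    return max(alerts, key=order.index, default=Alert.GREEN)


def review_is_mandatory(alert: Alert) -> bool:
    """Only a red alert forces the clinician to review before continuing."""
    return alert is Alert.RED


# One visit with checks on history, diagnosis, and treatment
visit_alert = triage(["none", "moderate", "none"])
print(visit_alert.name, "-", visit_alert.value)
```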

The results were definitely promising. Clinicians using AI Consult saw a:

  • 16% reduction in diagnostic errors
  • 13% reduction in treatment errors
  • 32% reduction in history-taking errors

The “training effect” is real. The AI Consult group got significantly better at avoiding common mistakes over time, triggering fewer alerts as the study progressed.

  • Part of that is because Penda took several steps to help along the way, including one-on-one training, peer champions, and performance feedback.
  • It’s also worth noting that there was no recorded harm as a result of AI Consult suggestions, and 100% of the clinicians using it said that it improved their quality of care.

What’s the catch? While AI Consult led to a clear reduction in clinical errors, there was no statistically significant difference in patient-reported outcomes, and clinicians using the copilot saw slightly longer visit times.

The Takeaway

Clinical AI continues to prove itself outside of multiple-choice licensing exams and clinical vignettes, and OpenAI just gave us our best evidence yet that general-purpose models can reduce errors in actual patient care.

Microsoft MAI-DxO and the Path to Medical Superintelligence

In an action-packed week to kick off the second half of the year, no story grabbed more headlines than Microsoft’s MAI-DxO proving four times more successful than human doctors at diagnosing complex diseases.

Microsoft is on the path to medical superintelligence… at least according to its excellent blog post outlining its new MAI Diagnostic Orchestrator, better known as MAI‑DxO.

  • MAI-DxO acts like a “virtual panel of physicians” collaborating on a case, orchestrating multiple AI agents with specific roles like forming diagnostic hypotheses, selecting tests, and interpreting results. 
  • It then applies a “debate chain” to arrive at an explainable diagnosis, all while avoiding over-testing to keep costs under control.

New breakthroughs require new benchmarks. As AI gets to the point where it’s breezing through multiple-choice benchmarks like medical licensing exams, Microsoft decided to introduce SDBench to better simulate routine clinical practice.

  • SDBench deconstructs 304 of the most diagnostically complex NEJM cases, requiring LLMs (and physicians) to begin with an initial presentation, ask follow-up questions, order tests (each with assigned costs), and agree on a diagnosis.
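
The step-by-step format described above (start from a presentation, ask questions, order priced tests, commit to a diagnosis) can be sketched as a simple loop. This is a toy under stated assumptions, not Microsoft's benchmark code: the `Case` structure, the placeholder case, the scripted policy, and the choice to treat questions as free are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Case:
    presentation: str
    answers: dict[str, str]            # follow-up question -> answer
    tests: dict[str, tuple[str, int]]  # test name -> (result, cost in USD)
    diagnosis: str


def run_episode(case: Case, policy, max_steps: int = 10) -> tuple[bool, int]:
    """Run one sequential-diagnosis episode; return (correct, total test cost)."""
    notes = [case.presentation]
    total_cost = 0
    for _ in range(max_steps):
        kind, value = policy(notes)
        if kind == "ask":
            # Assumption for this toy: follow-up questions are free
            notes.append(case.answers.get(value, "no information"))
        elif kind == "test":
            result, cost = case.tests.get(value, ("not available", 0))
            total_cost += cost
            notes.append(result)
        elif kind == "diagnose":
            return value == case.diagnosis, total_cost
    return False, total_cost  # never committed to a diagnosis


# Placeholder case and a fixed script standing in for an LLM or physician
toy_case = Case(
    presentation="initial presentation",
    answers={"question A": "answer A"},
    tests={"test X": ("result X", 120)},
    diagnosis="condition Z",
)


def scripted_policy(notes: list[str]) -> tuple[str, str]:
    script = [("ask", "question A"), ("test", "test X"), ("diagnose", "condition Z")]
    return script[len(notes) - 1]


print(run_episode(toy_case, scripted_policy))
```

Tracking cost alongside correctness is what lets a setup like this report both an accuracy figure and a per-patient spend for each participant.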

Here’s how MAI-DxO stacked up:

  • MAI-DxO: 85% diagnostic accuracy / $7,200 estimated cost per patient
  • OpenAI o3: 79% / $7,850
  • Gemini 2.5 Pro: 69% / $4,800
  • Claude 4 Opus: 68% / $7,000
  • Llama 4: 55% / $4,000
  • Human Physicians: 20% / $2,950

What’s the catch? The human physicians weren’t allowed to use the internet or any outside help, which probably simulates a deserted island workflow more than routine clinical practice. The participants were also all generalists rather than specialists, giving another edge to the LLMs.

The Takeaway

MAI-DxO might have the potential to deliver superhuman diagnostics in constrained settings, but that doesn’t mean it’s ready to replace doctors. As Microsoft pointed out in its own blog post, “clinical roles are much broader than simply making a diagnosis. They need to navigate ambiguity and build trust with patients and their families in a way that AI isn’t set up to do.”

Doximity Accused of Prompt Hacking OpenEvidence

What does a high-flying company like Doximity do when competitors are nipping at its heels? According to OpenEvidence’s new lawsuit, it just politely asks their LLMs to reveal trade secrets. 

Doximity is basically LinkedIn for doctors. It allows physicians to use its networking platform and AI workflow products at no cost, which means the physicians themselves are the product.

  • Doximity generates revenue almost exclusively through pharma advertising, and it turns out that might actually be the best business model around.
  • Out of the dozen publicly traded digital health companies with a market cap over $1B, Doximity is the only one that’s decently profitable.

No good prompt goes unpunished. The crown jewel of Doximity’s AI portfolio is its Doximity GPT workflow assistant, which may or may not leverage proprietary tech acquired by prompting OpenEvidence’s competing model to reveal sensitive information.

  • Although it’s funny to see Doximity get accused of asking OpenEvidence’s AI to literally “write down the secret code,” it doesn’t exactly make for a bulletproof case when the model willingly dishes up an answer.
  • The catch is that OpenEvidence requires users to register using their National Provider ID numbers, and Doximity allegedly impersonated a practicing neurologist to “obtain through theft what they lacked in technical expertise.” Ouch.

It gets worse from there. A separate shareholder lawsuit accused Doximity of inflating its active user base and website engagement data to artificially bolster its advertising revenue.

  • While some investors might be able to stomach a little corporate espionage, they probably won’t look the other way if it turns out Doximity is fudging the numbers.
  • Innocent until proven guilty, but it’s worth noting that nearly identical allegations popped up in a recent short report.

The Takeaway

Doximity has some serious allegations piling up against it, but so far the market has shrugged off the bad news. That could be a sign that investors don’t think the lawsuits will hold up in court, or maybe they just don’t mind when a management team is willing to bend the law to generate some extra shareholder value.

OpenEvidence Partners With JAMA Ahead of Next Raise

“The fastest-growing platform for doctors in history” continues to step on the gas, and OpenEvidence is reportedly on the verge of notching a $3B valuation after inking a deal to bring JAMA Network journals to its AI medical search engine.

The multi-year content agreement will make full-text articles from the American Medical Association’s JAMA, JAMA Network Open, and 11 specialty journals available directly within the OpenEvidence platform.

  • OpenEvidence’s medical search engine helps clinicians make decisions at the point of care, turning natural language queries into structured answers with detailed citations.
  • The model was purpose-built for healthcare using training data from strategic partners like the New England Journal of Medicine, which joined the platform through a similar deal earlier this year.

The Disney+ content strategy has arrived in healthcare. OpenEvidence compares its approach to streaming services that drive subscriptions through exclusive movies.

  • If a physician wants information from top journals to support decision making, they’ll either have to get it straight from the source or use OpenEvidence, just like how anyone who wants to stream Moana needs to go to Disney+.
  • The kicker is that OpenEvidence is available at no cost to verified physicians, and advertising generates all of the revenue. 

The blueprint is working like a charm. OpenEvidence has over 350k doctors using its platform plus another 50k joining each month, and it’s apparently close to raising $100M at a $3B valuation just a few months after closing its $75M Series A.

  • It’s rare to find hockey stick growth in digital health, and OpenEvidence is a good reminder that many areas of healthcare change slowly… then all at once.
  • It also isn’t too surprising to hear that VCs like Google Ventures and Kleiner Perkins are lining up to fund a company with a similar ad-supported business model to Doximity – one of the only successful healthcare IPOs since the start of the pandemic.

The Takeaway

Content is king, and OpenEvidence is locking in partnerships to make sure its platform is wearing the crown. The results have been speaking for themselves, but healthcare’s genAI streaming wars are just getting started.

-- The Digital Health Wire team