Epic Shakes Up Scribe Market With AI Charting

The wait is over. Epic’s scribe has arrived, and it’s packing a lot more than ambient notes.

“AI Charting” goes beyond transcription. The fully built-in feature not only listens during patient visits and drafts notes, it also queues up orders based on the conversation.

  • The initial release allows clinicians to personalize the note structure using voice commands (e.g., asking to format the history of present illness as a bulleted list).
  • Epic is positioning AI Charting as the killer app for its Art clinical copilot, which also has a pre-visit Insights tool that’s apparently already being used 16M times per month.

Distribution is king. Over 40% of U.S. hospitals are on Epic, and an AJMC study from just last week showed that two-thirds of those hospitals have already adopted ambient AI.

  • AI Charting is breaking onto the scene through one of healthcare’s biggest distribution channels, and Epic has a ton of levers it can pull with pricing and bundling to start stealing share (DAX Copilot, Abridge, and ThinkAndor accounted for ~80% of the ambient AI adoption among Epic hospitals in the recent study).
  • Rather than charging a per-user-per-month fee like most ambient AI platforms, STAT reports that Epic plans to have a separate license for AI Charting, with the price varying by org size and utilization to get the tool in as many hands as possible.

It’s time to differentiate. The race is on for established players to prove they can deliver value that Epic’s integrated approach can’t match.

  • That means tackling problems that are too messy for Epic to touch (Abridge bringing real-time prior auths to the point of conversation), or too specialized for it to get right with so many other plates spinning (Nabla raising the bar for AI safety with world models).
  • Epic is working closely with Microsoft to get new features online quickly, but nailing multiple specialties in countless languages could still prove to be a job that’s better suited for a company with a dedicated focus.
  • Epic might own the “operating system” almost as much as Microsoft owns Windows, but just because MS Paint exists doesn’t mean the world doesn’t need Adobe Photoshop.

The Takeaway

Ambient scribes proved how fast health systems would layer on their own AI if Epic couldn’t keep up, and we’ll now have to wait and see whether the cost and experience of Epic’s scribe are enough to compete with the flock of ambient AI innovators dedicated to this problem.

Bessemer Venture Partners State of Health AI

Bessemer Venture Partners’ always-stellar State of Health AI report did a great job explaining why we (probably) aren’t in a bubble even though the health AI rocket has hit escape velocity.

AI is more than hype. BVP points to signals from the private markets to make its case. 

M&A activity is surging. Global health tech M&A reached 400 deals in 2025 (up from 350 in 2024), but the strategic rationale matters more than the volume. Healthcare orgs and investors recognize that AI simultaneously drives revenue growth and margin improvement.

  • Prime example: the Smarter Technologies roll-up was designed to leverage Thoughtful and SmarterDx’s growth engine and clinical AI platform to drive margin expansion across the Access Healthcare RCM services conglomerate.

VC funding is nearly back to pandemic levels. BVP counted 527 venture deals in 2025 (~$14B total), with the average round size climbing 42% to $29M.

  • AI startups captured 55% of that, up from 37% in 2024. Even more importantly, for every $1 invested in AI companies overall, $0.22 was deployed to healthcare AI startups, outpacing healthcare’s “fair share” of roughly 18% of U.S. GDP.

The question now is, are we in a bubble? BVP has a nuanced answer for why health AI is on firmer footing than the dot-com bubble ever was.

  • First, AI’s technological shift has spurred the invention of new business models, with the emergence of “AI-services-as-software” companies delivering service-level outcomes (human-quality work) with software-level margins (70%+ gross margins).
  • Second, buyers are now pulling instead of being pushed. While EHRs took 15 years to scale, AI scribes have pulled it off in three. Demonstrable ROI and ease of implementation were key here.

Health AI has an X Factor. New health AI “supernova” startups are bending traditional growth curves entirely. BVP attributes these supernovas’ unprecedented growth to four X Factors.

  • Continuous hyper-growth velocity (not just growth projections)
  • Revenue durability through defensibility
  • Productivity gains that translate to better margins and full-time employee metrics at scale
  • Point solution to platform expansion

Maybe sane valuations, maybe VC mental gymnastics. BVP argues that a supernova with $30M ARR and a $1B valuation isn’t overvalued; it has fundamentally different growth dynamics.

  • When you’re growing 6x instead of 2x, you reach $100M ARR in 18 months instead of 36+ months. That compression in time-to-scale commands a premium, and BVP says a 7x revenue multiple for supernovas is justified versus 2-3x for a strong SaaS company (the quick math is sketched below).
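Here’s a minimal sketch of that time-to-scale math. The $10M starting ARR is our own assumption to make the comparison concrete (BVP doesn’t state a base), so treat the outputs as illustrative.

```python
import math

def months_to_target(arr_now_m: float, annual_multiple: float, target_m: float = 100.0) -> float:
    """Months to reach a target ARR under a constant annual growth multiple."""
    years = math.log(target_m / arr_now_m) / math.log(annual_multiple)
    return years * 12

# Assumed starting point of $10M ARR (illustrative, not from the report).
print(f"6x grower: {months_to_target(10, 6):.0f} months")  # ~15 months
print(f"2x grower: {months_to_target(10, 2):.0f} months")  # ~40 months
```

The exact month counts shift with the starting base, but the compression holds: a 6x grower crosses $100M roughly two and a half times faster than a 2x grower from the same base.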

The Takeaway

Health AI is going supernova, and the explosion might actually be big enough to let the leaders grow into their astronomical valuations.

AI Spots Early Cognitive Decline in Clinical Notes

Early disease detection is entering the AI era, and a new study in npj Digital Medicine shows that autonomous agents can now flag cognitive decline using nothing but clinical notes.

Cognitive decline is difficult to detect. It remains significantly underdiagnosed in routine care, and traditional screening usually requires a dedicated clinician and tests that can take hours. 

  • At the same time, early detection is becoming increasingly important, especially with the recent approval of Alzheimer’s therapies that are most effective when administered early. 

Mass General Brigham might have an answer. Clinical notes contain whispers of cognitive decline that busy clinicians can’t always hear. MGB built a system that listens at scale.

  • These whispers include everything from linguistic shifts and sentence pauses to disorganized narratives and family member concerns. 
  • MGB developed an AI system that scans for these signals in routine clinical documentation, leveraging five specialized agents that critique each other and refine their reasoning.

It worked like a charm. The MGB researchers set their agents loose on over 3,300 clinical notes from 200 anonymized patients, then had human reviewers take their own look.

  • The agents detected cognitive impairment with 91% sensitivity, nearly matching expert-level accuracy – without any human intervention needed after deployment.
  • When the AI and human reviewers disagreed, an independent expert validated the AI’s reasoning 58% of the time – meaning the system was often making sound clinical judgments that initial human review had missed.

The cherry on top? The MGB team open-sourced Pythia alongside the study, enabling any provider org to deploy autonomous prompt optimization for their own AI screening applications.
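Pythia itself is the place to look for the real implementation; the snippet below is just a minimal sketch of the draft-and-critique pattern the study describes, compressed to three illustrative roles (the paper uses five) and a generic OpenAI-style chat call.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set; any chat-capable LLM would do

# Illustrative agent roles and prompts – not the study's actual five agents.
AGENTS = {
    "screener": "Flag possible signs of cognitive decline in this clinical note.",
    "critic": "Critique the screener's flags: is each signal actually supported by the note?",
    "adjudicator": "Weigh the note, flags, and critique, then answer IMPAIRED or NOT_IMPAIRED with reasoning.",
}

def ask(role: str, content: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[{"role": "system", "content": AGENTS[role]},
                  {"role": "user", "content": content}],
    )
    return resp.choices[0].message.content

def screen_note(note: str) -> str:
    flags = ask("screener", note)
    critique = ask("critic", f"NOTE:\n{note}\n\nFLAGS:\n{flags}")
    return ask("adjudicator", f"NOTE:\n{note}\n\nFLAGS:\n{flags}\n\nCRITIQUE:\n{critique}")
```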

The Takeaway

LLMs have opened the door to proactive screening at scale, and MGB just provided an excellent proof of concept using AI agents that turn everyday documentation into a chance to catch cognitive decline during the optimal treatment window.

ARISE Maps the State of Clinical AI

There have probably been hundreds of reports on the medical AI landscape, but there’s only been one State of Clinical AI from the rockstar team at ARISE.

The AI opus delivers the most complete review we’ve seen of a field that’s moving faster than its evaluation practices. It looked at the most influential clinical AI studies from 2025 to answer a trio of important questions:

  • Where does AI meaningfully improve care once it leaves research settings?
  • Where does performance break down?
  • Where do risks remain underexamined?

ARISE brought the heat. The Stanford-Harvard research network produced more highlights than we could count, but here’s a roundup of some of our favorites.

Impressive results in narrow evaluations. AI models have shown “superhuman performance” in research settings, but these results often depend on how narrowly the problem is framed. 

  • In one study, researchers modified standard medical multiple-choice questions so that the correct answer became “none of the other answers.” The clinical reasoning required to solve the question didn’t change. Model performance did. Accuracy dropped sharply across leading AI models, in some cases by over a third (a toy version of the swap is sketched below).
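Here’s a toy version of that perturbation, assuming a simple dict format for the questions (the study’s exact protocol may differ):

```python
def none_of_the_others(question: dict) -> dict:
    """Swap the correct option for 'None of the other answers'.
    The clinical reasoning needed is unchanged; only the surface form moves."""
    options = list(question["options"])
    options[question["answer_idx"]] = "None of the other answers"
    return {**question, "options": options}

mcq = {
    "stem": "A 68-year-old presents with sudden-onset chest pain...",  # illustrative
    "options": ["Aspirin", "Warfarin", "Heparin", "Observation"],
    "answer_idx": 2,
}
perturbed = none_of_the_others(mcq)
# The correct choice is now "None of the other answers" at the same index,
# so a model that memorized option patterns loses its shortcut.
```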

AI clearly helps prediction at scale. Although diagnostic reasoning was a mixed bag, several studies demonstrated that AI excels at identifying early warning signals from large datasets.

  • A hospital-based study found that a model trained on continuous wearable vital signs predicted patient deterioration up to 24 hours before standard alerts, identifying patients at risk for ICU transfer, cardiac arrest, or death while there was still time to intervene (a toy version of this approach is sketched below).
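The paper’s model and features aren’t described at this level of detail, so here’s only a toy stand-in showing the general shape of the approach: summarize a 24-hour window of wearable vitals into features, then score deterioration risk.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def window_features(hr, rr, spo2):
    """Boil a 24h vitals window down to a few hypothetical warning signals."""
    return [hr.mean(), hr.std(),      # tachycardia / variability
            rr.mean(), rr.max(),      # respiratory distress
            spo2.mean(), spo2.min()]  # desaturation episodes

rng = np.random.default_rng(0)

def synth_window(deteriorating: bool) -> list:
    """Synthetic stand-in for one patient-day of wearable data (1 sample/min)."""
    hr = rng.normal(95 if deteriorating else 75, 8, 1440)
    rr = rng.normal(22 if deteriorating else 15, 2, 1440)
    spo2 = rng.normal(93 if deteriorating else 97, 1.5, 1440)
    return window_features(hr, rr, spo2)

y = np.array([0, 1] * 250)                                 # synthetic labels
X = np.array([synth_window(bool(label)) for label in y])
model = LogisticRegression(max_iter=1000).fit(X, y)
risk = model.predict_proba([synth_window(True)])[0, 1]     # risk for a new window
```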

Most studies still don’t resemble the reality of healthcare. Clinical work has little to do with answering exam questions, and much to do with reviewing charts, coordinating care, and deciding when not to intervene.

  • A review of 500+ studies found that nearly half of them tested models using medical exam-style questions. Only 5% used real patient data, very few measured whether the models recognized uncertainty, and even fewer examined bias or fairness.

Now what? ARISE offered a few focus areas for 2026 that hit the center of the bullseye for building trust in the latest AI models.  

  • Evaluate models using real-world scenarios to drive evidence-based medicine.
  • Prioritize human-computer interaction design as much as primary outcomes.
  • Measure uncertainty, bias, and harm – especially when it comes to patient-facing AI.

The Takeaway

Healthcare AI has arrived, and ARISE made it clear that innovation won’t be driven by newer models alone. It will depend on whether health systems, researchers, and regulators are willing to apply the same evidence standards to AI that they expect out of any other clinical solution.

Anthropic and OpenAI Set Sights on Providers

Digital health has some fresh competition. Less than a week after OpenAI launched ChatGPT Health, Anthropic crashed the party with the grand debut of Claude for Healthcare.

Player 2 has entered the fight. Anthropic’s headlining feature for consumers is identical to ChatGPT Health’s – the answers are grounded in the patient’s own medical history.

  • Claude for Healthcare lets patients securely upload their health records and app data to unlock the same wide-ranging benefits as ChatGPT Health, such as spotting trends, preparing for visits, interpreting lab results… so on and so forth.
  • The two even share some overlapping partner apps like Function and Apple Health, but the similarities end there. 

Claude for Healthcare gets providers in on the action. Unlike OpenAI’s shiny new patient-facing solution, Claude for Healthcare comes with a suite of “Connectors” that enable it to support previously out-of-reach workflows. The list includes (with a sketch of one such lookup after it):

  • Prior auth reviews and coverage verifications [CMS Coverage Database]
  • Medical coding and billing accuracy [ICD-10]
  • Provider verification and credentialing [NPI Registry]
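Anthropic hasn’t published the Connector internals, but the NPI Registry piece is easy to picture since NPPES already exposes a public lookup API. Here’s a hedged sketch of the kind of provider-verification call a Connector could make:

```python
import requests

def verify_npi(npi: str) -> dict | None:
    """Look up a provider in the public NPPES NPI Registry."""
    resp = requests.get(
        "https://npiregistry.cms.hhs.gov/api/",
        params={"version": "2.1", "number": npi},
        timeout=10,
    )
    resp.raise_for_status()
    results = resp.json().get("results", [])
    return results[0] if results else None  # basic info for the matching provider

record = verify_npi("1234567890")  # placeholder NPI, not a real provider
```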

OpenAI hasn’t taken any days off. It followed up last week’s big ChatGPT Health news with the launch of ChatGPT for Healthcare – similar names, very different products.

  • ChatGPT for Healthcare is OpenAI’s enterprise solution to the Anthropic problem. It brings new provider-facing capabilities like care path management, referral letter generation, and clinical search (tough break for Doximity and Wolters Kluwer).

The fun doesn’t end there. OpenAI added to its hot streak by picking up Torch, a four-person startup building “a medical memory for AI.” The Information pinned the price tag at $100M. 

  • Torch feeds scattered records into a context engine that connects the dots between visit notes, lab results, wearable data, and any other medical info you can think of. 
  • That pitch rhymes perfectly with ChatGPT Health’s value prop, and the Torch team will now be helping boost the new solution’s medical memory across its inaugural cohort of partner apps.

The Takeaway

What a week for our little corner of the industry. OpenAI and Anthropic are diving in head first, and their tech, ambition, and pockets might even be deeper than the choppy legal waters.

Foundation Models Can Compromise Patient Privacy

Foundation models trained on EHR data hold massive potential for clinical applications, but a new study out of MIT shows that they might have just as much potential to violate patient privacy.

Generalized knowledge makes better predictions. EHR foundation models are normally trained on a collection of de-identified patient records, generalizing across them to produce their outputs.

  • That’s not a problem on its own, but unintended “memorization” also allows these models to serve answers based on a single record from their training data. 

Therein lies the problem. To quantify the risk of these models revealing sensitive information, MIT researchers developed structured tests to determine how easily an attacker with partial knowledge of a patient – think lab results or demographic details – could extract further identifiable info through targeted prompts.

The tests measured memorization as a function of two variables (see the probe sketch after this list): 

  • the amount of prior information an attacker needs before the model gives up new details
  • the risk associated with the revealed information
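The paper’s actual test battery is more rigorous, but a toy probe makes the idea concrete: prompt the model with attributes the attacker already knows, then check whether the completion leaks a held-out detail from a specific training record. Everything here (the model interface, the field names) is hypothetical.

```python
def extraction_probe(model, record: dict, known_fields: list[str], secret_field: str) -> bool:
    """Does partial knowledge of a record coax the model into revealing the rest?"""
    prompt = ", ".join(f"{k}: {record[k]}" for k in known_fields) + f", {secret_field}:"
    completion = model.generate(prompt)  # hypothetical generate() API
    return str(record[secret_field]) in completion

# Sweeping len(known_fields) recovers the paper's first axis (how much an
# attacker needs to know), while weighting secret_field by sensitivity
# (age vs. a rare diagnosis) recovers the second (how harmful the reveal is).
```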

What did they find? After validating the tests using EHRMamba, an EHR foundation model with publicly available training data, the researchers reached a pair of conclusions that weren’t too surprising to see.

  • The more information attackers have on a patient, the greater that patient’s privacy risk.
  • Some patients, particularly those with rare conditions, are more susceptible.

Not all information is harmful. The researchers found that some details, such as a patient’s age or gender, present a relatively lower risk in the event of a data breach. 

  • This info wasn’t very helpful in targeted prompts that probed the model for memorized records, and it isn’t very damaging if the answers reveal it.
  • Other info, such as a rare disease diagnosis, was flagged as significantly more harmful. It posed a higher risk of getting the model to expose patient-specific details (especially in combination with other identifiers), and it can be especially sensitive if revealed through probing.

The Takeaway

EHR foundation models need some degree of memorization to solve complex tasks, but memorizing and revealing patient records is obviously out of the question. The tradeoff between performance and privacy is an ongoing challenge, but MIT just delivered a framework for evaluating some of the risks that can help strike the right balance.

OpenAI Jumps Into Healthcare Arena With ChatGPT Health

If OpenAI wasn’t already a major healthcare player, the launch of ChatGPT Health definitely just made it one.

It’s the gamechanger everyone saw coming. OpenAI even teed up the launch with a report showing that 40M people are already using ChatGPT for healthcare advice on a daily basis. 

ChatGPT Health is about to take that a massive step further. 

Here’s a look at the core features:

  • ChatGPT Health operates inside a dedicated health environment with additional privacy layers (conversations aren’t used for model training, optional two-factor authentication).
  • Users can securely upload their complete medical records (courtesy of b.well).
  • Users can connect apps to inform answers (Apple Health, Function, MyFitnessPal).
  • The model uses longitudinal health data, labs, and visit summaries to help spot trends.

OpenAI is moving beyond general health advice. The extra clinical context gives ChatGPT Health the ability to give better answers at scale, and that’s good news for patients.

A few of the most obvious benefits for patients include:

  • Empowering them to take a more active role in their care.
  • Helping them uncover trends in their overall health.
  • Reducing confusion around test results.
  • Reinforcing care plans between visits.
  • The list could go on for a while.

ChatGPT Health isn’t actually HIPAA compliant. Then again, it doesn’t need to be.

  • Consumer health apps like ChatGPT Health aren’t covered by HIPAA, and to OpenAI’s credit, it appears to have done a great job with the necessary disclaimers.
  • The dedicated health environment was also developed with input from 260+ physicians, and it leverages a physician-authored framework for safety, clarity, and escalation.

The question now is, who’s accountable when things go wrong? Millions of patients are about to start showing up to visits armed with advice from ChatGPT Health, which means its AI fingerprints will be all over their questions, concerns, and even clinical decisions. The tech might be ready. The governance isn’t.

  • When ChatGPT Health mentions an unproven treatment and a patient follows through, or interprets a worrying lab value as benign, who carries the liability?
  • OpenAI? The physicians who authored the safety framework? The patient who followed the advice? It’s tough to say, but providers – and their patients – still need a clear answer.

The Takeaway

Everyone wants a doctor in their pocket, and ChatGPT Health just filled that role for millions of patients… even if OpenAI explicitly told them it wasn’t up for the job.

8VC’s Vision for Healthcare AI in America

8VC just dropped its Vision for Healthcare AI in America, and it’s the best roadmap we’ve seen for removing the barriers between AI and its potential to transform medicine.

Great cakes have three layers, maybe four. Before 8VC shared its recipe for how AI can help fix things, it laid out the four layers it’ll be working with (encoded in a quick sketch after the list).

  • Level 0: Administrative – AI that supports providers in the back office. Example: AI scheduling agents, scribes.
  • Level 1: Assistive – AI that assists clinicians but doesn’t diagnose, treat, triage, or prescribe medications. Example: AI coaches, navigators.
  • Level 2: Supervised Autonomous – AI that does all the things that Level 1 doesn’t, with decisions supervised by a clinician. Example: AI medication management.
  • Level 3: Autonomous – AI that diagnoses, treats, triages, or prescribes medications completely on its own. Example: fully-autonomous triage lines.
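Since the rest of the piece hangs policy arguments off these levels, here’s the taxonomy encoded as a quick reference. The FDA helper reflects our own rough reading of the report, not an official mapping.

```python
from enum import IntEnum

class AutonomyLevel(IntEnum):
    """8VC's four-level healthcare AI taxonomy."""
    ADMINISTRATIVE = 0         # back office: scheduling agents, scribes
    ASSISTIVE = 1              # helps clinicians; never diagnoses, treats, triages, or prescribes
    SUPERVISED_AUTONOMOUS = 2  # diagnoses/treats/triages/prescribes under clinician supervision
    AUTONOMOUS = 3             # acts entirely on its own, e.g. fully-autonomous triage lines

def is_software_as_medical_device(level: AutonomyLevel) -> bool:
    # Per the report, all autonomous AI falls under FDA SaMD regulation;
    # treating that as Levels 2+ is our simplification.
    return level >= AutonomyLevel.SUPERVISED_AUTONOMOUS
```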

Now for the vision. Most healthcare AI solutions currently live on Level 0. They’re creating real value for providers, but they aren’t going to steer the Titanic away from the iceberg.

  • 8VC thinks the other levels might, but not unless we remove the legal barriers that are preventing our innovators from innovating.

Level 1. These solutions exist today, but assistive AI care models are being held back by a lack of broadly billable CPT codes for the services they render.

  • Solution: Implement value-based reimbursement for assistive AI care models. 8VC describes a CMMI model with durable codes and case rates, which sounds like something most payors would be lining up to lobby for.

Level 2. All autonomous AI is considered Software as a Medical Device by the FDA, but the current performance bars are set too high. Driving tests don’t need to be F1 races.

  • Solution: Align FDA approval benchmarks with real-world standards, not hypothetical ideals. LumineticsCore is a good example – the FDA required the tool to catch at least 85% of diabetic retinopathy cases, but most ophthalmologists land somewhere between 33% and 77%. 

Level 3. Only a few policy changes are needed to open the door to Level 3 once we get to Level 2, the biggest of which is defining AI as a type of practitioner that’s eligible for reimbursement.

  • Solution: Amend the Social Security Act to allow Medicare reimbursement for licensed AI. As it stands today, even if CMS created a code for a Level 3 service, it would still be illegal for Medicare to pay an AI company instead of the supervising physician.

The Takeaway

AI is going to have to level up if we want to transform healthcare experiences, costs, and ultimately outcomes. 8VC thinks we can get there if we let our builders build, and it even gave us a blueprint for getting out of our own way.

AI Scribes Aren’t Productivity Tools, Yet

The first randomized controlled trials for ambient AI have finally arrived, and NEJM AI just gave us the strongest evidence yet that scribes deliver… minimal time savings.

The first study was a mixed bag. UCLA researchers assigned 238 physicians across 14 specialties to one of two scribes – Microsoft DAX and Nabla – or usual care for two months.

  • Nabla ended up saving about 23 seconds per visit, while DAX shaved off a whopping 5 seconds (which wasn’t even statistically significant).
  • Both scribe groups did, however, report less burnout and lower cognitive burden than the usual care controls.

The second study told a similar tale. Physicians at the University of Wisconsin who used Abridge’s AI scribe for six weeks trimmed their daily documentation time by 22 minutes.

  • Still not a world-changing difference, but the UW physicians also saw significant improvements in work exhaustion and well-being.

But wait, there’s more. While those studies didn’t go as far as to suggest a cause for the lackluster time savings, a separate well-timed study from Navina offered a possible mechanism.

  • Scribes capture clinical conversations. Those conversations only inform a piece of the note, and those notes are only a piece of the workflow.
  • Navina found that incorporating patient medical histories into ambient documentation dramatically improves both note completeness and quality, which also seems like a great way to help physicians avoid lengthy manual chart reviews to fill any remaining gaps (a rough sketch of the idea follows below).
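Navina didn’t publish its pipeline, but the mechanism is easy to sketch: hand the model chart context alongside the transcript instead of the transcript alone. The prompt wording below is purely illustrative.

```python
def build_note_prompt(transcript: str, history: str | None = None) -> str:
    """Assemble a note-drafting prompt, optionally enriched with chart context."""
    parts = ["Draft a SOAP note from this visit transcript.",
             f"TRANSCRIPT:\n{transcript}"]
    if history:
        # The enrichment step: problems, meds, and prior results from the chart.
        parts.insert(1, f"RELEVANT HISTORY:\n{history}")
    return "\n\n".join(parts)
```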

Then why do scribes get rave reviews? That’s a mystery that’s still up for debate.

  • It’s worth noting that “average time savings” include plenty of physicians who barely used the scribe. UCLA only had about a third of physicians pick up the tools, while UW was close to a best-case scenario at 71%.
  • It’s also possible that physicians enjoy not having to hold the visit in their head until they can finish their note, and getting rid of that burden is as magical as actual time savings.

The Takeaway

Not everything that can be measured matters, and not everything that matters can be measured. AI scribes might not be productivity tools quite yet, but physicians are clearly finding plenty of reasons to love them until they get there – even if more time isn’t one of them.

Incorporating Human Factors Into AI Research

The majority of AI research centers on model performance, but a new paper in JAMIA poses five questions to guide the discussion around how physicians actually interact with AI during diagnosis.

A little reframing goes a long way. As the clinical scope of AI expands alongside its capabilities, the interface between the models and doctors is becoming increasingly important. 

  • Researchers from UCLA and Tufts University point out that this “human-computer interface” is essential to make sure AI is properly integrated into care delivery. It serves as the first line of defense against common AI pitfalls like distracting doctors or giving them too much confidence in its answers.

Here are the questions they came up with, and why each one is important:

Question 1: What type of information and format should AI present?

  • Why it’s important: Deciding how information gets presented is just as important as deciding what information to present. Format affects doctors’ attention, diagnostic accuracy, and possible interpretive biases.

Question 2: Should AI provide that information immediately, after initial review, or be toggled on and off by the physician?

  • Why it’s important: Immediate information can lead to a biased interpretation, while delayed cues can help physicians maintain their hard-earned diagnostic skills by allowing them to fully engage in each diagnosis.

Question 3: How does AI show its reasoning?

  • Why it’s important: Clear explanations of how AI arrives at a decision can highlight features that were ruled in or out, provide “what if” explanations, and more effectively align with doctors’ clinical reasoning.

Question 4: How does AI affect bias and complacency?

  • Why it’s important: When physicians lean too heavily on AI, they might rely less on their own critical thinking, widening the space for an accurate diagnosis to slip past them.

Question 5: What are the risks of long-term reliance on AI?

  • Why it’s important: Long-term AI reliance could end up eroding learned diagnostic abilities. We recently covered a great study in The Lancet that investigated the topic.

The Takeaway

AI holds enormous potential to improve clinical decision-making, but poor integration could end up doing more harm than good. This paper provides a solid framework to push the field from “Can AI detect disease?” to “How should AI help doctors detect disease without introducing new risks?”
