The Healthcare AI Adoption Index

Bessemer Venture Partners’ market reports are always some of the best in the business, but its recent Healthcare AI Adoption Index might just be its finest work yet.

The Healthcare AI Adoption Index is based on survey data from 400+ execs across Payors, Providers, and Pharma – breaking down how buyers are approaching GenAI applications, what jobs-to-be-done they’re prioritizing, and where their projects sit on the adoption curve.

Here’s a look at what they found:

  • AI is high on the agenda across the board, with AI budgets outpacing IT spend in each of the three segments. Over half (54%) are seeing ROI within the first 12 months.
  • Only a third of AI pilots end up reaching production, held back by everything from security and data readiness to integration costs and limited in-house expertise.
  • Despite all the trendsetters we cover on a weekly basis, only 15% of active AI projects are being driven by startups. The rest are being built internally or led by the usual suspects like major EHRs and Big Tech.
  • That said, 48% of executives say they prefer working with startups over incumbents, and Bessemer encourages founders to co-develop solutions with their customers and lean in on partnerships that provide access to distribution, proprietary datasets, and credibility.

The highlight of the report was Bessemer’s analysis of the 59 jobs-to-be-done as potential use cases for AI. 

  • Of the 22 jobs-to-be-done for Payors (claims, network, member, pricing), 19 jobs for Pharma (preclinical, clinical, marketing, sales), and 18 jobs for Providers (care delivery, RCM) – 45% are still in the ideation or proof of concept phase.
  • Providers are ahead in POC experimentation, while most Payor and Pharma use cases remain in the ideation phase. Here’s a beautiful look at where different use cases stand.

Bessemer topped off its analysis with the debut of its AI Dx Index, which factors in market size, urgency, and current adoption to help startups map and prioritize AI use cases. One of the best graphics so far this year.
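The report doesn't publish a formula for the AI Dx Index, but the core idea of blending market size, urgency, and current adoption into a single prioritization score can be sketched. Everything below is a hypothetical illustration under assumed weights and normalized inputs, not Bessemer's actual methodology:

```python
# Hypothetical sketch of an index like Bessemer's AI Dx Index.
# Weights, inputs, and the function name are all assumptions for illustration.
def dx_score(market_size, urgency, adoption, weights=(0.4, 0.4, 0.2)):
    """Each input is normalized to 0-1. Low current adoption *raises*
    the score, since untouched use cases leave more room for startups."""
    w_market, w_urgency, w_adoption = weights
    return w_market * market_size + w_urgency * urgency + w_adoption * (1 - adoption)

# A large, urgent, under-served use case scores near the top of the range.
print(round(dx_score(market_size=0.9, urgency=0.8, adoption=0.2), 2))  # 0.84
```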

The Takeaway

Healthcare’s AI-powered paradigm shift is kicking into overdrive, and Bessemer just delivered one of the most comprehensive views of where the puck is going that we’ve seen to date.

K Health’s AI Clinical Recommendations Rival Doctors in Real-World Setting

Real-world comparisons of AI recommendations and doctors’ clinical decisions have been few and far between, but a new study in the Annals of Internal Medicine gave us a great look at how performance stacks up with actual patients.

The early verdict? AI came out on top, but that doesn’t mean doctors should pack their bags quite yet.

Researchers from Cedars-Sinai and Tel Aviv University compared recommendations made by K Health’s AI Physician Mode to the final decisions made by physicians for 461 virtual urgent care visits. Here’s what they found:

  • In 68% of cases, the AI and physician recommendations were rated as equal
  • AI rated better on 21% of cases, versus just 11% for physicians 
  • AI recommendations were rated “optimal” in 77% of cases, versus 67% for physicians

Although AI takes the cake with the top line numbers, unpacking the data reveals some not-too-surprising strengths and weaknesses. AI was primarily rated better when physicians:

  • Missed important lab tests (22.8%)
  • Didn’t follow clinical guidelines (16.3%)
  • Failed to refer patients to specialists or the ED if needed (15.2%)
  • Overlooked risk factors and red flags (4.4%)

Physicians beat out AI when the human elements of care delivery came into play, such as adapting to new information or making nuanced decisions. Physicians were rated better when:

  • AI made unnecessary ED referrals (8.0%)
  • There was evolving or inconsistent information during consultations (6.2%)
  • They made necessary referrals that the AI missed (5.9%)
  • They correctly adjusted diagnoses based on visual examinations (4.4%)

While the study focused on the exact types of common conditions that AI excels at diagnosing (respiratory, urinary, vaginal, eye, and dental), it’s still impressive to see the outperformance in the messy trenches of a real clinical setting – a far cry from the static medical exams that have been the go-to for similar evaluations. 

The Takeaway

For AI to truly transform healthcare, it’ll need to do a lot more than automate administrative work and back office operations. This study demonstrates AI’s potential to enhance decision-making in actual medical practice, and points toward a future where delivering high-quality patient care becomes genuinely scalable.

PHTI Delivers Mixed Reviews on Ambient Scribes

The Peterson Health Technology Institute’s latest technology review is here, and it had a decidedly mixed report card for the ambient AI scribes sweeping across the industry. 

PHTI’s total count of ambient scribe vendors stands at over 60, but the bulk of its report focuses on the early experiences and lessons learned from the top 10 scribes across leading health systems.

According to PHTI’s conversations with health system execs, the primary driver of ambient scribe adoption has been addressing clinician burnout – and AI’s promise is clear on that front.

  • Mass General Brigham reported a 40% reduction in burnout during a six-week pilot.
  • MultiCare reported a 63% reduction in burnout and a 64% improvement in work-life balance.
  • Another study from the Permanente Medical Group found that 81% of patients felt their physician spent less time looking at their computer when using an ambient scribe.

Despite these drastic improvements, PHTI concludes that the financial returns and efficiency of ambient scribes remain unclear.

  • On one hand, enhanced documentation quality “could lead to higher reimbursements, potentially offsetting expenses.”
  • On the other hand, the cumulative costs “may be greater than any savings achieved through improved efficiency, reduced administrative burden, or reduced clinician attrition.”

It’s a bold conclusion considering the cost of losing a single provider, let alone the downstream effects of having a burned-out workforce. 

PHTI’s advice to health systems? Define the outcomes you’re looking for and then measure ambient AI’s performance and financial impacts against those goals. Bit of a no-brainer, but sound advice nonetheless. 

The Takeaway

Ambient scribes are seeing the fastest adoption of any recent healthcare technology that wasn’t accompanied by a regulatory mandate, and that’s mostly because of magic that’s hard to capture in a spreadsheet. That said, health systems will eventually need to justify these solutions beyond their impact on the clinical experience, and PHTI’s report brings a solid framework and standardized methodologies for bridging that gap.

AI Misses the Mark on Detecting Critical Conditions

Most health systems have already begun turning to AI to predict if patient health conditions will deteriorate, but a new study in Nature Communications Medicine suggests that current models aren’t cut out for the task. 

Virginia Tech researchers looked at several popular machine learning models cited in medical literature for predicting patient deterioration, then fed them datasets about the health of patients in ICUs or with cancer.

  • They then created test cases for the models to predict potential health issues and risk scores in the event that patient metrics were changed from the initial dataset.

AI missed the mark. For in-hospital mortality prediction, the models tested using the synthesized cases failed to recognize a staggering 66% of relevant patient injuries.

  • In some instances, the models failed to generate adequate mortality risk scores for every single test case.
  • That’s not great news, especially since algorithms that can’t recognize critical patient conditions can’t alert doctors when urgent action is needed.
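The perturbation-style stress test described above can be sketched in a few lines: take a baseline patient, worsen one metric, and check whether the model's risk score responds. The toy model, feature names, and values here are assumptions for illustration, not the study's actual models or data:

```python
# Hypothetical sketch of the study's stress-test idea: if a patient metric
# worsens, a sound mortality-risk model's score should rise.
def risk_responds(model, baseline, feature, worse_value):
    perturbed = dict(baseline, **{feature: worse_value})
    return model(perturbed) > model(baseline)

# Toy "model" with a blind spot: it keys on age and ignores oxygen saturation,
# the kind of purely data-driven gap the study flagged.
toy_model = lambda p: 0.05 + 0.01 * max(0, p["age"] - 60)

patient = {"age": 70, "spo2": 97}
print(risk_responds(toy_model, patient, "spo2", 80))  # False: risk didn't budge
```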

The study authors point out that it’s extremely important for technology being used in patient care decisions to incorporate medical knowledge, and that “purely data-driven training alone is not sufficient.”

  • Not only did the study unearth “alarming deficiencies” in models being used for in-hospital mortality predictions, but it also turned up similar concerns with models predicting the prognosis of breast and lung cancer over five-year periods.
  • The authors conclude that a significant gap exists between raw data and the complexities of medical reality, so models trained solely on patient data are “grossly insufficient and have many dangerous blind spots.”

The Takeaway

The promise of AI remains just as immense as ever, but studies like this provide constant reminders that we need a diligent approach to adoption – not just for the technology itself but for the lives of the patients it touches. Ensuring that medical knowledge gets incorporated into clinical AI models also seems like a theme that we’re about to start hearing more often.

Stress Testing Ambient AI Scribes

Providers are lining up to see if ambient AI can live up to its promise of decreasing burnout while improving the patient experience… and researchers are starting to wonder the same thing.

A new study in JAMA Network Open investigated whether ambient AI scribes actually decrease clinical note burden, following 46 clinicians at the University of Pennsylvania Health System as they used Nuance’s DAX Copilot AI ambient scribe from July to August 2024.

  • Researchers combined EHR data with a clinician survey to determine both quantitatively and qualitatively whether ambient scribes actually make a positive impact.

Here’s what they found. Over the course of the study, ambient scribe use was associated with:

  • 20.4% less time in notes per appointment (from 10.3 to 8.2 minutes)
  • 9.3% greater same-day appointment closure (from 66.2% to 72.4%)
  • 30.0% less after-hours work time per workday (from 50.6 to 35.4 minutes)

It’s tough to argue with the data. Ambient scribing definitely moves the needle on several important metrics, and even the less clear-cut stats still had a positive spin to them.

  • Note length was 20.6% greater with scribing (from 203k to 244k characters/wk)
  • However, the percentage of documentation that was typed by clinicians was 29.6% lower compared to baseline (from 11.2% to 7.9%)
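For readers checking the math, the headline figures are simple relative changes from the before/after values. A quick sketch (the note-length figure comes out to 20.2% using the rounded weekly character counts, so the reported 20.6% presumably reflects unrounded data):

```python
# Relative change between the study's before/after values.
def pct_change(before, after):
    return round(100 * (after - before) / before, 1)

print(pct_change(10.3, 8.2))   # -20.4  (time in notes per appointment, minutes)
print(pct_change(50.6, 35.4))  # -30.0  (after-hours work per workday, minutes)
print(pct_change(203, 244))    #  20.2  (note length, k characters/wk; reported as 20.6%)
```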

The qualitative feedback told a different story. Even though clinicians reported feeling more engaged during patient conversations, “the need for substantial editing and proofreading of the AI-generated notes, which sometimes offset the time saved” was a recurring theme in the open-ended comments.

Ambient AI received a net promoter score of 0 on a scale of -100 to 100, meaning the clinicians were as likely to not recommend it as they were to recommend it.

  • 13 clinicians would recommend ambient AI to others, 13 wouldn’t recommend it, and 11 didn’t feel strongly either way.
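The NPS arithmetic behind that split is straightforward: the percentage who would recommend minus the percentage who wouldn't, so with 13 clinicians on each side of the 37 respondents, the score nets out to exactly zero.

```python
# Net promoter score: % who'd recommend minus % who wouldn't,
# using the study's split of 37 surveyed clinicians.
def nps(promoters, detractors, passives):
    total = promoters + detractors + passives
    return round(100 * (promoters - detractors) / total)

print(nps(promoters=13, detractors=13, passives=11))  # 0 -- the two camps cancel out
```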

The mixed reviews could mean that the ambient scribe performed better for some users and worse for others, but it could also mean that some clinicians were more diligent at checking the output.

The Takeaway

The evidence in favor of ambient AI scribes continues to pile up – even if the pajama-time reductions in this study didn’t live up to the promise on the box. Big technology shifts also come with adjustment periods, and this invited commentary did a great job highlighting the “real risk of automation bias” that comes with ambient AI, as well as the liability risk of missing its errors.

AI Enthusiasm Heats Up With Doctors

The unstoppable march of AI only seems to be gaining momentum, with an American Medical Association survey noting greater enthusiasm – and less apprehension – among physicians. 

The AMA’s Augmented Intelligence Research survey of 1,183 physicians found that those whose enthusiasm outweighs their concerns with health AI rose to 35% in 2024, up from 30% in 2023. 

  • The lion’s share of doctors recognize AI’s benefits, with 68% reporting at least some advantage in patient care (up from 63% in 2023).
  • In both years, about 40% of doctors were equally excited and concerned about health AI, with almost no change between surveys.

The positive sentiment could be stemming from more physicians using the tech in practice. Physician use of AI nearly doubled, from 38% in 2023 to 66% in 2024.

  • The most common uses now include medical research, clinical documentation, and drafting care plans or discharge summaries.

The dramatic drop in non-users (62% to 33%) over the course of a year is impressive for any new health tech, but doctors in the latest survey called out several needs that have to be addressed for adoption to continue.

  • 88% wanted a designated feedback channel
  • 87% wanted data privacy assurances
  • 84% wanted EHR integration

While physicians are still concerned about the potential of AI to harm data privacy or offer incorrect recommendations (and liability risks), they’re also optimistic about its ability to put a dent in burnout.

  • The biggest area of opportunity for AI according to 57% of physicians was “addressing administrative burden through automation,” reclaiming the top spot it reached in 2023.
  • That said, nearly half of physicians (47%) ranked increased AI oversight as the number one regulatory action needed to increase trust in AI enough to drive further adoption.

The Takeaway

It’s encouraging to see the shifting sentiment around health AI, especially as more doctors embrace its potential to cut down on burnout. Although the survey pinpoints better oversight as the key to maximizing trust, AI innovation is moving so quickly that it wouldn’t be surprising if not-too-distant breakthroughs were magical enough to inspire more confidence on their own.

First Snapshot of AI Oversight at U.S. Hospitals

A beautiful paper in Health Affairs brought us the first snapshot of AI oversight at U.S. hospitals, as well as a glimpse of the blindspots that are already adding up.

Data from 2,425 hospitals that participated in the 2023 AHA Annual Survey shed light on the differences in AI adoption and evaluation capacity at hospitals on both sides of a growing divide.

Two-thirds of hospitals reported using AI predictive models, a figure that’s likely only gone up over the last year. These models were most commonly used to:

  • predict inpatient health trajectories (92%)
  • identify high-risk outpatients (79%)
  • facilitate scheduling (51%)
  • perform a long tail of various administrative tasks

Bias blindness ran rampant. Although 61% of the AI-user hospitals evaluated accuracy using data from their own system (local evaluation), only 44% performed similar evaluations for bias.

  • Those are some concerningly low percentages considering that models trained on external datasets might not be effective in different settings, and since AI bias is a surefire way to exacerbate health inequities.
  • Hospitals that developed their own models, had high operating margins, and belonged to a health system were all more likely to conduct local evaluations. 

There’s a digital divide between hospitals with the resources to build models tailored to their own patients and those who are getting these solutions “off the shelf,” which increases the risk that they were trained on data from patients that might look very different from their own.

  • Only 54% of the AI-using hospitals designed their own models, while a larger share (79%) took the path of least resistance with algorithms supplied by their EHR developer.
  • Combine that with the fact that most hospitals aren’t conducting local evaluations of bias, and there’s a major lack of systematic protection preventing these models from underrepresenting certain patients or adding unfair barriers to care.

The authors conclude that policymakers should “ensure the use of accurate and unbiased AI for patients regardless of where they receive care… including interventions designed to connect underresourced hospitals to evaluative capacity.”

The Takeaway

Without the local evaluation of AI models, there’s a glaring blindspot in the oversight of algorithmic bias, and this study gives compelling evidence that more needs to be done to fill that void.

House Task Force AI Policy Recommendations

The House Bipartisan Task Force on Artificial Intelligence closed out the year with a bang, launching 273 pages of AI policy fireworks.

The report includes recommendations to “advance America’s leadership in AI innovation” across multiple industries, and the healthcare section definitely packed a punch.

The task force started by highlighting AI’s potential across a long list of use cases, which could have been the tracklist for healthcare’s greatest hits of 2024:

  • Drug Development – 300+ drug applications contained AI components this year.
  • Ambient AI – Burnout is bad. Patient time is good.
  • Diagnostics – AI can help cut down on $100B in annual costs tied to diagnostic errors.
  • Population Health – Population-level data can feed models to improve various programs.

While many expect the Trump administration’s “AI Czar” David Sacks to take a less-is-more approach to AI regulation, the task force urged Congress to consider guardrails in key areas:

  • Data Availability, Utility, and Quality
  • Privacy and Cybersecurity
  • Interoperability
  • Transparency
  • Liability

Several recommendations were offered to ensure these guardrails are effective, although the task force didn’t go as far as to prescribe specific regulations. 

  • The report suggested that Congress establish clear liability standards given that they can affect clinical-decision making (the risk of penalties may change whether a provider relies on their judgment or defers to an algorithm).
  • Another common theme was to maintain robust support for healthcare research related to AI, which included more NIH funding since it’s “critical to maintaining U.S. leadership.” 

The capstone recommendation – which was naturally well-received by the industry – was to support appropriate AI payment mechanisms without stifling innovation.

  • CMS calculates reimbursements by accounting for physician time, acuity of care, and practice expenses, yet fails to adequately reimburse AI for impacting those metrics.
  • The task force said there won’t be a “one size fits all” policy, so appropriate payment mechanisms should recognize AI’s impact across multiple technologies and settings (Ex. many AI use cases may fit into existing benefit categories or facility fees).

The Takeaway

AI arrived faster than policy makers could keep up, and it’ll be up to the incoming White House to get AI past its Wild West regulatory era without hobbling the pioneers driving the progress. One way or another, that’s a sign that AI is starting a new chapter, and we’re excited to see where the story goes in 2025.

Real-World Lessons From NYU’s ChatGPT Roll Out

NYU Langone Health just lifted the curtain on its recent ChatGPT experiment, publishing an impressively candid look at all of the real-world data from its system-wide roll out.

A new article in JAMIA details the first six months of usage and cost metrics for NYU’s HIPAA-compliant version of ChatGPT 3.5 (dubbed GenAI Studio), and the numbers paint a promising picture of AI’s first steps in healthcare. Here’s a snapshot of the results:

Adoption

  • 1,007 users were onboarded (2.5% of NYU’s 40k employees)
  • GenAI Studio had 60 average weekly users (submitting 671 queries/week)
  • 27% of users interacted with GenAI Studio daily (Table: Usage Data)

Use Cases

  • Majority of users were from research and clinical departments
  • Most common use cases were writing, editing, data analysis, and idea generation
  • Examples: creating teaching materials for bedside nurses, drafting email responses, assessing clinical reasoning documentation, and SQL translation

Costs

  • 112M tokens were used during the six months of implementation 
  • Total token cost was $4,200 ($8,400 annualized)
  • Divide that cost by the 60 average weekly users, and it’s under $3 per user per week 
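Those per-user economics are easy to verify from the reported figures: doubling the six-month token cost and dividing across 52 weeks and 60 average weekly users lands comfortably under the $3 mark.

```python
# Back-of-the-envelope check on NYU's GenAI Studio costs, using the
# figures reported in the JAMIA article.
six_month_cost = 4200            # USD, total token cost over six months
annualized = six_month_cost * 2  # -> 8400
weekly_users = 60                # average weekly active users

cost_per_user_week = annualized / 52 / weekly_users
print(f"${annualized} annualized, ${cost_per_user_week:.2f} per user per week")
```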

While initial adoption seems a bit low at 60 weekly users out of the 40k employees that were offered access, the wide range of helpful use cases and relatively low costs make ChatGPT pretty close to a no-brainer for improving productivity.

  • User surveys also gave GenAI Studio high marks for ease of use and overall experience, although many users noted difficulties with prompt construction and felt underprepared without more in-depth training.

NYU’s biggest tip for GenAI implementations: continuous engagement and education is key for driving adoption. GenAI Studio saw large spikes in new users and utilization following “prompt-a-thons” where employees could practice and get feedback on prompt construction.

The Takeaway

For healthcare organizations watching from the wings, NYU Langone Health was as transparent as it gets regarding the benefits and challenges of its system-wide roll out, and the case study serves up a practical playbook for similar AI deployments.

Oracle Announces AI-Powered EHR, QHIN

This week’s Oracle Health Summit in Nashville was a rodeo of announcements, and by this time next year it sounds like we could see both an entirely new AI-powered EHR and a freshly minted QHIN.

The biggest headline from the event was the unveiling of a next-generation EHR powered by AI, which will allow clinicians to use voice for conversational search and interactions.

  • The EHR is being developed from scratch rather than built on the Cerner Millennium architecture, which Oracle itself reported had a “crumbling infrastructure” that wasn’t a proper foundation for its roadmap.
  • The new platform will also embed Oracle’s AI agent and data analysis suite across all clinical workflows, while integrating with Oracle Health Command Center to provide better visibility into patient flow and staffing insights.

Not content with just a fancy new EHR, Oracle also announced that it’s pursuing a Qualified Health Information Network designation, making it the latest EHR vendor to leave CommonWell and hop on the TEFCA bandwagon.

  • TEFCA sets technical requirements and exchange policies for clinical information sharing, and Oracle will now undergo robust technology and security testing before receiving its designation.
  • Oracle said that its guiding goal is to help streamline information exchange between payors and providers, simplify regulatory compliance, and help accelerate the adoption of VBC.

The news arrives as Oracle recorded its largest net hospital loss on record in 2023. The only competitor to gain ground was long-time rival and current QHIN Epic, which welcomed Oracle’s QHIN application with a hilariously backhanded press release.

  • “Interoperability is a team sport, and Epic looks forward to Oracle Health getting off the sidelines and joining the game.” Fighting words for a company with information blocking lawsuits piling up.

The Takeaway

Regardless of how these moves play out, Oracle is undoubtedly taking some big shots that are refreshing to see. Only time will tell whether doctors who have spent years clicking through their EHR will be able to make the shift to voice, or if Oracle’s QHIN tech audit will go better than its VA rollout.
