PHTI Delivers Mixed Reviews on Ambient Scribes

The Peterson Health Technology Institute’s latest technology review is here, and it delivers a decidedly mixed report card for the ambient AI scribes sweeping across the industry.

PHTI’s total count of ambient scribe vendors stands at over 60, but the bulk of its report focuses on the early experiences and lessons learned from the top 10 scribes across leading health systems.

According to PHTI’s conversations with health system execs, the primary driver of ambient scribe adoption has been addressing clinician burnout – and AI’s promise is clear on that front.

  • Mass General Brigham reported a 40% reduction in burnout during a six-week pilot.
  • MultiCare reported a 63% reduction in burnout and a 64% improvement in work-life balance.
  • Another study from the Permanente Medical Group found that 81% of patients felt their physician spent less time looking at their computer when using an ambient scribe.

Despite these drastic improvements, PHTI concludes that the financial returns and efficiency of ambient scribes remain unclear.

  • On one hand, enhanced documentation quality “could lead to higher reimbursements, potentially offsetting expenses.”
  • On the other hand, the cumulative costs “may be greater than any savings achieved through improved efficiency, reduced administrative burden, or reduced clinician attrition.”

It’s a bold conclusion considering the cost of losing a single provider, let alone the downstream effects of a burned-out workforce.

PHTI’s advice to health systems? Define the outcomes you’re looking for and then measure ambient AI’s performance and financial impacts against those goals. Bit of a no-brainer, but sound advice nonetheless. 

The Takeaway

Ambient scribes are seeing the fastest adoption of any recent healthcare technology that wasn’t accompanied by a regulatory mandate, and that’s mostly because of magic that’s hard to capture in a spreadsheet. That said, health systems will eventually need to justify these solutions beyond their impact on the clinical experience, and PHTI’s report brings a solid framework and standardized methodologies for bridging that gap.

AI Misses the Mark on Detecting Critical Conditions

Most health systems have already begun turning to AI to predict if patient health conditions will deteriorate, but a new study in Nature Communications Medicine suggests that current models aren’t cut out for the task. 

Virginia Tech researchers looked at several popular machine learning models cited in medical literature for predicting patient deterioration, then fed them datasets about the health of patients in ICUs or with cancer.

  • They then created synthesized test cases by altering patient metrics from the initial datasets, asking the models to predict potential health issues and generate risk scores.

AI missed the mark. For in-hospital mortality prediction, the models tested using the synthesized cases failed to recognize a staggering 66% of relevant patient injuries.

  • In some instances, the models failed to generate adequate mortality risk scores for every single test case.
  • That’s not great news, especially considering that algorithms that can’t recognize critical patient conditions can’t alert doctors when urgent action is needed.

The study authors point out that it’s extremely important for technology being used in patient care decisions to incorporate medical knowledge, and that “purely data-driven training alone is not sufficient.”

  • Not only did the study unearth “alarming deficiencies” in models being used for in-hospital mortality predictions, but it also turned up similar concerns with models predicting the prognosis of breast and lung cancer over five-year periods.
  • The authors conclude that a significant gap exists between raw data and the complexities of medical reality, so models trained solely on patient data are “grossly insufficient and have many dangerous blind spots.”

The Takeaway

The promise of AI remains just as immense as ever, but studies like this provide constant reminders that we need a diligent approach to adoption – not just for the technology itself but for the lives of the patients it touches. Ensuring that medical knowledge gets incorporated into clinical AI models also seems like a theme that we’re about to start hearing more often.

Stress Testing Ambient AI Scribes

Providers are lining up to see if ambient AI can live up to its promise of decreasing burnout while improving the patient experience… and researchers are starting to wonder the same thing.

A new study in JAMA Network Open investigated whether ambient AI scribes actually decrease clinical note burden, following 46 clinicians at the University of Pennsylvania Health System as they used Nuance’s DAX Copilot AI ambient scribe from July to August 2024.

  • Researchers combined EHR data with a clinician survey to determine both quantitatively and qualitatively whether ambient scribes actually make a positive impact.

Here’s what they found (with the arithmetic sanity-checked in a short sketch after the list). Over the course of the study, ambient scribe use was associated with:

  • 20.4% less time in notes per appointment (from 10.3 to 8.2 minutes)
  • 9.3% greater same-day appointment closure (from 66.2% to 72.4%)
  • 30.0% less after-hours work time per workday (from 50.6 to 35.4 minutes)
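
Those percentages fall straight out of the published before/after figures. Here’s a quick back-of-the-envelope check in Python (our own arithmetic, not the study’s code):

```python
# Relative change from baseline, using the figures reported in the study.
def pct_change(before: float, after: float) -> float:
    return (after - before) / before * 100

print(f"Time in notes:    {pct_change(10.3, 8.2):+.1f}%")   # -20.4%
print(f"Same-day closure: {pct_change(66.2, 72.4):+.1f}%")  # +9.4% (paper rounds to 9.3%)
print(f"After-hours time: {pct_change(50.6, 35.4):+.1f}%")  # -30.0%
```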

It’s tough to argue with the data. Ambient scribing definitely moves the needle on several important metrics, and even the less clear-cut stats still had a positive spin to them.

  • Note length was 20.6% greater with scribing (from 203k to 244k characters/wk)
  • However, the percentage of documentation that was typed by clinicians was 29.6% lower compared to baseline (from 11.2% to 7.9%)

The qualitative feedback told a different story. Even though clinicians reported feeling more engaged during patient conversations, “the need for substantial editing and proofreading of the AI-generated notes, which sometimes offset the time saved” was a recurring theme in the open-ended comments.

Ambient AI received a net promoter score of 0 on a scale of -100 to 100, meaning the clinicians were exactly as likely to recommend it as not to (the arithmetic is sketched below).

  • 13 clinicians would recommend ambient AI to others, 13 wouldn’t recommend it, and 11 didn’t feel strongly either way.
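
For anyone unfamiliar with the metric, a net promoter score is simply the share of promoters minus the share of detractors. Here’s that arithmetic applied to the reported counts (a minimal sketch of the standard formula, not the study’s own calculation):

```python
# Net promoter score (NPS) from the reported recommendation counts.
promoters, detractors, passives = 13, 13, 11
total = promoters + detractors + passives  # 37 clinicians responding

# NPS = (% promoters) - (% detractors), on a -100 to 100 scale.
nps = (promoters - detractors) / total * 100
print(f"NPS = {nps:.0f}")  # 0 -- promoters and detractors cancel out
```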

The mixed reviews could mean that the ambient scribe performed better for some users than others, but it could also mean that some clinicians were more diligent about checking the output.

The Takeaway

The evidence in favor of ambient AI scribes continues to pile up – even if the pajama-time reductions in this study didn’t live up to the promise on the box. Big technology shifts also come with adjustment periods, and this invited commentary did a great job highlighting the “real risk of automation bias” that comes with ambient AI, as well as the liability risk of missing its errors.

AI Enthusiasm Heats Up With Doctors

The unstoppable march of AI only seems to be gaining momentum, with an American Medical Association survey noting greater enthusiasm – and less apprehension – among physicians. 

The AMA’s Augmented Intelligence Research survey of 1,183 physicians found that the share of physicians whose enthusiasm about health AI outweighs their concerns rose to 35% in 2024, up from 30% in 2023.

  • The lion’s share of doctors recognize AI’s benefits, with 68% reporting at least some advantage in patient care (up from 63% in 2023).
  • In both years, about 40% of doctors were equally excited and concerned about health AI, with almost no change between surveys.

The positive sentiment could stem from more physicians using the tech in practice: the share of physicians using AI nearly doubled, from 38% in 2023 to 66% in 2024.

  • The most common uses now include medical research, clinical documentation, and drafting care plans or discharge summaries.

The dramatic drop in non-users (62% to 33%) over the course of a year is impressive for any new health tech, but doctors in the latest survey called out several needs that have to be addressed for adoption to continue.

  • 88% wanted a designated feedback channel
  • 87% wanted data privacy assurances
  • 84% wanted EHR integration

While physicians are still concerned about AI’s potential to harm data privacy or offer incorrect recommendations (and the liability risks that follow), they’re also optimistic about its ability to put a dent in burnout.

  • The biggest area of opportunity for AI according to 57% of physicians was “addressing administrative burden through automation,” reclaiming the top spot it reached in 2023.
  • That said, nearly half of physicians (47%) ranked increased AI oversight as the number one regulatory action needed to increase trust in AI enough to drive further adoption.

The Takeaway

It’s encouraging to see the shifting sentiment around health AI, especially as more doctors embrace its potential to cut down on burnout. Although the survey pinpoints better oversight as the key to maximizing trust, AI innovation is moving so quickly that it wouldn’t be surprising if not-too-distant breakthroughs were magical enough to inspire more confidence on their own.

First Snapshot of AI Oversight at U.S. Hospitals

A beautiful paper in Health Affairs brought us the first snapshot of AI oversight at U.S. hospitals, as well as a glimpse of the blind spots that are already adding up.

Data from 2,425 hospitals that participated in the 2023 AHA Annual Survey shed light on the differences in AI adoption and evaluation capacity at hospitals on both sides of a growing divide.

Two-thirds of hospitals reported using AI predictive models, a figure that’s likely only gone up over the last year. These models were most commonly used to:

  • predict inpatient health trajectories (92%)
  • identify high-risk outpatients (79%)
  • facilitate scheduling (51%)
  • perform a long tail of various administrative tasks

Bias blindness ran rampant. Although 61% of the AI-user hospitals evaluated accuracy using data from their own system (local evaluation), only 44% performed similar evaluations for bias.

  • Those are concerningly low percentages, considering that models trained on external datasets might not be effective in different settings, and that AI bias is a surefire way to exacerbate health inequities.
  • Hospitals that developed their own models, had high operating margins, and belonged to a health system were all more likely to conduct local evaluations. 

There’s a digital divide between hospitals with the resources to build models tailored to their own patients and those getting these solutions “off the shelf,” which increases the risk that the models were trained on data from patients who look very different from their own.

  • Only 54% of the AI hospitals designed their own models, while a larger share took the path of least resistance with algorithms supplied by their EHR developer (79%).
  • Combine that with the fact that most hospitals aren’t conducting local evaluations of bias, and there’s a major lack of systematic protection preventing these models from underrepresenting certain patients or adding unfair barriers to care.

The authors conclude that policymakers should “ensure the use of accurate and unbiased AI for patients regardless of where they receive care… including interventions designed to connect underresourced hospitals to evaluative capacity.”

The Takeaway

Without local evaluation of AI models, there’s a glaring blind spot in the oversight of algorithmic bias, and this study gives compelling evidence that more needs to be done to fill that void.

House Task Force AI Policy Recommendations

The House Bipartisan Task Force on Artificial Intelligence closed out the year with a bang, launching 273 pages of AI policy fireworks.

The report includes recommendations to “advance America’s leadership in AI innovation” across multiple industries, and the healthcare section definitely packed a punch.

The task force started by highlighting AI’s potential across a long list of use cases, which could have been the tracklist for healthcare’s greatest hits of 2024:

  • Drug Development – 300+ drug applications contained AI components this year.
  • Ambient AI – Burnout is bad. Patient time is good.
  • Diagnostics – AI can help cut down on $100B in annual costs tied to diagnostic errors.
  • Population Health – Population-level data can feed models to improve various programs.

While many expect the Trump administration’s “AI Czar” David Sacks to take a less-is-more approach to AI regulation, the task force urged Congress to consider guardrails in key areas:

  • Data Availability, Utility, and Quality
  • Privacy and Cybersecurity
  • Interoperability
  • Transparency
  • Liability

Several recommendations were offered to ensure these guardrails are effective, although the task force didn’t go as far as to prescribe specific regulations. 

  • The report suggested that Congress establish clear liability standards, given that such standards can affect clinical decision-making (the risk of penalties may change whether a provider relies on their own judgment or defers to an algorithm).
  • Another common theme was to maintain robust support for healthcare research related to AI, which included more NIH funding since it’s “critical to maintaining U.S. leadership.” 

The capstone recommendation – which was naturally well-received by the industry – was to support appropriate AI payment mechanisms without stifling innovation.

  • CMS calculates reimbursements by accounting for physician time, acuity of care, and practice expenses, yet that framework fails to adequately reimburse AI tools that impact those metrics.
  • The task force said there won’t be a “one size fits all” policy, so appropriate payment mechanisms should recognize AI’s impact across multiple technologies and settings (e.g., many AI use cases may fit into existing benefit categories or facility fees).

The Takeaway

AI arrived faster than policymakers could keep up, and it’ll be up to the incoming White House to get AI past its Wild West regulatory era without hobbling the pioneers driving the progress. One way or another, that’s a sign that AI is starting a new chapter, and we’re excited to see where the story goes in 2025.

Real-World Lessons From NYU’s ChatGPT Rollout

NYU Langone Health just lifted the curtain on its recent ChatGPT experiment, publishing an impressively candid look at all of the real-world data from its system-wide rollout.

A new article in JAMIA details the first six months of usage and cost metrics for NYU’s HIPAA-compliant version of ChatGPT 3.5 (dubbed GenAI Studio), and the numbers paint a promising picture of AI’s first steps in healthcare. Here’s a snapshot of the results:

Adoption

  • 1,007 users were onboarded (2.5% of NYU’s 40k employees)
  • GenAI Studio had 60 average weekly users (submitting 671 queries/week)
  • 27% of users interacted with GenAI Studio daily

Use Cases

  • Majority of users were from research and clinical departments
  • Most common use cases were writing, editing, data analysis, and idea generation
  • Examples: creating teaching materials for bedside nurses, drafting email responses, assessing clinical reasoning documentation, and SQL translation

Costs

  • 112M tokens were used during the six months of implementation 
  • Total token cost was $4,200 ($8,400 annualized)
  • Divide that cost by the 60 average weekly users, and it’s under $3 per user per week (worked through in the sketch below)
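
Here’s that math worked through using only the published figures (a back-of-the-envelope sketch, not NYU’s actual accounting):

```python
# Annualize the six-month token spend and spread it across average weekly users.
six_month_cost = 4_200                     # USD for ~112M tokens over six months
annualized_cost = six_month_cost * 2       # $8,400 per year
avg_weekly_users = 60

per_user_per_week = annualized_cost / 52 / avg_weekly_users
per_million_tokens = six_month_cost / 112  # 112M tokens total

print(f"${per_user_per_week:.2f} per user per week")    # ~$2.69, under $3
print(f"${per_million_tokens:.2f} per million tokens")  # ~$37.50
```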

While initial adoption seems a bit low at 60 weekly users out of the 40k employees who were offered access, the wide range of helpful use cases and relatively low costs make ChatGPT pretty close to a no-brainer for improving productivity.

  • User surveys also gave GenAI Studio high marks for ease of use and overall experience, although many users noted difficulties with prompt construction and felt underprepared without more in-depth training.

NYU’s biggest tip for GenAI implementations: continuous engagement and education are key for driving adoption. GenAI Studio saw large spikes in new users and utilization following “prompt-a-thons” where employees could practice and get feedback on prompt construction.

The Takeaway

For healthcare organizations watching from the wings, NYU Langone Health was as transparent as it gets regarding the benefits and challenges of its system-wide rollout, and the case study serves up a practical playbook for similar AI deployments.

Oracle Announces AI-Powered EHR, QHIN

This week’s Oracle Health Summit in Nashville was a rodeo of announcements, and by this time next year it sounds like we could see both an entirely new AI-powered EHR and a freshly minted QHIN.

The biggest headline from the event was the unveiling of a next-generation EHR powered by AI, which will allow clinicians to use voice for conversational search and interactions.

  • The EHR is being developed from scratch rather than built on the Cerner Millennium architecture, which Oracle itself reported had a “crumbling infrastructure” that wasn’t a proper foundation for its roadmap.
  • The new platform will also embed Oracle’s AI agent and data analysis suite across all clinical workflows, while integrating with Oracle Health Command Center to provide better visibility into patient flow and staffing insights.

Not content with just a fancy new EHR, Oracle also announced that it’s pursuing a Qualified Health Information Network designation, making it the latest EHR vendor to jump from CommonWell onto the TEFCA bandwagon.

  • TEFCA sets technical requirements and exchange policies for clinical information sharing, and Oracle will now undergo robust technology and security testing before receiving its designation.
  • Oracle said that its guiding goal is to help streamline information exchange between payors and providers, simplify regulatory compliance, and help accelerate the adoption of VBC.

The news arrives as Oracle recorded its largest net hospital loss on record in 2023. The only competitor to gain ground was long-time rival and current QHIN Epic, which welcomed Oracle’s QHIN application with a hilariously backhanded press release.

  • “Interoperability is a team sport, and Epic looks forward to Oracle Health getting off the sidelines and joining the game.” Fighting words for a company with information blocking lawsuits piling up.

The Takeaway

Regardless of how these moves play out, Oracle is undoubtedly taking some big shots that are refreshing to see. Only time will tell whether doctors who have spent years clicking through their EHR will be able to make the shift to voice, or if Oracle’s QHIN tech audit will go better than its VA rollout.

Patients Ready For GenAI, But Not For Everything

Bain & Company’s US Frontline of Consumer Healthcare Survey turned up the surprising result that patients are more comfortable with generative AI “analyzing their radiology scan and making a diagnosis than answering the phone at their doctor’s office.”

That’s quite the headline, but the authors were quick to point out that it’s probably less of a measure of confidence in GenAI’s medical expertise than a sign that patients aren’t yet comfortable interacting with the technology directly.

Here’s the breakdown of patient comfort with different GenAI use cases:

[Chart: patient comfort with GenAI, by use case]

While it does appear that patients are more prepared to have GenAI supporting their doctor than engaging with it themselves, it’s just as notable that less than half reported feeling comfortable with even a single GenAI application in healthcare.

  • No “comfortable” response was above 37%, and after adding in the “neutral” votes, there was still only one application that broke 50%: note taking during appointments.
  • The fact that only 19% felt comfortable with GenAI answering calls for providers or payors could also just be a sign that patients would far rather talk to a human in either situation, regardless of the tech’s capabilities.

The next chart looks at GenAI perceptions among healthcare workers:

[Chart: GenAI perceptions among physicians and administrators]

Physicians and administrators are feeling a similar mix of excitement and apprehension, sharing a generally positive view of GenAI’s potential to alleviate admin burdens and clinician workloads, as well as a concern that it could undermine the patient-provider relationship.

  • Worries over new technology threatening the relationship between patients and providers aren’t new, and we just witnessed them play out at an accelerated pace with telehealth.
  • Despite initial fears, the value of the relationship prevailed, which Bain backed up with the fact that 61% of patients who use telehealth only do so with their own provider.

Whether you’re measuring by patient or provider comfort, GenAI’s progress will be closely tied to trust in the technology on an application-by-application basis. Trust takes time to build and first impressions are key, so this survey underscores the importance of nailing the user experience early on.

The Takeaway

The story of generative AI in healthcare is just getting started, and as we saw with telehealth, the first few pages could take some serious willpower to get through. New technologies mean new workflows, revenue models, and countless other barriers to overcome, but trust will only keep building every step of the way. Plus, the next chapter looks pretty dang good.

Storytime at Epic UGM 2024

Epic’s “Storytime” User Group Meeting is officially a wrap, and the number of updates shared at the event would be hard-pressed to keep with the theme and fit in a children’s book.

CEO Judy Faulkner took the podium dressed as Mother Goose to tell the tale of Epic’s recent advances, AI roadmap, and even a “25-to-50-year” company plan.

It wouldn’t be a 2024 UGM without AI hogging the spotlight, and the EHR behemoth certainly delivered on that front. Highlights included:

  • Epic currently has two killer use cases for AI: medical scribes (186 user orgs) and draft responses to portal messages (150 user orgs). Those counts reflect the number of “user” orgs, but it wasn’t clear how many have done system-wide deployments.
  • Epic is actively working on over 100 new GenAI solutions, ranging from auto-populating forms and discharge papers to delivering evidence-based insights at the point of care.
  • Epic Cosmos’ Look-Alikes AI tool is now live at 65 sites, helping identify rare diseases by cross-referencing symptoms in its database of over 226M patient records and connecting physicians with kindred cases.

The teasers stole the show, and physicians (or payors!) have plenty to look forward to if Epic can deliver.

  • An upcoming Best Care Choices for My Patient tool will provide treatment recommendations at the point of care based on what worked / didn’t work for similar patients. NYU Langone and Parkview Health are already test-driving the solution.
  • A new Payor Platform is now available to all health system customers, with AI features to streamline prior auths, manage claims denials, and connect provider directories. Epic is also exploring how to cut out clearinghouse middlemen by sending PA documentation directly to payors.
  • By the end of next year, MyChart’s GenAI will be able to pull in test results, medications, and other patient details to better customize draft messages and help automatically queue up orders for labs and prescriptions.
  • A Teamwork staff scheduling application is sparse on details but on the way “soon.”

The Takeaway

Given how much time clinicians spend in the EHR and the treasure trove of data it holds, it isn’t a surprise that Epic has become an integral component of its health systems’ AI strategy. That said, user group meetings are meant to excite user groups, and we’ll know soon enough how many of these announcements were just Storytime.
