House Task Force AI Policy Recommendations

The House Bipartisan Task Force on Artificial Intelligence closed out the year with a bang, launching 273 pages of AI policy fireworks.

The report includes recommendations to “advance America’s leadership in AI innovation” across multiple industries, and the healthcare section definitely packed a punch.

The task force started by highlighting AI’s potential across a long list of use cases, which could have been the tracklist for healthcare’s greatest hits of 2024:

  • Drug Development – 300+ drug applications contained AI components this year.
  • Ambient AI – Burnout is bad. Patient time is good.
  • Diagnostics – AI can help cut down on $100B in annual costs tied to diagnostic errors.
  • Population Health – Population-level data can feed models to improve various programs.

While many expect the Trump administration’s “AI Czar” David Sacks to take a less-is-more approach to AI regulation, the task force urged Congress to consider guardrails in key areas:

  • Data Availability, Utility, and Quality
  • Privacy and Cybersecurity
  • Interoperability
  • Transparency
  • Liability

Several recommendations were offered to ensure these guardrails are effective, although the task force didn’t go as far as to prescribe specific regulations. 

  • The report suggested that Congress establish clear liability standards, since liability rules can affect clinical decision-making (the risk of penalties may change whether a provider relies on their own judgment or defers to an algorithm).
  • Another common theme was maintaining robust support for AI-related healthcare research, including more NIH funding, since it’s “critical to maintaining U.S. leadership.”

The capstone recommendation – which was naturally well-received by the industry – was to support appropriate AI payment mechanisms without stifling innovation.

  • CMS calculates reimbursement based on physician time, acuity of care, and practice expenses, yet it doesn’t adequately account for AI’s impact on those inputs (a simplified look at the payment math follows this list).
  • The task force said there won’t be a “one size fits all” policy, so appropriate payment mechanisms should recognize AI’s impact across multiple technologies and settings (e.g. many AI use cases may fit into existing benefit categories or facility fees).
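
For context, that first bullet maps loosely onto the relative value units (RVUs) behind the Medicare Physician Fee Schedule. Here’s a simplified sketch of the payment math with hypothetical values (geographic adjustments omitted), which shows where an AI tool’s impact would have to land for reimbursement to change:

```python
# Simplified sketch of a Medicare Physician Fee Schedule payment.
# RVU values are hypothetical and geographic (GPCI) adjustments are omitted.
work_rvu = 1.92              # physician work (time, skill, intensity)
practice_expense_rvu = 2.03  # practice expenses (staff, equipment, supplies)
malpractice_rvu = 0.15       # professional liability
conversion_factor = 33.0     # illustrative dollars-per-RVU rate

payment = (work_rvu + practice_expense_rvu + malpractice_rvu) * conversion_factor
print(f"${payment:.2f}")  # ~$135 for this hypothetical service
```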

The Takeaway

AI arrived faster than policy makers could keep up, and it’ll be up to the incoming White House to get AI past its Wild West regulatory era without hobbling the pioneers driving the progress. One way or another, that’s a sign that AI is starting a new chapter, and we’re excited to see where the story goes in 2025.

Real-World Lessons From NYU’s ChatGPT Rollout

NYU Langone Health just lifted the curtain on its recent ChatGPT experiment, publishing an impressively candid look at the real-world data from its system-wide rollout.

A new article in JAMIA details the first six months of usage and cost metrics for NYU’s HIPAA-compliant version of ChatGPT 3.5 (dubbed GenAI Studio), and the numbers paint a promising picture of AI’s first steps in healthcare. Here’s a snapshot of the results:

Adoption

  • 1,007 users were onboarded (2.5% of NYU’s 40k employees)
  • GenAI Studio had 60 average weekly users (submitting 671 queries/week)
  • 27% of users interacted with GenAI Studio daily

Use Cases

  • Majority of users were from research and clinical departments
  • Most common use cases were writing, editing, data analysis, and idea generation
  • Examples: creating teaching materials for bedside nurses, drafting email responses, assessing clinical reasoning documentation, and SQL translation

Costs

  • 112M tokens were used during the six months of implementation 
  • Total token cost was $4,200 ($8,400 annualized)
  • Divide the annualized cost across the 60 average weekly users, and it works out to under $3 per user per week (quick math below)
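
Here’s the quick math behind that last bullet, using only the figures reported in the paper:

```python
# Back-of-the-envelope cost math from NYU's reported figures.
six_month_token_cost = 4_200                 # USD for ~112M tokens over six months
annualized_cost = six_month_token_cost * 2   # = $8,400 per year
avg_weekly_users = 60

cost_per_user_per_week = annualized_cost / 52 / avg_weekly_users
print(f"${cost_per_user_per_week:.2f} per user per week")  # ~$2.69
```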

While initial adoption seems a bit low at 60 weekly users out of the 40k employees that were offered access, the wide range of helpful use cases and relatively low costs make ChatGPT pretty close to a no-brainer for improving productivity.

  • User surveys also gave GenAI Studio high marks for ease of use and overall experience, although many users noted difficulties with prompt construction and felt underprepared without more in-depth training.

NYU’s biggest tip for GenAI implementations: continuous engagement and education are key for driving adoption. GenAI Studio saw large spikes in new users and utilization following “prompt-a-thons” where employees could practice and get feedback on prompt construction.

The Takeaway

For healthcare organizations watching from the wings, NYU Langone Health was as transparent as it gets regarding the benefits and challenges of its system-wide rollout, and the case study serves up a practical playbook for similar AI deployments.

Oracle Announces AI-Powered EHR, QHIN

This week’s Oracle Health Summit in Nashville was a rodeo of announcements, and by this time next year it sounds like we could see both an entirely new AI-powered EHR and a freshly minted QHIN.

The biggest headline from the event was the unveiling of a next-generation EHR powered by AI, which will allow clinicians to use voice for conversational search and interactions.

  • The EHR is being developed from scratch rather than built on the Cerner Millennium architecture, which Oracle itself reported had a “crumbling infrastructure” that wasn’t a proper foundation for its roadmap.
  • The new platform will also embed Oracle’s AI agent and data analysis suite across all clinical workflows, while integrating with Oracle Health Command Center to provide better visibility into patient flow and staffing insights.

Not content with just a fancy new EHR, Oracle also announced that it’s pursuing a Qualified Health Information Network designation, making it the latest EHR vendor to jump from CommonWell onto the TEFCA bandwagon.

  • TEFCA sets technical requirements and exchange policies for clinical information sharing, and Oracle will now undergo robust technology and security testing before receiving its designation.
  • Oracle said that its guiding goal is to help streamline information exchange between payors and providers, simplify regulatory compliance, and help accelerate the adoption of VBC.

The news arrives after Oracle recorded its largest net hospital loss on record in 2023. The only competitor to gain ground was long-time rival and current QHIN Epic, which welcomed Oracle’s QHIN application with a hilariously backhanded press release.

  • “Interoperability is a team sport, and Epic looks forward to Oracle Health getting off the sidelines and joining the game.” Fighting words for a company with information blocking lawsuits piling up.

The Takeaway

Regardless of how these moves play out, Oracle is undoubtedly taking some big shots that are refreshing to see. Only time will tell whether doctors who have spent years clicking through their EHR will be able to make the shift to voice, or if Oracle’s QHIN tech audit will go better than its VA rollout.

Patients Ready For GenAI, But Not For Everything

Bain & Company’s US Frontline of Consumer Healthcare Survey turned up the surprising result that patients are more comfortable with generative AI “analyzing their radiology scan and making a diagnosis than answering the phone at their doctor’s office.”

That’s quite the headline, but the authors were quick to point out that it’s probably less of a measure of confidence in GenAI’s medical expertise than a sign that patients aren’t yet comfortable interacting with the technology directly.

Bain broke down patient comfort with a range of GenAI use cases, and a clear pattern emerged.

While it does appear that patients are more prepared to have GenAI supporting their doctor than engaging with it themselves, it’s just as notable that less than half reported feeling comfortable with even a single GenAI application in healthcare.

  • No “comfortable” response was above 37%, and after adding in the “neutral” votes, there was still only one application that broke 50%: note-taking during appointments.
  • The fact that only 19% felt comfortable with GenAI answering calls for providers or payors could also just be a sign that patients would far rather talk to a human in either situation, regardless of the tech’s capabilities.

The survey also captured GenAI perceptions among healthcare workers.

Physicians and administrators are feeling a similar mix of excitement and apprehension, sharing a generally positive view of GenAI’s potential to alleviate admin burdens and clinician workloads, as well as a concern that it could undermine the patient-provider relationship.

  • Worries over new technology threatening the patient-provider relationship aren’t new, and we just watched them play out at an accelerated pace with telehealth.
  • Despite initial fears, the value of the relationship prevailed, which Bain backed up with the fact that 61% of patients who use telehealth only do so with their own provider.

Whether you’re measuring by patient or provider comfort, GenAI’s progress will be closely tied to trust in the technology on an application-by-application basis. Trust takes time to build and first impressions are key, so this survey underscores the importance of nailing the user experience early on.

The Takeaway

The story of generative AI in healthcare is just getting started, and as we saw with telehealth, the first few pages could take some serious willpower to get through. New technologies mean new workflows, revenue models, and countless other barriers to overcome, but trust will only keep building every step of the way. Plus, the next chapter looks pretty dang good.

Storytime at Epic UGM 2024

Epic’s “Storytime” User Group Meeting is officially a wrap, and the number of updates shared at the event would be hard-pressed to keep with the theme and fit in a children’s book.

CEO Judy Faulkner took the podium dressed as Mother Goose to tell the tale of Epic’s recent advances, AI roadmap, and even a “25-to-50-year” company plan.

It wouldn’t be a 2024 UGM without AI hogging the spotlight, and the EHR behemoth certainly delivered on that front. Highlights included:

  • Epic currently has two killer use cases for AI: medical scribes (186 user orgs) and draft responses to portal messages (150 user orgs), although it wasn’t clear how many of those orgs have done system-wide deployments.
  • Epic is actively working on over 100 new GenAI solutions, ranging from auto-populating forms and discharge papers to delivering evidence-based insights at the point of care.
  • Epic Cosmos’ Look-Alikes AI tool is now live at 65 sites, helping identify rare diseases by cross-referencing symptoms in its database of over 226M patient records and connecting physicians with kindred cases.

The teasers stole the show, and physicians (or payors!) have plenty to look forward to if Epic can deliver.

  • An upcoming Best Care Choices for My Patient tool will provide treatment recommendations at the point of care based on what worked / didn’t work for similar patients. NYU Langone and Parkview Health are already test-driving the solution.
  • A new Payor Platform is now available to all health system customers, with AI features to streamline prior auths, manage claims denials, and connect provider directories. Epic is also exploring how to cut out clearinghouse middlemen by sending PA documentation directly to payors.
  • By the end of next year, MyChart’s GenAI will be able to pull in test results, medications, and other patient details to better customize draft messages and help automatically queue up orders for labs and prescriptions.
  • Details on the Teamwork staff scheduling application are sparse, but it’s on the way “soon.”

The Takeaway

Given how much time clinicians spend in the EHR and the treasure trove of data it holds, it isn’t a surprise that Epic has become an integral component of its health systems’ AI strategy. That said, user group meetings are meant to excite user groups, and we’ll know soon enough how many of these announcements were just Storytime.

Hidden Flaws Behind High Accuracy of Clinical AI

AI is getting pretty darn good at patient diagnosis challenges… but don’t bother asking it to show its work.

A new study in npj Digital Medicine pitted GPT-4V against human physicians on 207 image challenges designed to test the reader’s ability to diagnose a patient based on a series of pictures and some basic clinical background info.

  • Researchers at the NIH and Weill Cornell Medicine then asked GPT-4V to provide step-by-step reasoning for how it chose the answer.
  • Nine physicians then tackled the same questions in both a closed-book (no outside help) and open-book format (could use outside materials and online resources).

How’d they stack up?

  • GPT-4V and the physicians both scored high marks for diagnostic accuracy (81.6% vs. 77.8%), a difference that wasn’t statistically significant.
  • GPT-4V bested the physicians on the closed-book test, selecting more correct diagnoses.
  • Physicians bounced back to beat GPT-4V on the open-book test, particularly on the most difficult questions.
  • GPT-4V also performed well in cases where physicians answered incorrectly, maintaining over 78% accuracy.

Good job AI, but there’s a catch. The rationales that GPT-4V provided were riddled with mistakes – even if the final answer was correct – with error rates as high as 27% for image comprehension.

The Takeaway

There could easily come a day when clinical AI surpasses human physicians on the diagnosis front, but that day isn’t here quite yet. Real care delivery also doesn’t bless physicians with a set of multiple choice options, and hallucinating the rationale behind diagnoses doesn’t cut it with actual patients.

Mayo Clinic Tops Hospital AI Readiness Index

The ambient temperature is rising, and CB Insights just launched its Hospital AI Readiness Index to determine which health systems are most prepared for the shift.

The index is based on an analysis of top private-sector systems in the U.S. (by hospital count), ranked by how prepared they are to adapt to a rapidly evolving AI landscape across two key pillars: 

  • Innovation – measures a system’s track record of developing or acquiring novel AI capabilities, along with the presence of a dedicated AI research center
  • Execution – measures a system’s ability to bring AI into clinical practice, along with internal AI deployments across business and back-office functions

Without further ado, here’s how CB Insights’ first list of AI-ready systems shook out.

Mayo Clinic topped the innovation charts by leading all systems in terms of raw AI investment count (including participation in big rounds from Abridge and Cerebras Systems), while also filing 50+ AI patents in areas like cardiovascular health and oncology.

  • Intermountain ranked second due in part to the AI focus of its venture arm, which invested in Gyant prior to the engagement platform getting scooped up by Fabric.
  • Cleveland Clinic rounded out the top three with a high volume of AI partnerships, including work with PathAI to enhance translational research using pathology algorithms.

High execution scores were driven by AI business relationships and product launches, such as Mayo Clinic’s partnership with Techcyte to help providers use AI to improve lab testing.

  • Another standout on this front was Banner Health, which is working with Regard to cut down on administrative burdens by automating tasks like notetaking and chart reviews.
  • Johns Hopkins also received high marks after partnering with Healthy.io to offer digital wound care services to patients.

The Takeaway

It’s tough not to love a good stack-ranking of health systems, and this is the best we’ve come across for AI readiness (and potential AI partners). Hats off to the 25 systems that made CB Insights’ inaugural list!

Augmedix Takes Hit As Ambient AI Heats Up

Augmedix just reported Q1 results that managed to cut its share price in half, an interesting turn of events given the company’s role as the bellwether for the white-hot ambient AI space.

There’s plenty to unpack when the only publicly-traded medical scribe company takes a hit like that despite beating expectations for both EPS and revenue, which jumped 40% to $13.5M.

The simple explanation? Competition. Augmedix saw “a slow-down in purchasing commitments” as providers evaluate competing offerings, prompting it to cut its full-year revenue forecast to between $52M and $55M (down from $60M to $62M).

  • During the investor call, Augmedix said that 42 companies currently offer GenAI medical documentation solutions, leading to a ton of noise and just as many pilots.
  • Although the increased demand from health systems is promising for the overall sector, it doesn’t exactly translate to success for established players when nimble startups like Nabla, Abridge, and Suki start swarming in on the action.

Augmedix is shaping its strategy around a product portfolio that lets providers choose the right tool for their needs, expanding beyond Augmedix Live (human scribes, high cost) with Augmedix Go (GenAI scribe, low cost) and Augmedix Go Assist (GenAI + human review, medium cost).

  • The push into GenAI has apparently been a double-edged sword. Augmedix reported that strong uptake for its new AI products might result in slower revenue growth as customers transition away from its high-margin Live solution.
  • New products tailored to specific settings will be another focus, as seen with the recent debut of Augmedix Go ED following a pilot-turned-implementation at HCA Healthcare. As scribing tech becomes commoditized, expect to see more players differentiate on setting / specialty.

The Takeaway

If there’s one lesson to learn from Augmedix’s first quarter, it’s that business is booming in the ambient AI space, but that doesn’t benefit incumbent leaders when it also attracts hungry competitors looking to feast on the same momentum.

K Health Introduces First-of-its-Kind AI Knowledge Agent

Clinical AI is stepping up to the big leagues, and K Health is the team that’s taking it there.

In an exclusive interview with Digital Health Wire, K Health CEO Allon Bloch took the lid off his company’s new AI Knowledge Agent, a first-of-its-kind GenAI system purpose-built for the clinical setting.

On the surface the AI Knowledge Agent looks and feels like a familiar medical chatbot, with a simple search bar interface for the user to ask natural language questions about their health. It isn’t until you see the responses that you realize you’re looking at something entirely unique.

The AI Knowledge Agent is about as far away from a rules-based chatbot as you can get. The agent is composed of an array of large language models enhanced by K Health’s own algorithms, carrying several major differentiators from today’s standard AI applications:

  • It incorporates the patient’s medical history grounded in their EHR to provide highly tailored responses, enabling a level of personalization that standalone models can’t match (e.g. a diabetic and a heart failure patient will see different answers to the same question, drawing on their own history, potential adverse drug interactions, etc.).
  • It will be embedded into health systems to serve as a digital front door that intelligently routes patients to the right place to resolve their needs, reaching everything from primary care and specialists to labs and tests within the same interface.
  • It’s optimized for accuracy by using curated high-quality health sources, then leverages multiple specialized agents to verify that the answer matches the sources and that the EHR data is applied appropriately. It will even tell you that it doesn’t know the answer rather than hallucinate (a rough sketch of this kind of verification loop follows below).
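
For intuition, here’s a minimal sketch of what a “generate, then verify” loop like the one described above could look like. It’s purely illustrative rather than K Health’s actual implementation; every function here is a hypothetical stub standing in for an LLM or retrieval call.

```python
# Illustrative multi-agent "generate, then verify" loop.
# Not K Health's implementation -- all functions below are hypothetical stubs.
from dataclasses import dataclass

@dataclass
class PatientContext:
    conditions: list[str]   # e.g. pulled from the EHR problem list
    medications: list[str]  # used to flag potential drug interactions

def retrieve_sources(question: str) -> list[str]:
    """Stub: return passages from a curated, high-quality medical corpus."""
    ...

def draft_answer(question: str, sources: list[str], ctx: PatientContext) -> str:
    """Stub: LLM call that drafts an answer grounded in the sources and patient context."""
    ...

def verify_against_sources(answer: str, sources: list[str]) -> bool:
    """Stub: a second agent checks each claim in the answer against the sources."""
    ...

def verify_against_ehr(answer: str, ctx: PatientContext) -> bool:
    """Stub: a third agent checks the answer is consistent with the patient's history."""
    ...

def answer_question(question: str, ctx: PatientContext) -> str:
    sources = retrieve_sources(question)
    draft = draft_answer(question, sources, ctx)
    # Only return the draft if both verification agents sign off;
    # otherwise admit uncertainty instead of risking a hallucination.
    if verify_against_sources(draft, sources) and verify_against_ehr(draft, ctx):
        return draft
    return "I don't have enough verified information to answer that safely."
```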

In head-to-head testing against top-tier foundation models, K Health’s multi-agent approach produced answers to sample medical questions that were 9% more comprehensive (measured by inclusion of clinically crucial statements from the “gold standard” answer) and had 36% fewer hallucinations than its closest benchmark, GPT-4.

  • Strong results, especially considering that the AI Knowledge Agent shines brightest in real-world situations where it can personalize its answers using EHR context.

For possibly the first time ever, GenAI has reached the point where it can support actual clinical journeys, delivering answers personalized to the patient’s medical history while connecting them directly to required care. The era of Googling symptoms then calling your doctor feels like it’s finally coming to an end.

The Takeaway

We’re very much in the opening act of clinical AI, and understandably cautious providers are only just beginning to test the waters. That said, it’s easy to imagine that we’ll one day look back at launches like K Health’s AI Knowledge Agent as key moments for building trust and confidence in the AI systems that reshaped care delivery.

GenAI Still Working Toward Prime Time With Patients

When it rains it pours for AI research, and a trio of studies published just last week suggest that many new generative AI tools might not be ready for prime time with patients.

The research that grabbed the most headlines came out of UCSD, finding that GenAI-drafted replies to patient messages led to more compassionate responses, but didn’t cut down on overall messaging time.

  • Although GenAI reduced the time physicians spent writing replies by 6%, that was more than offset by a 22% increase in read time, while also increasing average reply lengths by 18%.
  • Some of the physicians were also put off by the “overly nice” tone of the GenAI message drafts, and recommended that future research look into “how much empathy is too much empathy” from the patient perspective.

Another study in Lancet Digital Health showed that GPT-4 can effectively generate replies to health questions from cancer patients… as well as replies that might kill them.

  • Mass General Brigham researchers had six radiation oncologists review GPT-4’s responses to simulated questions from cancer patients for 100 scenarios, finding that 58% of its replies were acceptable to send to patients without any editing, 7% could lead to severe harm, and one was potentially lethal.
  • The verdict? Generative AI has the potential to reduce workloads, but it’s still essential to “keep doctors in the loop.”

A team at Mount Sinai took a different path to a similar conclusion, finding that four popular GenAI models have a long way to go until they’re better than humans at matching medical issues to the correct diagnostic codes.

  • After having GPT-3.5, GPT-4, Gemini Pro, and Llama2-70b analyze and code 27,000 unique diagnoses, GPT-4 came out on top in terms of exact matches, achieving an uninspiring accuracy of 49.8%.

The Takeaway

While it isn’t exactly earth-shattering news that GenAI still has room to improve, the underlying theme with each of these studies is more that its impact is far from black and white. GenAI is rarely completely right or completely wrong, and although there’s no doubt we’ll get to the point where it’s working its magic without as many tradeoffs, this research confirms that we’re definitely not there yet.
