A beautiful paper in Health Affairs brought us the first snapshot of AI oversight at U.S. hospitals, as well as a glimpse of the blind spots that are already adding up.
Data from 2,425 hospitals that participated in the 2023 AHA Annual Survey shed light on the differences in AI adoption and evaluation capacity at hospitals on both sides of a growing divide.
Two-thirds of hospitals reported using AI predictive models, a figure that’s likely only gone up over the last year. These models were most commonly used to:
- predict inpatient health trajectories (92%)
- identify high-risk outpatients (79%)
- facilitate scheduling (51%)
- perform a long tail of administrative tasks
Bias blindness ran rampant. Although 61% of hospitals using AI evaluated accuracy with data from their own system (local evaluation), only 44% performed similar evaluations for bias (see the sketch after this list for what such a check might involve).
- Those percentages are concerningly low, considering that models trained on external datasets might not perform well in different settings, and that AI bias is a surefire way to exacerbate health inequities.
- Hospitals that developed their own models, had high operating margins, and belonged to a health system were all more likely to conduct local evaluations.
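For a concrete sense of what a local bias evaluation involves, here's a minimal sketch in Python. The data, column names, and the use of AUC as the performance metric are all illustrative assumptions rather than the study's methodology: the idea is simply to measure a model's performance on a hospital's own patients, then compare it across demographic groups.

```python
# Minimal sketch of a local bias evaluation (illustrative only).
# Assumes a hospital has the model's risk scores, observed outcomes,
# and a demographic attribute for its own patients; all values and
# column names here are hypothetical.
import pandas as pd
from sklearn.metrics import roc_auc_score

# Hypothetical local dataset: one row per patient.
df = pd.DataFrame({
    "risk_score": [0.82, 0.35, 0.67, 0.12, 0.91, 0.44, 0.73, 0.28],
    "outcome":    [1,    0,    1,    0,    1,    0,    0,    0],
    "group":      ["A",  "A",  "A",  "A",  "B",  "B",  "B",  "B"],
})

# Local accuracy check: does the model discriminate well
# on this hospital's own patients?
print(f"Overall AUC: {roc_auc_score(df['outcome'], df['risk_score']):.2f}")

# Local bias check: compare performance across demographic groups.
# Large gaps suggest the model serves some patient populations worse.
for group, sub in df.groupby("group"):
    auc = roc_auc_score(sub["outcome"], sub["risk_score"])
    print(f"AUC for group {group}: {auc:.2f}")
```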
There’s a digital divide between hospitals with the resources to build models tailored to their own patients and those getting these solutions “off the shelf,” which increases the risk that the models were trained on data from patients who look very different from their own.
- Only 54% of hospitals using AI designed their own models, while a larger share (79%) took the path of least resistance with algorithms supplied by their EHR developer.
- Combine that with the fact that most hospitals aren’t conducting local evaluations of bias, and there’s little systematic protection to keep these models from underserving certain patients or adding unfair barriers to care.
The authors conclude that policymakers should “ensure the use of accurate and unbiased AI for patients regardless of where they receive care… including interventions designed to connect underresourced hospitals to evaluative capacity.”
The Takeaway
Without local evaluation of AI models, there’s a glaring blind spot in the oversight of algorithmic bias, and this study offers compelling evidence that more needs to be done to close that gap.