AI Misses the Mark on Detecting Critical Conditions

Most health systems have already begun turning to AI to predict whether patients’ conditions will deteriorate, but a new study in Nature Communications Medicine suggests that current models aren’t cut out for the task.

Virginia Tech researchers took several popular machine learning models cited in the medical literature for predicting patient deterioration, then fed them health datasets from ICU and cancer patients.

  • They then created synthesized test cases in which patient metrics were altered from the initial dataset, and asked the models to predict potential health issues and generate risk scores (see the sketch after the bullets below).

AI missed the mark. For in-hospital mortality prediction, the models tested using the synthesized cases failed to recognize a staggering 66% of relevant patient injuries.

  • In some instances, the models failed to generate adequate mortality risk scores for every single test case.
  • That’s not great news, especially considering that algorithms that can’t recognize critical patient conditions also can’t alert doctors when urgent action is needed.
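For a concrete sense of what such a perturbation test looks like, here’s a minimal Python sketch; the dataset, features, model, and thresholds are all invented for illustration and are not the study’s actual methodology.

```python
# Hypothetical perturbation check: alter one patient metric to a clearly
# critical value and verify that a purely data-driven model raises its
# mortality risk score accordingly. All data here is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy "ICU" training data: columns are [heart_rate, systolic_bp, spo2]
X = rng.normal(loc=[85, 120, 96], scale=[15, 20, 3], size=(1000, 3))
# Invented outcome rule standing in for real mortality labels
y = ((X[:, 2] < 92) | (X[:, 1] < 90)).astype(int)

model = LogisticRegression().fit(X, y)

baseline = np.array([[85.0, 120.0, 96.0]])   # stable patient
perturbed = baseline.copy()
perturbed[0, 2] = 70.0                       # SpO2 of 70% is unambiguously critical

risk_before = model.predict_proba(baseline)[0, 1]
risk_after = model.predict_proba(perturbed)[0, 1]

print(f"risk before: {risk_before:.2f}, risk after: {risk_after:.2f}")
# A clinically sane model must score the perturbed case as far riskier
assert risk_after > risk_before, "model missed an obviously critical change"
```

The core idea is the final check: when a single metric moves to an unambiguously dangerous value, any clinically sensible risk score should rise, and the study found that models trained purely on raw data can fail checks of exactly this flavor.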

The study authors point out that it’s extremely important for technology being used in patient care decisions to incorporate medical knowledge, and that “purely data-driven training alone is not sufficient.”

  • Not only did the study unearth “alarming deficiencies” in models being used for in-hospital mortality predictions, but it also turned up similar concerns with models predicting the prognosis of breast and lung cancer over five-year periods.
  • The authors conclude that a significant gap exists between raw data and the complexities of medical reality, so models trained solely on patient data are “grossly insufficient and have many dangerous blind spots.”

The Takeaway

The promise of AI remains as immense as ever, but studies like this are a constant reminder that adoption demands a diligent approach, not just for the technology itself but for the lives of the patients it touches. Ensuring that medical knowledge gets incorporated into clinical AI models also seems like a theme we’re about to start hearing more often.

Study: AI is in the Eye of the Beholder

At a time when new healthcare AI solutions are getting unveiled every week, a study in Nature Machine Intelligence found that the way people are introduced to these models can have a major effect on their perceived effectiveness.

Researchers from MIT and ASU had 310 participants interact with a conversational AI mental health companion for 30 minutes before reviewing their experience and determining whether they would recommend it to a friend.

Participants were divided into three groups, which were each given a different priming statement about the AI’s motives:

  • No motives: A neutral view of the AI as a tool
  • Caring motives: A positive view where the AI cares about the user’s well-being
  • Manipulative motives: A negative view where the AI has malicious intentions

The results revealed that priming statements clearly influence user perceptions, with the majority of participants in all three groups reporting experiences in line with their primed expectations.

  • 88% of the “caring” group and 79% of the “no motive” group believed the AI was empathetic or neutral, even though every group was engaging with identical agents.
  • Only 44% of the “manipulative” group agreed with the primer. As the authors put it, “If you tell someone to be suspicious of something, then they might just be more suspicious in general.”
  • As might be expected, participants who believed the model was caring also gave it higher effectiveness scores and were more likely to recommend it to a friend. That’s obviously relevant for those developing similar mental health chatbots, but it’s also a key insight for presenting any AI agent to new users.

An interesting feedback loop was also found between the priming and the conversation’s tone. People who believed the AI was caring tended to interact with it in a more positive way, making the agent’s responses drift positively over time. The opposite was true for those who believed it was manipulative. 
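To make that dynamic concrete, here’s a toy simulation of the loop in Python; the update rule and coefficients are invented for illustration and bear no relation to the study’s actual agent.

```python
# Toy simulation of the priming feedback loop: the user's sentiment shapes
# the agent's tone, which in turn feeds back into the user's sentiment.
def simulate_drift(initial_user_sentiment: float, turns: int = 10) -> list[float]:
    """Return the agent's tone per turn; sentiment values live in [-1, 1]."""
    user, agent = initial_user_sentiment, 0.0
    tones = []
    for _ in range(turns):
        agent = 0.7 * agent + 0.3 * user  # agent mirrors the user's tone
        user = 0.8 * user + 0.2 * agent   # user reacts to the agent's reply
        tones.append(agent)
    return tones

# Identical agents, opposite priming: tone drifts apart over the conversation.
print(f"primed caring:       {simulate_drift(+0.8)[-1]:+.2f}")
print(f"primed manipulative: {simulate_drift(-0.8)[-1]:+.2f}")
```

Under this invented rule, the agent’s tone drifts toward the user’s initial sentiment over the conversation, which is the broad pattern the researchers observed.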

The Takeaway

The placebo effect is a well-documented cornerstone of medical literature, but this might be the first study to bridge the phenomenon from sugar pill to AI chatbot. Although AI is often treated as primarily an engineering problem, this research does a great job of highlighting how human factors and the power of belief shape the technology’s perceived effectiveness.

Can Wearables Help Measure Patient Outcomes?

Researchers from the University of Edinburgh published a systematic review in Nature that aimed to determine the current evidence base and reporting quality for mobile digital health interventions (DHIs) in the postoperative period.

Methodology – After screening 6,969 articles for studies of surgical patients whose postoperative outcomes were measured using DHIs (defined as mobile technologies that improve health system efficiency and health outcomes), 44 studies were included in the final review.

Results – The review indicated that several types of mobile phone- or wearable-generated data can improve the assessment of postoperative recovery (a rough sketch of combining these appears after the list):

  • patient-reported outcome data (from validated self-report tools)
  • continuous activity data (from wearables)
  • combining remote assessment with active clinical prompts or patient advice
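As a rough illustration of how those streams might combine in practice, here’s a hypothetical Python sketch; the thresholds, field names, and prompt rule are invented, since the review doesn’t prescribe a specific algorithm.

```python
# Invented example of pairing wearable activity data with patient-reported
# outcomes (PROs) to trigger an "active clinical prompt" after surgery.
from dataclasses import dataclass

@dataclass
class DailyCheckIn:
    post_op_day: int
    step_count: int   # continuous activity data from a wearable
    pain_score: int   # 0-10 rating from a validated self-report tool

def needs_clinical_prompt(today: DailyCheckIn, baseline_steps: int) -> bool:
    """Flag days where activity drops sharply while reported pain runs high."""
    low_activity = today.step_count < 0.5 * baseline_steps
    high_pain = today.pain_score >= 7
    return low_activity and high_pain

checkin = DailyCheckIn(post_op_day=4, step_count=900, pain_score=8)
if needs_clinical_prompt(checkin, baseline_steps=4000):
    print("Prompt care team to review patient")  # the active clinical prompt
```

The point is simply that the three elements the review highlights (self-reported outcomes, wearable activity, and active prompts) are complementary rather than interchangeable.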

DHI Shortcomings – Studies included in the analysis demonstrated that DHIs may facilitate patient recovery following major operations and reduce inappropriate service use, although they also revealed issues with the current evidence base that should be addressed:

  • patients are rarely engaged in the development of DHIs
  • only one study was designed to engage patients in reviewing their own data
  • high levels of exclusion exist for patients without relevant mobile technology

Discussion

The increasing availability of high-quality mobile technologies provides a new bridge between clinical services and patients’ homes, and while the study authors are optimistic about the technology, they stress that reporting standards must improve if its potential is to be fulfilled.

Going forward, the researchers suggest that studies of DHIs in postoperative settings seek to provide meaningful comparisons to non-DHI care in order to demonstrate clinical value, with particular attention paid to reporting quality so that equitable comparisons can be made to existing research.
