Author:
Riccardo Bellazzi
Affiliation:
Department of Electrical, Computer and Biomedical Engineering, University of Pavia, IRCCS ICS Maugeri, Pavia, Italy
Abstract:
The increasing success of machine and deep learning applications in many areas of medicine, in particular in imaging diagnostics [1], is pushing towards the implementation of AI-based approaches to extract knowledge from electronic health record (EHR) data [2]. The potential of sophisticated strategies for deriving regularities from very large collections of textual data, such as language models, is also generating strong expectations about the capability of extracting information from unstructured textual notes, as well as of generating biomedical texts [3,4]. The COVID-19 pandemic, one of the most significant healthcare challenges to have occurred synchronously worldwide, represented a strong push towards the timely use of EHR data to characterize the clinical course of the COVID-19 disease. Successful examples are represented by cooperative international efforts, such as the Consortium for Clinical Characterization of COVID-19 by EHR (4CE) initiative [5]. However, EHR data are particularly complex, due to their multifaceted nature and their inherent relationship with the healthcare organizations generating them.
In a recent paper summarizing the experience gained in leading 4CE, Kohane and colleagues identified six main challenges that have proven crucial for running EHR-based projects [6]: i) data completeness, ii) data collection and handling, iii) data type, iv) robustness of methods against EHR variability (within and across institutions, countries, and time), v) transparency of data and analytic code, and vi) the need for a multidisciplinary approach. In the context of structured EHR data, these topics have recently been further systematized in a consensus paper by the European Society of Cardiology and the BigData@Heart consortium, which defined the CODE-EHR best-practice framework for the use of structured electronic health-care records in clinical research [7].
When applying ML to EHR data, the above-mentioned aspects become even more important, since data-driven approaches may easily suffer from biases, incompleteness, and lack of contextual information. These problems may lead to models that, even if evaluated with rigorous statistical testing, are hardly applicable in practice. As a matter of fact, the "local" nature of EHR data collection may produce models that cannot be easily exported to clinical settings other than the one that generated the training data. For this reason, it is important to equip ML models with additional strategies for self-assessment during clinical use. Recently, reliability has been proposed as an instrument to verify the quality of point predictions, based on two principles: the density principle and the local fit principle [8]. The density principle verifies that the case to be evaluated by the model is similar to examples in the training set. The local fit principle verifies that the trained model performs well on training subsets that are similar to the instance under evaluation. Reliability and explainability can be seen as safeguards and instruments towards a more trustworthy use of AI and machine learning. In this talk, all these aspects will be discussed through examples, and a few suggestions will be given for future research in this area.
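To make the two principles concrete, the following Python sketch illustrates one plausible realization using scikit-learn. It is a minimal illustration, not the method of reference [8]: the k-nearest-neighbor distance ratio used for the density check, the choice of k, and the function names are all assumptions made here for clarity.

import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

def reliability_check(model, X_train, y_train, x_new, k=20):
    """Return (density_score, local_fit_score) for a single new case.

    Illustrative sketch of the density and local fit principles;
    all scoring choices are assumptions, not the definition in [8].
    """
    nn = NearestNeighbors(n_neighbors=k).fit(X_train)
    dist, idx = nn.kneighbors(x_new.reshape(1, -1))

    # Density principle: compare the new case's mean distance to its k
    # nearest training examples with the distances typically observed
    # *within* the training set. A score of about 1 or above suggests the
    # case lies in a region as densely populated as the training data.
    train_dist, _ = nn.kneighbors(X_train)
    typical = train_dist[:, 1:].mean()  # drop each point's zero self-distance
    density_score = typical / max(dist.mean(), 1e-12)

    # Local fit principle: performance of the trained model on the k
    # training examples most similar to the new case (a more careful
    # version would use held-out or cross-validated predictions).
    neigh_X, neigh_y = X_train[idx[0]], y_train[idx[0]]
    local_fit_score = model.score(neigh_X, neigh_y)

    return density_score, local_fit_score

# Toy usage on synthetic data
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

density, local_fit = reliability_check(clf, X_tr, y_tr, X_te[0])
print(f"density score: {density:.2f}, local fit (neighbor accuracy): {local_fit:.2f}")

Low density or poor local fit flags a prediction whose quality the model cannot vouch for at the point of care, which is exactly the kind of self-assessment that "locally" trained EHR models need when deployed elsewhere.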