Authors:
Radmila Juric 1; Elisabetta Ronchieri 2; Gordana Blagojevic Zagorac 3; Hana Mahmutefendic 3 and Pero Lucin 3
Affiliations:
1 University of South Eastern Norway, Kongsberg, Norway; 2 CNAF/INFN, Bologna, Italy; 3 University of Rijeka, Faculty of Medicine, Rijeka, Croatia
Keyword(s):
Training Data Set, Machine Learning, Endocytic Pathways.
Abstract:
Predictive technologies with increased uptake of machine learning (ML) algorithms have changed the landscape of computational models across problem domains and research disciplines. With the abundance of data available for computations, we started looking at the efficiency of predictive inference as the answer to many problems we wish to address using computational power. However, the real picture of the effectiveness and suitability of predictive technologies, and learning technologies in particular, is far from promising. This study addresses these concerns and illustrates them through biomedical experiments which evaluate Tf/TfR endosomal recycling as a part of the cellular processes by which cells internalise substances from their environment. The outcome of the study is interesting. The observed data play an important role in answering biomedical research questions, because it was feasible to perform ML classifications and feature selection using the semantics stored in the observed data set. However, the process of preparing the data set for ML classifications proved the opposite. Precise algorithmic predictions, which are the ultimate goal when using learning technologies, are not the only criterion which measures the success of predictive inference. It is the semantics of the observed data set, which should become a training data set for ML, that becomes the weak link in the process. Recognised data science practices do not guarantee that the important semantics of the observed data set and experiments are preserved. These semantics could be distorted and misinterpreted, and might then not contribute towards correct inference. The study can be seen as an illustration of hidden problems in using predictive technologies in biomedicine and is relevant to both computer and biomedical scientists.