There are two limitations.
Firstly, we were not able to answer one of the
crucial questions in endosomal trafficking, “does
pre-EE exist?”, for several reasons. To find out
what happens in the first 2-3 minutes of each
experiment, we would need results from more
experiments (we used only 147) and would need to try
other ML algorithms. These experiments are very
expensive, so we might have to re-think the way they
are carried out. Whether to simulate data to replace
values which cannot be measured or obtained for the
first two minutes must be debated. The imputation used
in the second set of ML experiments did not improve
the precision. Also, Endocytosis and Internalizations
semantically overlap, and this overlap should be
addressed in future work, when defining the
additional semantics of the training data set and
revisiting the algorithm from Figure 1.
Secondly, we could have analyzed the results of
the second set of ML classifiers, which used imputed
mean values calculated per row. However, this would
mean that we are trying to achieve precision in
classification without knowing whether we are
improving the quality of the data set at the same time.
Would this help us to find out whether pre-EE exists?
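As a minimal sketch of the per-row mean imputation mentioned above, and not the implementation used in our experiments, the following Python fragment fills the missing values of each row with the mean of that row's observed values. The function name `impute_row_means`, the returned mask and the example measurements are hypothetical and serve only as illustration.

```python
import math


def _is_missing(value):
    """A value counts as missing if it is None or NaN."""
    return value is None or math.isnan(value)


def impute_row_means(rows):
    """Fill missing values in each row with the mean of that row's observed values.

    Returns (filled_rows, mask), where mask[i][j] is True if the value at
    row i, column j was imputed. Rows with no observed values are left as-is.
    """
    filled_rows, mask = [], []
    for row in rows:
        observed = [v for v in row if not _is_missing(v)]
        if not observed:
            # Nothing to average: keep the row unchanged, mark nothing as imputed.
            filled_rows.append(list(row))
            mask.append([False] * len(row))
            continue
        row_mean = sum(observed) / len(observed)
        filled_rows.append([row_mean if _is_missing(v) else v for v in row])
        mask.append([_is_missing(v) for v in row])
    return filled_rows, mask


# Hypothetical example: two rows where early time points could not be measured.
example_rows = [
    [None, None, 0.4, 0.6],
    [0.2, None, 0.4, 0.9],
]
filled, imputed_mask = impute_row_means(example_rows)
```

Keeping the mask alongside the filled values makes it possible to tell measured from imputed values afterwards, which matters when asking whether imputation has changed the semantics of the training data set.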
Immediate future work should address our first set
of limitations. The second set of limitations is the
subject of a more complicated debate: is predictive
inference desirable in biomedical science if we cannot
guarantee that the semantics of the training data set
will not be distorted? For this particular problem of
endocytic trafficking, unfortunately, the answer might
be NO. However, this should not discourage us from
searching for more options where both
predictive and logic inference cohabit (Basulto et al.,
2017). In the long term, this could lead towards
discovering new insights in biomedical data.
REFERENCES
Acuna, E., Rodriguez, C., 2004. The treatment of missing
values and its effect in the classifier accuracy. In Banks, D.,
McMorris, F. R., Arabie, P., Gaul, W. (eds),
Classification, Clustering, and Data Mining
Applications. Studies in Classification, Data Analysis,
and Knowledge Organisation. Springer, 2004.
Basulto, V. G., Jung, J. C., Schroeder, L., 2017. Probabilistic
Description Logics for Subjective Uncertainty, Journal
of Artificial Intelligence Research 58 (2017) 1-66.
Batista, G., Monard, M. C., 2003. An Analysis of Four
Missing Data Treatment Methods for Supervised
Learning, In Applied Artificial Intelligence, Vol 17,
2003, Issue 5-6 pp 519-533.
Blagojević Zagorac, G., Mahmutefendić, H., Maćešić,
S., Karleuš, L. J., Lučin, P., 2017. Quantitative Analysis
of Endocytic Recycling of Membrane Proteins by
Monoclonal Antibody-Based Recycling Assays, In
Journal of Cellular Physiology 232(2017), 3; 463-476.
Craddock, A. J., Browse, R. A., 1986. Reasoning with
Uncertain Knowledge, in UAI'86, Second Conference
on Uncertainty in Artificial Intelligence, pp 57-62.
Newgard, C. D., Lewis, R. J., 2015. Missing Data: How to
Best Account for What Is Not Known, Clinical
Review & Education, JAMA Guide to Statistics and
Methods.
Danilchanka, N., Juric, R., 2020. The Process of Creating a
Training Data Set: Lessons Learned from Mechanical
Engineering, in SDPS 2018 Workshop of Accountability
of AI Bologna, Italy.
Danilchanka, N., Juric, R., 2020. Reliability of Training
Data Sets for ML Classifiers: a Lesson Learned from
Mechanical Engineering, in Proceedings of the 53rd
HICSS conference, January 2020.
Dempster, A. P., Rubin, D. B., 1997. Incomplete Data in
Sample Surveys, Theory and Bibliography, Vol 2 (ed,
W.G. Madow, I. Olkin and D.B. Rubin), 3-10. New York
Academic Press.
Juric, R., 2018. How BIASED Could AI Be? In SDPS 2018
Workshop of Accountability of AI Bologna, Italy.
Juric, R., Ronchieri, E., Blagojević Zagorac, G.,
Mahmutefendić, H., Lučin, P., 2020. Addressing the
Semantic of Missing Data Values in Training Data Sets
using MVL: A Study of Tf/TfR Endocytic Routes,
under review for the ISMVL 2020 Conference, Japan,
May 2020.
Karleuša, L. J., Mahmutefendić, H., Ilić Tomaš, M.,
Blagojević Zagorac, G., Lučin, P., 2018. Landmarks of
endosomal remodelling in the early phase of
cytomegalovirus infection, in Virology 515 (2018)
108–122.
Mahmutefendić, H., Blagojević Zagorac, G., Grabušić, K.,
Karleuš, L. J., Maćešić, S., Momburg, F., Lučin, P.,
2017. Late endosomal recycling of open MHC-I
conformers, in Journal of Cellular Physiology, 2017
April, 232(4):872-887.
Mahmutefendić, H., Blagojević Zagorac, G., Maćešić, S.,
Lučin, P., 2018. Rapid Endosomal Recycling, Book
Chapter, in Open Access Peer Review Chapter,
IntechOpen,
https://www.intechopen.com/books/peripheral-membrane-proteins/rapid-endosomal-recycling
Ronchieri, E., Juric, R., Canaparo, M., 2019. Sentiment
Analysis for Software Code Assessment. In
proceedings of the 2019 IEEE NPSS Conference,
Manchester, UK.
Rubin, D. B., 1976. Inference and Missing Data,
Biometrika, Vol. 63, No. 3 (Dec., 1976), pp. 581-592.