Guidotti, Riccardo, Anna Monreale, Salvatore Ruggieri,
Franco Turini, Fosca Giannotti, and Dino Pedreschi. "A
survey of methods for explaining black box
models." ACM computing surveys (CSUR) 51, no. 5
(2018): 1-42.
Gundersen, O. E. (2020). The Reproducibility Crisis Is
Real. AI Magazine, 41(3), 103-106.
Johnson, A. E., Pollard, T. J., & Mark, R. G. (2017,
November). Reproducibility in critical care: a mortality
prediction case study. In Machine Learning for
Healthcare Conference (pp. 361-376).
Kim, A. A., Zaim, S. R., & Subbian, V. (2020). Assessing
Reproducibility and Veracity across Machine Learning
Techniques in Biomedicine: A Case Study using TCGA
Data. International Journal of Medical Informatics,
104148.
Liaw, A., & Wiener, M. (2002). Classification and
regression by randomForest. R News, 2(3), 18-22.
Lipton, Z. C. (2018). The mythos of model interpretability.
Queue, 16(3), 31-57.
Liu, Y., Chen, P. H. C., Krause, J., & Peng, L. (2019). How
to read articles that use machine learning: users’ guides
to the medical literature. JAMA, 322(18), 1806-1816.
Liu, X., Rivera, S. C., Moher, D., Calvert, M. J., &
Denniston, A. K. (2020). Reporting guidelines for
clinical trial reports for interventions involving
artificial intelligence: the CONSORT-AI extension.
BMJ, 370.
Lundberg, S. M., & Lee, S. I. (2017). A unified approach to
interpreting model predictions. In Advances in neural
information processing systems (pp. 4765-4774).
Luo, Yen-Fu, and Anna Rumshisky. "Interpretable topic
features for post-ICU mortality prediction." In AMIA
Annual Symposium Proceedings, vol. 2016, p. 827.
American Medical Informatics Association, 2016.
Luo W, Phung D, Tran T, Gupta S, Rana S, Karmakar C,
Shilton A, Yearwood J, Dimitrova N, Ho TB,
Venkatesh S. Guidelines for developing and reporting
machine learning predictive models in biomedical
research: a multidisciplinary view. Journal of medical
Internet research. 2016;18(12):e323.
Michalski, R. S., "A Theory and Methodology of Inductive
Learning," Chapter in the book, Machine Learning: An
Artificial Intelligence Approach, R. S. Michalski, T. J.
Carbonell and T. M. Mitchell (Eds.), pp. 83-134,
TIOGA Publishing Co., Palo Alto, 1983.
Michalski, R. S., "ATTRIBUTIONAL CALCULUS: A
Logic and Representation Language for Natural
Induction," Reports of the Machine Learning and
Inference Laboratory, MLI 04-2, George Mason
University, Fairfax, VA, April, 2004.
Michalski, R. S. and Wojtusiak, J., "Semantic and Syntactic
Attribute Types in AQ Learning," Reports of the
Machine Learning and Inference Laboratory, MLI 07-1,
George Mason University, Fairfax, VA, 2007.
Moher, D., Hopewell, S., Schulz, K. F., Montori, V.,
Gøtzsche, P. C., Devereaux, P. J., Elbourne, D.,
Egger, M., & Altman, D. G. (2010). CONSORT 2010
explanation and elaboration: updated guidelines for
reporting parallel group randomised trials. BMJ, 340, c869.
Morgan, S.L. and Winship C., Counterfactuals and Causal
Inference: Methods and Principles for Social Research,
2nd Edition, Cambridge University Press, 2015.
Pearl, J. Causality, Cambridge University Press, 2000.
Pearl, J. (2019). The seven tools of causal inference, with
reflections on machine learning. Communications of the
ACM, 62(3), 54–60. https://doi.org/10.1145/3241036
Pineau, J., Vincent-Lamarre, P., Sinha, K., Larivière, V.,
Beygelzimer, A., d'Alché-Buc, F., ... & Larochelle, H.
(2020). Improving Reproducibility in Machine
Learning Research (A Report from the NeurIPS 2019
Reproducibility Program). arXiv preprint
arXiv:2003.12206.
Renard, F., Guedria, S., De Palma, N., & Vuillerme, N.
(2020). Variability and reproducibility in deep learning
for medical image segmentation. Scientific Reports,
10(1), 1-16.
Ribeiro, Marco Tulio, Singh, Sameer, and Guestrin,
Carlos. "Why should I trust you?": Explaining the
predictions of any classifier. In Knowledge Discovery
and Data Mining (KDD), 2016.
Scikit-learn website, Probability Calibration:
https://scikit-learn.org/stable/modules/calibration.html
Stevens, L. M., Mortazavi, B. J., Deo, R. C., Curtis, L., &
Kao, D. P. (2020). Recommendations for reporting
machine learning analyses in clinical research.
Circulation: Cardiovascular Quality and Outcomes,
CIRCOUTCOMES-120.
Tonekaboni, S., Joshi, S., McCradden, M. D., &
Goldenberg, A. (2019). What clinicians want:
contextualizing explainable machine learning for
clinical end use. arXiv preprint arXiv:1905.05134.
Vollmer, S., Mateen, B. A., Bohner, G., Király, F. J., Ghani,
R., Jonsson, P., ... & Granger, D. (2020). Machine
learning and artificial intelligence research for patient
benefit: 20 critical questions on transparency,
replicability, ethics, and effectiveness. BMJ, 368.
Wicks, P., Liu, X., & Denniston, A. K. (2020). Going on up
to the SPIRIT in AI: will new reporting guidelines for
clinical trials of AI interventions improve their rigour?
BMC medicine, 18(1), 1-3.
Wojtusiak, J., Michalski, R. S., Kaufman, K. and
Pietrzykowski, J., "Multitype Pattern Discovery Via
AQ21: A Brief Description of the Method and Its Novel
Features," Reports of the Machine Learning and
Inference Laboratory, MLI 06-2, George Mason
University, Fairfax, VA, June, 2006.
Wojtusiak, J., Elashkar, E. and Mogharab Nia, R.,
"C-LACE2: computational risk assessment tool for 30-day
post hospital discharge mortality," Health and
Technology, Springer, 2018.
Yu, K. H., Lee, T. L. M., Yen, M. H., Kou, S. C., Rosen,
B., Chiang, J. H., & Kohane, I. S. (2020). Reproducible
Machine Learning Methods for Lung Cancer Detection
Using Computed Tomography Images: Algorithm
Development and Validation. Journal of medical
Internet research, 22(8), e16709.