International Conference
on Machine Learning, ICML’96, pages 105–112, San
Francisco, USA. Morgan Kaufmann Publishers Inc.
Dua, D. and Graff, C. (2019). UCI machine learning repos-
itory. http://archive.ics.uci.edu/ml.
Freedman, D. and Diaconis, P. (1981). On the histogram
as a density estimator: L2 theory. Zeitschrift für
Wahrscheinlichkeitstheorie und Verwandte Gebiete,
57(4):453–476.
Hastie, T., Tibshirani, R., and Friedman, J. (2001). The
elements of statistical learning. Springer-Verlag, New
York, USA.
Jain, A. K., Murty, M. N., and Flynn, P. J. (1999). Data
clustering: a review. ACM computing surveys (CSUR),
31(3):264–323.
Li, Y. H., Xu, J. Y., Tao, L., Li, X. F., Li, S., Zeng, X.,
Chen, S. Y., Zhang, P., Qin, C., Zhang, C., Chen, Z.,
Zhu, F., and Chen, Y. Z. (2016). SVM-Prot 2016: A
web-server for machine learning prediction of protein
functional families from sequence irrespective of sim-
ilarity. PLOS ONE, 11(8):1–14.
Lin, H.-T., Lin, C.-J., and Weng, R. C. (2007). A note
on Platt’s probabilistic outputs for support vector machines.
Machine Learning, 68(3):267–276.
Lucena, B. (2018). Spline-based probability calibration.
arXiv preprint arXiv:1809.07751.
Maiorino, E., Rizzi, A., Sadeghian, A., and Giuliani, A.
(2017). Spectral reconstruction of protein contact networks.
Physica A: Statistical Mechanics and its Applications,
471:804–817.
Martino, A., Giuliani, A., and Rizzi, A. (2018a). Gran-
ular computing techniques for bioinformatics pat-
tern recognition problems in non-metric spaces. In
Pedrycz, W. and Chen, S.-M., editors, Computational
Intelligence for Pattern Recognition, pages 53–81.
Springer International Publishing, Cham.
Martino, A., Maiorino, E., Giuliani, A., Giampieri, M., and
Rizzi, A. (2017a). Supervised approaches for function
prediction of proteins contact networks from topolog-
ical structure information. In Sharma, P. and Bianchi,
F. M., editors, Image Analysis, pages 285–296, Cham.
Springer International Publishing.
Martino, A., Rizzi, A., and Frattale Mascioli, F. M. (2017b).
Efficient approaches for solving the large-scale k-medoids
problem. In Proceedings of the 9th International Joint
Conference on Computational Intelligence - Volume 1:
IJCCI, pages 338–347. INSTICC,
SciTePress.
Martino, A., Rizzi, A., and Frattale Mascioli, F. M. (2018b).
Distance matrix pre-caching and distributed computa-
tion of internal validation indices in k-medoids clus-
tering. In 2018 International Joint Conference on
Neural Networks (IJCNN), pages 1–8.
Martino, A., Rizzi, A., and Frattale Mascioli, F. M. (2018c).
Supervised approaches for protein function prediction
by topological data analysis. In 2018 International
Joint Conference on Neural Networks (IJCNN), pages
1–8.
Martino, A., Rizzi, A., and Frattale Mascioli, F. M.
(2019). Efficient approaches for solving the large-
scale k-medoids problem: Towards structured data.
In Sabourin, C., Merelo, J. J., Madani, K., and War-
wick, K., editors, Computational Intelligence: 9th In-
ternational Joint Conference, IJCCI 2017 Funchal-
Madeira, Portugal, November 1-3, 2017 Revised Se-
lected Papers, pages 199–219. Springer International
Publishing, Cham.
Minneci, F., Piovesan, D., Cozzetto, D., and Jones, D. T.
(2013). FFPred 2.0: Improved homology-independent
prediction of gene ontology terms for eukaryotic pro-
tein sequences. PLOS ONE, 8(5):1–10.
Murphy, A. H. and Winkler, R. L. (1977). Reliability of sub-
jective probability forecasts of precipitation and tem-
perature. Journal of the Royal Statistical Society. Se-
ries C (Applied Statistics), 26(1):41–47.
Naeini, M. P., Cooper, G. F., and Hauskrecht, M. (2015).
Obtaining well calibrated probabilities using Bayesian
binning. In Proceedings of the Twenty-Ninth AAAI
Conference on Artificial Intelligence, AAAI’15, pages
2901–2907. AAAI Press.
Niculescu-Mizil, A. and Caruana, R. (2005). Predicting
good probabilities with supervised learning. In Pro-
ceedings of the 22nd international conference on Ma-
chine learning, pages 625–632. ACM.
Parzen, E. (1962). On estimation of a probability den-
sity function and mode. The Annals of Mathematical
Statistics, 33(3):1065–1076.
Platt, J. (2000). Probabilities for SV machines. In Smola,
A. J., Bartlett, P., Schölkopf, B., and Schuurmans, D.,
editors, Advances in large margin classifiers, pages
61–74. MIT Press, Cambridge, MA, USA.
Schölkopf, B. and Smola, A. J. (2002). Learning with kernels:
support vector machines, regularization, optimization,
and beyond. MIT Press.
Scott, D. W. (1979). On optimal and data-based histograms.
Biometrika, 66(3):605–610.
Sturges, H. A. (1926). The choice of a class inter-
val. Journal of the American Statistical Association,
21(153):65–66.
The UniProt Consortium (2017). UniProt: the universal
protein knowledgebase. Nucleic Acids Research,
45(D1):D158–D169.
Wahba, G. (1990). Spline models for observational data,
volume 59. SIAM.
Webb, E. C. (1992). Enzyme nomenclature 1992.
Recommendations of the Nomenclature Committee of the
International Union of Biochemistry and Molecular Biology
on the Nomenclature and Classification of Enzymes.
Academic Press, 6th edition.
Zadrozny, B. and Elkan, C. (2001). Obtaining calibrated
probability estimates from decision trees and naive
Bayesian classifiers. In Proceedings of the Eighteenth
International Conference on Machine Learning,
ICML ’01, pages 609–616, San Francisco, CA,
USA. Morgan Kaufmann Publishers Inc.
Zadrozny, B. and Elkan, C. (2002). Transforming classifier
scores into accurate multiclass probability estimates.
In Proceedings of the eighth ACM SIGKDD interna-
tional conference on Knowledge discovery and data
mining, pages 694–699. ACM.
Calibration Techniques for Binary Classification Problems: A Comparative Analysis