
European Conference, ECML PKDD 2016, Riva del
Garda, Italy, September 19-23, 2016, Proceedings,
Part II, volume 9852 of Lecture Notes in Computer
Science, pages 179–194. Springer.
Leathart, T. M. (2019). Tree-structured multiclass probability estimators. PhD thesis, The University of Waikato, Hamilton, New Zealand.
Lindauer, M., Eggensperger, K., Feurer, M., Biedenkapp, A., Deng, D., Benjamins, C., Ruhkopf, T., Sass, R., and Hutter, F. (2022). SMAC3: A versatile Bayesian optimization package for hyperparameter optimization. Journal of Machine Learning Research, 23(54):1–9.
Malerba, D., Appice, A., Bellino, A., Ceci, M., and Pallotta, D. (2001). Stepwise induction of model trees. In Esposito, F., editor, AI*IA 2001: Advances in Artificial Intelligence, pages 20–32, Berlin, Heidelberg. Springer Berlin Heidelberg.
Maso, G., Businelli, C., Piccoli, M., Montico, M., De Seta, F., Sartore, A., and Alberico, S. (2012). The clinical interpretation and significance of electronic fetal heart rate patterns 2 h before delivery: an institutional observational study. Archives of Gynecology and Obstetrics, 286(5):1153–1159.
Mohr, F., Wever, M., and Hüllermeier, E. (2018). Reduction stumps for multi-class classification. In Duivesteijn, W., Siebes, A., and Ukkonen, A., editors, Advances in Intelligent Data Analysis XVII, pages 225–237, Cham. Springer International Publishing.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V.,
Thirion, B., Grisel, O., Blondel, M., Prettenhofer,
P., Weiss, R., Dubourg, V., Vanderplas, J., Passos,
A., Cournapeau, D., Brucher, M., Perrot, M., and
Duchesnay, E. (2011). Scikit-learn: Machine learning
in Python. Journal of Machine Learning Research,
12:2825–2830.
Quinlan, J. R. (1986). Induction of decision trees. Machine
Learning, 1(1):81–106.
Quinlan, J. R. et al. (1992). Learning with continuous classes. In 5th Australian Joint Conference on Artificial Intelligence, volume 92, pages 343–348. World Scientific.
Rey, D. and Neuhäuser, M. (2011). Wilcoxon-Signed-Rank Test, pages 1658–1659. Springer Berlin Heidelberg, Berlin, Heidelberg.
Singh, J. (2023). Computational complexity and analysis of supervised machine learning algorithms. In Kumar, R., Pattnaik, P. K., and Tavares, J. M. R. S., editors, Next Generation of Internet of Things, pages 195–206, Singapore. Springer Nature Singapore.
Torgo, L. (1997a). Functional models for regression tree
leaves. In ICML, volume 97, pages 385–393. Citeseer.
Torgo, L. (1997b). Kernel regression trees. In Poster papers
of the 9th European conference on machine learning
(ECML 97), pages 118–127. Prague, Czech Republic.
Wang, Y. and Witten, I. (1997). Induction of model trees for predicting continuous classes.
Wever, M., Mohr, F., and Hüllermeier, E. (2018). Ensembles of evolved nested dichotomies for classification. In Aguirre, H. E. and Takadama, K., editors, Proceedings of the Genetic and Evolutionary Computation Conference, GECCO 2018, Kyoto, Japan, July 15-19, 2018, pages 561–568. ACM.
Wever, M., Özdogan, M., and Hüllermeier, E. (2023). Cooperative co-evolution for ensembles of nested dichotomies for multi-class classification. In Proceedings of the Genetic and Evolutionary Computation Conference, pages 597–605.
Wolpert, D. H. (1992). Stacked generalization. Neural Networks, 5(2):241–259.
Yoon, J., Zame, W. R., Banerjee, A., Cadeiras, M., Alaa, A. M., and van der Schaar, M. (2018a). Personalized survival predictions via trees of predictors: An application to cardiac transplantation. PLOS ONE, 13(3):e0194985.
Yoon, J., Zame, W. R., and van der Schaar, M. (2018b).
ToPs: Ensemble learning with trees of predictors.
IEEE Transactions on Signal Processing, 66(8):2141–
2152.
APPENDIX
Hyperparameter Tuning
In this appendix, we outline the hyperparameter tuning procedure employed to optimise our predictors. Note that we perform hyperparameter tuning separately for each cross-validation split and each loss function evaluated.
We implement our hyperparameter tuning procedure using the state-of-the-art Bayesian optimisation framework SMAC3 (Lindauer et al., 2022). We employ its hyperparameter optimisation facade, which uses a random forest as the surrogate model.
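As an illustration, the following is a minimal sketch of such a tuning loop using the SMAC3 2.x API; the base learner (a scikit-learn DecisionTreeRegressor), the synthetic data, the search space, and the trial budget shown here are illustrative assumptions and do not correspond to the exact settings used in our experiments.

```python
from ConfigSpace import Configuration, ConfigurationSpace
from smac import HyperparameterOptimizationFacade, Scenario
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Illustrative data; in practice this is the training portion of the
# current cross-validation split.
X, y = make_regression(n_samples=500, n_features=10, noise=0.5, random_state=0)

# Illustrative search space for a decision-tree base learner (not Table 8).
configspace = ConfigurationSpace({
    "max_depth": (2, 20),          # integer range
    "min_samples_leaf": (1, 200),  # integer range
    "ccp_alpha": (1e-5, 1e-1),     # float range
})

def target(config: Configuration, seed: int = 0) -> float:
    # SMAC minimises the returned value, so we negate the CV score.
    model = DecisionTreeRegressor(random_state=seed, **dict(config))
    score = cross_val_score(
        model, X, y, cv=5, scoring="neg_mean_squared_error"
    ).mean()
    return -score

scenario = Scenario(configspace, deterministic=True, n_trials=50)
# The hyperparameter optimisation facade uses a random forest surrogate
# model by default.
smac = HyperparameterOptimizationFacade(scenario, target)
incumbent = smac.optimize()
print(dict(incumbent))
```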
However, due to their substantial runtime, conducting a comprehensive hyperparameter tuning procedure for both PTEs and ToPs is not feasible within our constraints. Instead, we focus on tuning the hyperparameters of their respective base learners on the specific dataset at hand. Although the base learner parameters that are optimal for standalone use may differ from those that are optimal within the ensemble, we assume they provide a reasonable approximation.
Furthermore, we choose min_leaf_samples = 100, slightly higher than in (Torgo, 1997a), to mitigate overfitting. As in (Yoon et al., 2018b), we set val1_size = 0.15 and val2_size = 0.1.
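To make these split proportions concrete, the sketch below (our own illustration, not code from the ToPs authors) carves two validation sets of the stated sizes out of a dataset with scikit-learn; the function name and argument names are assumptions.

```python
from sklearn.model_selection import train_test_split

def three_way_split(X, y, val1_size=0.15, val2_size=0.1, seed=0):
    """Split (X, y) into train / val1 / val2.

    val1_size and val2_size are fractions of the full dataset, mirroring
    the values reported above (0.15 and 0.1).
    """
    # First carve out val2, then take val1 from the remaining data.
    X_rest, X_val2, y_rest, y_val2 = train_test_split(
        X, y, test_size=val2_size, random_state=seed
    )
    # Rescale val1_size so that it is a fraction of the remaining data.
    rel_val1 = val1_size / (1.0 - val2_size)
    X_train, X_val1, y_train, y_val1 = train_test_split(
        X_rest, y_rest, test_size=rel_val1, random_state=seed
    )
    return X_train, y_train, X_val1, y_val1, X_val2, y_val2
```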
In Table 8, we present the search space we set for hyperparameter optimisation. For hyperparameters not mentioned in this table, we rely on the default values provided by the scikit-learn library.
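A brief sketch of this convention, assuming the tuned configuration is available as a plain Python dict (the values shown are hypothetical): only the keys present in the dict are overridden, and every other hyperparameter keeps its scikit-learn default.

```python
from sklearn.tree import DecisionTreeRegressor

# Hypothetical incumbent returned by the tuning procedure; only the
# hyperparameters listed in the search space appear here.
tuned = {"max_depth": 12, "min_samples_leaf": 100}

# Hyperparameters not passed in (criterion, splitter, ccp_alpha, ...)
# keep their scikit-learn defaults.
base_learner = DecisionTreeRegressor(random_state=0, **tuned)
print(base_learner.get_params()["criterion"])  # 'squared_error' (default)
```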