The dataset is based on six classifiers with different theoretical foundations, 83 datasets from different domains, and 49 meta-features from six different groups. The R-script for computing the meta-features is also publicly available to make extensions of the meta-dataset easier.
A brief analysis of the gathered data showed that the accuracy of a given classifier varies widely across datasets, and that even very simple classifiers such as Naive Bayes are still the best choice for some datasets. Both aspects make the presented meta-dataset, and meta-learning in general, a challenging task.
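The two observations above can be sketched with a toy computation: given an accuracy table over datasets and classifiers, measure each classifier's accuracy spread and count how often each one is the best choice. This is a minimal illustrative sketch; the classifier names and accuracy values below are hypothetical, not taken from the paper's meta-dataset.

```python
import statistics

# Hypothetical accuracies: one list per classifier, one entry per dataset.
accuracies = {
    "NaiveBayes":   [0.92, 0.61, 0.88, 0.55],
    "DecisionTree": [0.85, 0.70, 0.80, 0.78],
    "SVM":          [0.90, 0.75, 0.70, 0.81],
}

# Spread: standard deviation of each classifier's accuracy across datasets.
spread = {clf: statistics.stdev(accs) for clf, accs in accuracies.items()}

# Wins: for each dataset, which classifier achieves the highest accuracy?
n_datasets = len(next(iter(accuracies.values())))
wins = {clf: 0 for clf in accuracies}
for i in range(n_datasets):
    best = max(accuracies, key=lambda clf: accuracies[clf][i])
    wins[best] += 1

print(spread)  # large per-classifier deviation across datasets
print(wins)    # even the simple Naive Bayes wins on some datasets
```

In this toy table Naive Bayes has both the largest accuracy deviation and the most wins, mirroring (by construction) the pattern the analysis reports.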
ICPRAM 2012 - International Conference on Pattern Recognition Applications and Methods