Table 2: Impact of the central components of the ECA optimization algorithm on the accuracy values and optimization times (mean ± 1σ).

Configuration                       Cross-validation acc.   Generalization acc.   Optimization time [min]
ECA-full                            0.935 ± 0.015           0.951 ± 0.013          73.74 ± 20.28
No holistic cross-validation        1.000 ± 0.0 (!)         0.287 ± 0.022 (!)      59.95 ± 9.16
No early discarding system          0.938 ± 0.024           0.933 ± 0.027         276.15 ± 59.64 (!)
No initial population improvement   0.912 ± 0.022 (!)       0.925 ± 0.025 (!)      59.30 ± 14.95
three aspects of the optimization algorithm itself. First, it was shown that the incorporation of the feature transform into the cross-validation process is indispensable: without this holistic cross-validation, the cross-validation accuracy rises to a perfect but meaningless 1.000 while the generalization accuracy collapses to 0.287 (Table 2).
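As an illustration only, the following sketch contrasts the leaky evaluation, in which the feature transform is fitted once on the full dataset before cross-validation, with the holistic variant, in which it is refitted inside every fold. PCA and an SVM merely stand in for the optimized feature transform and classifier; the drastic gap in Table 2 additionally stems from the evolutionary search exploiting the leaky score, which a single run of this sketch will not reproduce.

    # Minimal sketch of holistic cross-validation, assuming scikit-learn;
    # PCA and SVC are stand-ins for the optimized pipeline components.
    from sklearn.datasets import make_classification
    from sklearn.decomposition import PCA
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=200, n_features=50, random_state=0)

    # Leaky variant: the transform sees all samples, including every
    # future validation fold, before the folds are scored.
    X_leaky = PCA(n_components=10).fit_transform(X)
    leaky = cross_val_score(SVC(), X_leaky, y, cv=5)

    # Holistic variant: the transform is refitted on the training part
    # of each fold, so validation data never shapes the representation.
    holistic = cross_val_score(make_pipeline(PCA(n_components=10), SVC()),
                               X, y, cv=5)

    print(leaky.mean(), holistic.mean())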
Second, the early discarding system reduces the optimization time by almost a factor of four (73.74 min versus 276.15 min without it) while the resulting accuracy values are not negatively affected.
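The exact discarding rule is not restated here, so the following is only a plausible sketch of such a system, under assumed mechanics: a candidate configuration is abandoned mid-evaluation as soon as even perfect scores on its remaining folds could no longer beat the best mean accuracy found so far. The function name and the optimistic bound are illustrative assumptions.

    # Hypothetical early discarding during fold-wise evaluation; the
    # optimistic-bound rule is an assumption, not the paper's mechanism.
    import numpy as np
    from sklearn.base import clone
    from sklearn.model_selection import StratifiedKFold

    def evaluate_with_early_discarding(candidate, X, y, best_mean, n_folds=5):
        """Mean CV accuracy of `candidate`, or None if discarded early."""
        splitter = StratifiedKFold(n_splits=n_folds, shuffle=True,
                                   random_state=0)
        scores = []
        for i, (train_idx, val_idx) in enumerate(splitter.split(X, y)):
            model = clone(candidate)
            model.fit(X[train_idx], y[train_idx])
            scores.append(model.score(X[val_idx], y[val_idx]))
            # Optimistic bound: assume accuracy 1.0 on all remaining folds.
            bound = (sum(scores) + (n_folds - i - 1)) / n_folds
            if bound < best_mean:
                return None  # cannot beat the incumbent, stop evaluating
        return float(np.mean(scores))

Because hopeless candidates only pay for their first few folds, the population can be evaluated far faster, and a bound of this form never discards a candidate that could still have beaten the incumbent.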
Third, the incorporation of prior knowledge about the importance of features into the optimization algorithm improved the accuracy.
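One way to realize such seeding, sketched here under assumed mechanics (the helper and the probability mapping are illustrative, not the paper's method), is to bias the initial feature-selection bitmasks towards features that a random forest ranks as important:

    # Hypothetical importance-biased initialization of the population's
    # feature-selection bitmasks; the probability mapping is an assumption.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def seeded_initial_population(X, y, pop_size, seed=0):
        rng = np.random.default_rng(seed)
        forest = RandomForestClassifier(n_estimators=200, random_state=seed)
        forest.fit(X, y)
        imp = forest.feature_importances_
        # Keep probabilities inside (0.1, 0.9) so every feature can still
        # be explored and none is selected deterministically.
        p = 0.1 + 0.8 * imp / imp.max()
        return rng.random((pop_size, X.shape[1])) < p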
Ultimately, it can be concluded that not only the number of optimized components matters, but also the suitability of the optimization algorithm itself. However, further experiments on other datasets need to be conducted to explore the full variety of effects arising from the complex interplay of the machine learning components.