Tobias Berka, Helmut A. Mayer


Predicting the class membership of a set of patterns represented by points in a multi-dimensional space critically depends on their specific distribution. To improve the classification performance, pattern vectors may be transformed. There is a range of linear methods for feature construction, but these are often limited in their performance. Nonlinear methods are a more recent development in this field, but these pose difficult optimization problems. Evolutionary approaches have been used to optimize both linear and nonlinear functions for feature construction. For nonlinear feature construction, a particular problem is how to encode the function in order to limit the huge search space while preserving enough flexibility to evolve effective solutions. In this paper, we present a new method for generating a nonlinear function for feature construction using multi-layer perceptrons whose weights are shaped by evolution. By pre-defining the architecture of the neural network we can directly influence the computational capacity of the function and the number of features to be constructed. We evaluate the suggested neural feature construction on four commonly used data sets and report an improvement in classification accuracy ranging from 4 to 13 percentage points over the performance on the original pattern set.
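The idea described above can be sketched in code. The following is a minimal, hypothetical illustration (not the authors' exact method): a one-hidden-layer perceptron with a pre-defined architecture maps input patterns to a fixed number of constructed features, and its weights are optimized by a simple (1+1)-style evolutionary hill climber, with fitness given by a nearest-centroid classifier on the constructed features. All function names, the network size, and the evolutionary parameters are illustrative assumptions.

```python
import numpy as np

def mlp_features(W1, W2, X):
    """Fixed-architecture MLP: tanh hidden layer, linear output.

    The shape of W2 determines how many features are constructed.
    """
    return np.tanh(X @ W1) @ W2

def fitness(W1, W2, X, y):
    """Training accuracy of a nearest-centroid classifier on the features."""
    F = mlp_features(W1, W2, X)
    classes = np.unique(y)
    centroids = np.array([F[y == c].mean(axis=0) for c in classes])
    # Squared distance from every pattern to every class centroid.
    d = ((F[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((classes[d.argmin(axis=1)] == y).mean())

def evolve(X, y, n_hidden=8, n_features=2, pop=20, gens=50, sigma=0.3, seed=0):
    """Evolve the MLP weights by Gaussian mutation, keeping the best."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    best = (rng.normal(0, 0.5, (d, n_hidden)),
            rng.normal(0, 0.5, (n_hidden, n_features)))
    best_fit = fitness(*best, X, y)
    for _ in range(gens):
        for _ in range(pop):
            cand = (best[0] + rng.normal(0, sigma, best[0].shape),
                    best[1] + rng.normal(0, sigma, best[1].shape))
            f = fitness(*cand, X, y)
            if f >= best_fit:
                best, best_fit = cand, f
    return best, best_fit

# Toy two-class problem (XOR-like), not linearly separable in the input space.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, (200, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)
(W1, W2), acc = evolve(X, y)
print(f"training accuracy on constructed features: {acc:.2f}")
```

Because the network architecture is fixed before evolution starts, the number of constructed features (here `n_features=2`) and the computational capacity of the mapping (here `n_hidden=8`) are chosen directly, which keeps the search space limited to the weight values.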



Paper Citation

in Harvard Style

Berka T. and Mayer H. (2012). NONLINEAR FEATURE CONSTRUCTION WITH EVOLVED NEURAL NETWORKS FOR CLASSIFICATION PROBLEMS. In Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-8425-98-0, pages 35-44. DOI: 10.5220/0003754200350044

in Bibtex Style

@conference{berka2012,
author={Tobias Berka and Helmut A. Mayer},
title={NONLINEAR FEATURE CONSTRUCTION WITH EVOLVED NEURAL NETWORKS FOR CLASSIFICATION PROBLEMS},
booktitle={Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2012},
pages={35-44},
doi={10.5220/0003754200350044},
isbn={978-989-8425-98-0},
}

in EndNote Style

TI - NONLINEAR FEATURE CONSTRUCTION WITH EVOLVED NEURAL NETWORKS FOR CLASSIFICATION PROBLEMS
JO - Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
SN - 978-989-8425-98-0
AU - Berka T.
AU - Mayer H.
PY - 2012
SP - 35
EP - 44
DO - 10.5220/0003754200350044