A Walsh Analysis of Multilayer Perceptron Function
Kevin Swingler
2014
Abstract
The multilayer perceptron (MLP) is a widely used neural network architecture, but it suffers from the fact that its knowledge representation is not readily interpreted. Hidden neurons take the role of feature detectors, but the popular learning algorithms (back-propagation of error, for example), coupled with random starting weights, mean that the function implemented by a trained MLP can be difficult to analyse. This paper proposes a method for understanding the structure of the function learned by MLPs that model functions of the class f : {-1, 1}^n → R^m. The approach characterises a given MLP using Walsh functions, which make the interactions among subsets of variables explicit. Demonstrations are presented of this analysis being used to monitor complexity during learning, to understand function structure, and to measure the generalisation ability of trained networks.
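As a rough, self-contained illustration of the idea rather than the paper's own algorithm, the Python sketch below estimates the Walsh coefficients of a small fixed-weight MLP by exhaustive enumeration over {-1, 1}^n. The network weights and the helper names (walsh_coefficients, mlp) are illustrative assumptions, not taken from the paper.

import itertools
import numpy as np

def walsh_coefficients(f, n):
    """Exhaustively estimate Walsh coefficients of f : {-1, 1}^n -> R.

    Each coefficient w_S = 2^-n * sum_x f(x) * prod_{i in S} x_i measures
    how strongly the variable subset S interacts in the function's output.
    """
    inputs = list(itertools.product([-1, 1], repeat=n))
    coeffs = {}
    # Iterate over every subset S of the n input variables (including the empty set).
    for subset in itertools.chain.from_iterable(
            itertools.combinations(range(n), k) for k in range(n + 1)):
        total = 0.0
        for x in inputs:
            psi = 1
            for i in subset:
                psi *= x[i]          # Walsh (parity) function on subset S
            total += f(np.array(x)) * psi
        coeffs[subset] = total / len(inputs)
    return coeffs

# Example: a tiny one-hidden-layer MLP with tanh activations.
# These weights are made up for the demonstration, not taken from the paper.
W1 = np.array([[0.8, -0.5, 0.3], [0.2, 0.9, -0.7]])   # hidden weights (2 x 3)
b1 = np.array([0.1, -0.2])                             # hidden biases
W2 = np.array([0.6, -1.1])                             # output weights
b2 = 0.05                                              # output bias

def mlp(x):
    return float(W2 @ np.tanh(W1 @ x + b1) + b2)

for subset, w in walsh_coefficients(mlp, 3).items():
    print(subset, round(w, 4))

For n = 3 the enumeration visits all 8 inputs; the magnitude of each coefficient indicates how strongly that subset of inputs contributes jointly to the network's output, which is the kind of structural summary the abstract describes.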
Paper Citation
in Harvard Style
Swingler, K. (2014). A Walsh Analysis of Multilayer Perceptron Function. In Proceedings of the International Conference on Neural Computation Theory and Applications - Volume 1: NCTA (IJCCI 2014), ISBN 978-989-758-054-3, pages 5-14. DOI: 10.5220/0004974800050014
in Bibtex Style
@conference{ncta14,
author={Kevin Swingler},
title={A Walsh Analysis of Multilayer Perceptron Function},
booktitle={Proceedings of the International Conference on Neural Computation Theory and Applications - Volume 1: NCTA, (IJCCI 2014)},
year={2014},
pages={5-14},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004974800050014},
isbn={978-989-758-054-3},
}
in EndNote Style
TY - CONF
JO - Proceedings of the International Conference on Neural Computation Theory and Applications - Volume 1: NCTA, (IJCCI 2014)
TI - A Walsh Analysis of Multilayer Perceptron Function
SN - 978-989-758-054-3
AU - Swingler K.
PY - 2014
SP - 5
EP - 14
DO - 10.5220/0004974800050014