Perceptron Learning for Classification Problems - Impact of Cost-Sensitivity and Outliers Robustness

Philippe Thomas


In learning approaches for classification problems, different types of misclassification error may have different impacts. To take this notion of misclassification cost into account, cost-sensitive learning algorithms have been proposed, in particular for training multilayer perceptrons. Moreover, data are often corrupted by outliers, and in particular by label noise. To address this problem, robust criteria have been proposed to reduce the impact of these outliers on the accuracy of the classifier. This paper proposes to associate a cost-sensitivity weight with a robust learning rule in order to address both problems simultaneously. The proposed learning rule is tested and compared on a simulation example. The impact of the presence or absence of outliers is investigated, and the influence of the costs is also studied. The results show that the joint use of a cost-sensitivity weight and a robust criterion improves classifier accuracy.
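As a rough illustration of the idea of pairing a cost-sensitivity weight with a robust criterion, the sketch below scales a per-sample Huber loss by a class-dependent cost. This is an assumption made for illustration only: the abstract does not state which robust criterion or weighting scheme the paper uses, so Huber's criterion is swapped in as a stand-in, and the names `huber`, `cost_weighted_robust_loss`, `cost_per_class` and `delta` are hypothetical.

```python
def huber(residual, delta=1.0):
    """Huber criterion: quadratic for small residuals, linear for large ones."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual ** 2
    return delta * (a - 0.5 * delta)


def cost_weighted_robust_loss(targets, outputs, cost_per_class, delta=1.0):
    """Mean of per-sample Huber losses, each scaled by the cost of its true class.

    Assumption: the cost weight depends only on the true class label.
    """
    total = 0.0
    for t, y in zip(targets, outputs):
        total += cost_per_class[t] * huber(t - y, delta)
    return total / len(targets)


# Hypothetical data: errors on class 1 cost five times as much as on class 0.
targets = [0, 1, 1, 0]
outputs = [0.1, 0.4, 0.9, 3.0]  # the last output is a gross outlier
costs = {0: 1.0, 1: 5.0}
loss = cost_weighted_robust_loss(targets, outputs, costs, delta=0.5)
print(loss)
```

With a plain quadratic criterion the outlying sample would contribute 0.5 × 3² = 4.5 to the sum; under the Huber criterion with `delta=0.5` its contribution grows only linearly, so the cost-weighted errors on the expensive class carry more of the total.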



Paper Citation

in Harvard Style

Thomas P. (2015). Perceptron Learning for Classification Problems - Impact of Cost-Sensitivity and Outliers Robustness. In Proceedings of the 7th International Joint Conference on Computational Intelligence - Volume 3: NCTA, (ECTA 2015) ISBN 978-989-758-157-1, pages 106-113. DOI: 10.5220/0005594001060113

in Bibtex Style

author={Philippe Thomas},
title={Perceptron Learning for Classification Problems - Impact of Cost-Sensitivity and Outliers Robustness},
booktitle={Proceedings of the 7th International Joint Conference on Computational Intelligence - Volume 3: NCTA, (ECTA 2015)},

in EndNote Style

JO - Proceedings of the 7th International Joint Conference on Computational Intelligence - Volume 3: NCTA, (ECTA 2015)
TI - Perceptron Learning for Classification Problems - Impact of Cost-Sensitivity and Outliers Robustness
SN - 978-989-758-157-1
AU - Thomas P.
PY - 2015
SP - 106
EP - 113
DO - 10.5220/0005594001060113