Learning Kernel Label Decompositions for Ordinal Classification Problems

M. Pérez-Ortiz, P. A. Gutiérrez, C. Hervás-Martínez

2014

Abstract

This paper deals with the idea of decomposing ordinal multiclass classification problems when working with kernel methods. The kernel parameters are optimised for each classification subtask in order to better adjust the kernel to the data. More flexible multi-scale Gaussian kernels are considered to increase the goodness of fit of the kernel matrices. Instead of learning independent models for all the subtasks, the optimum convex combination of the kernel matrices is then obtained, leading to a single model able to better discriminate the classes in the feature space. The results of the proposed algorithm show promising potential for the acquisition of better-suited kernels.
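The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: the ordinal problem is decomposed into threshold subtasks ("is the label greater than k?"), each subtask's kernel width is chosen by a small grid search that maximises kernel-target alignment, and the convex weights are simply taken proportional to those alignment scores rather than optimised. All function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(X, gamma):
    """Gaussian kernel with one width per feature (multi-scale when gamma is a vector)."""
    Xs = X * np.sqrt(gamma)                      # scale feature d by sqrt(gamma_d)
    sq = np.sum(Xs ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Xs @ Xs.T
    return np.exp(-np.maximum(d2, 0.0))          # clip tiny negative round-off

def ordinal_subtasks(y, n_classes):
    """Assumed decomposition: one {-1,+1} target per threshold 'y > k'."""
    return [np.where(y > k, 1.0, -1.0) for k in range(n_classes - 1)]

def alignment(K, t):
    """Kernel-target alignment between K and the ideal kernel t t^T."""
    T = np.outer(t, t)
    return np.sum(K * T) / (np.linalg.norm(K) * np.linalg.norm(T))

def combine_kernels(kernels, weights):
    """Convex combination: weights are normalised to sum to one."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * Ki for wi, Ki in zip(w, kernels))

def build_combined_kernel(X, y, n_classes, grid=(0.01, 0.1, 1.0, 10.0)):
    """Fit one kernel per subtask, then merge them into a single matrix."""
    kernels, scores = [], []
    for t in ordinal_subtasks(y, n_classes):
        # Grid search over a shared width is a stand-in for a real optimiser.
        cands = [gaussian_kernel(X, np.full(X.shape[1], g)) for g in grid]
        best = max(cands, key=lambda K: alignment(K, t))
        kernels.append(best)
        scores.append(alignment(best, t))
    # Illustrative weighting only: proportional to each subtask's alignment.
    K = combine_kernels(kernels, scores)
    return K, np.asarray(scores) / np.sum(scores)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(12, 3))
    y = np.array([0, 1, 2] * 4)
    K, w = build_combined_kernel(X, y, n_classes=3)
    print(K.shape, w)
```

The resulting matrix could be passed to any kernel machine that accepts a precomputed Gram matrix, which is what makes a single model possible in place of one model per subtask.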



Paper Citation


in Harvard Style

Pérez-Ortiz M., Gutiérrez P. A. and Hervás-Martínez C. (2014). Learning Kernel Label Decompositions for Ordinal Classification Problems. In Proceedings of the International Conference on Neural Computation Theory and Applications - Volume 1: NCTA, (IJCCI 2014) ISBN 978-989-758-054-3, pages 218-225. DOI: 10.5220/0005079302180225


in Bibtex Style

@conference{ncta14,
author={M. Pérez-Ortiz and P. A. Gutiérrez and C. Hervás-Martínez},
title={Learning Kernel Label Decompositions for Ordinal Classification Problems},
booktitle={Proceedings of the International Conference on Neural Computation Theory and Applications - Volume 1: NCTA, (IJCCI 2014)},
year={2014},
pages={218-225},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005079302180225},
isbn={978-989-758-054-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Neural Computation Theory and Applications - Volume 1: NCTA, (IJCCI 2014)
TI - Learning Kernel Label Decompositions for Ordinal Classification Problems
SN - 978-989-758-054-3
AU - Pérez-Ortiz M.
AU - Gutiérrez P. A.
AU - Hervás-Martínez C.
PY - 2014
SP - 218
EP - 225
DO - 10.5220/0005079302180225