A CONCEPTUAL STUDY OF MODEL SELECTION IN CLASSIFICATION - Multiple Local Models vs One Global Model

R. Vilalta, F. Ocegueda-Hernandez, C. Bagaria

Abstract

A key issue in model selection is understanding how model complexity can be modified to improve generalization performance. One design alternative is to increase the complexity of a single global model (e.g., by increasing the degree of a polynomial function); another is to combine multiple local models into a composite model. We provide a conceptual study that compares these two alternatives. Following the Structural Risk Minimization framework, we derive bounds on the maximum number of local models, or folds, below which the composite model remains at an advantage over the single global model. Our results can be instrumental in the design of learning algorithms that exert better control over model complexity.
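
As a rough illustration of the kind of quantity that Structural Risk Minimization trades off, the sketch below evaluates the standard VC confidence term from statistical learning theory, sqrt((h(ln(2n/h) + 1) - ln(eta/4)) / n), where h is the VC dimension, n the sample size, and 1 - eta the confidence level. It is not the derivation in the paper: the function names `vc_confidence` and `risk_bound` and all numeric values are assumptions chosen for illustration only.

```python
import math


def vc_confidence(h: int, n: int, eta: float = 0.05) -> float:
    """VC confidence term of the standard bound (holds with probability 1 - eta).

    h   : VC dimension of the hypothesis class (assumed known)
    n   : number of training examples
    eta : 1 - confidence level
    """
    if h <= 0 or n <= h:
        raise ValueError("this sketch assumes 0 < h < n")
    if not 0.0 < eta < 1.0:
        raise ValueError("eta must lie in (0, 1)")
    numerator = h * (math.log(2 * n / h) + 1) - math.log(eta / 4)
    return math.sqrt(numerator / n)


def risk_bound(emp_risk: float, h: int, n: int, eta: float = 0.05) -> float:
    """Upper bound on expected risk: empirical risk plus VC confidence."""
    return emp_risk + vc_confidence(h, n, eta)


if __name__ == "__main__":
    n = 10_000
    # Hypothetical values: a global model with a larger VC dimension and
    # lower training error versus a simpler model with higher training error.
    print("complex model bound:", risk_bound(emp_risk=0.05, h=200, n=n))
    print("simple model bound: ", risk_bound(emp_risk=0.08, h=20, n=n))
```

For fixed n, the confidence term grows with h, so a richer model has to lower its empirical risk by more than the added term to end up with a better bound. Comparing a composite of local models against a global model in this way would additionally require a VC dimension for the composite class, which this sketch leaves as an input.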


Paper Citation


in Harvard Style

Vilalta R., Ocegueda-Hernandez F. and Bagaria C. (2010). A CONCEPTUAL STUDY OF MODEL SELECTION IN CLASSIFICATION - Multiple Local Models vs One Global Model. In Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-674-021-4, pages 113-118. DOI: 10.5220/0002733601130118


in Bibtex Style

@conference{icaart10,
author={R. Vilalta and F. Ocegueda-Hernandez and C. Bagaria},
title={A CONCEPTUAL STUDY OF MODEL SELECTION IN CLASSIFICATION - Multiple Local Models vs One Global Model},
booktitle={Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2010},
pages={113-118},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002733601130118},
isbn={978-989-674-021-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - A CONCEPTUAL STUDY OF MODEL SELECTION IN CLASSIFICATION - Multiple Local Models vs One Global Model
SN - 978-989-674-021-4
AU - Vilalta R.
AU - Ocegueda-Hernandez F.
AU - Bagaria C.
PY - 2010
SP - 113
EP - 118
DO - 10.5220/0002733601130118