A DYNAMIC WRAPPER METHOD FOR FEATURE DISCRETIZATION AND SELECTION

Artur Ferreira, Mario Figueiredo

2012

Abstract

In many learning problems, an adequate (sometimes discrete) representation of the data is necessary. For instance, with a large number of features and a small number of instances, learning algorithms may be confronted with the curse of dimensionality and need to address it in order to be effective. Feature selection and feature discretization techniques have been used to achieve adequate representations of the data, by selecting a suitable subset of features with a convenient representation. In this paper, we propose static and dynamic methods for feature discretization. The static method is unsupervised. The dynamic method uses a wrapper approach with a quantizer and a classifier, and it can be coupled with any static (unsupervised or supervised) discretization procedure. The proposed methods attain efficient representations that are suitable for learning problems. Moreover, using well-known feature selection methods with the features discretized by our methods leads to better accuracy than with the features discretized by other methods, or even with the original features.
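
The wrapper idea described in the abstract (a quantizer whose output is scored by a classifier) can be illustrated with a minimal sketch. This is not the authors' algorithm: the uniform quantizer, the per-bin majority-vote classifier, the per-feature search over bit depths, and all function names below are assumptions chosen only to keep the example short and self-contained.

```python
import numpy as np

def uniform_quantize(x, bits, lo, hi):
    """Map values to integer bins 0 .. 2**bits - 1 over [lo, hi] (unsupervised)."""
    levels = 2 ** bits
    if hi == lo:
        return np.zeros(len(x), dtype=int)
    scaled = (np.asarray(x, dtype=float) - lo) / (hi - lo)
    # Values outside [lo, hi] are clipped into the first or last bin.
    return np.clip((scaled * levels).astype(int), 0, levels - 1)

def bin_majority_error(q_train, y_train, q_val, y_val, levels):
    """Hypothetical stand-in classifier: each bin predicts its majority training class."""
    classes = np.unique(y_train)
    default = classes[np.argmax([(y_train == c).sum() for c in classes])]
    table = np.full(levels, default)  # empty bins fall back to the overall majority
    for b in range(levels):
        in_bin = y_train[q_train == b]
        if in_bin.size:
            table[b] = classes[np.argmax([(in_bin == c).sum() for c in classes])]
    return float(np.mean(table[q_val] != y_val))

def wrapper_discretize(X_train, y_train, X_val, y_val, max_bits=4):
    """Per feature, return (bits, error) for the smallest bit depth with lowest error."""
    chosen = []
    for j in range(X_train.shape[1]):
        lo, hi = X_train[:, j].min(), X_train[:, j].max()
        best_bits, best_err = 1, np.inf
        for bits in range(1, max_bits + 1):
            q_tr = uniform_quantize(X_train[:, j], bits, lo, hi)
            q_va = uniform_quantize(X_val[:, j], bits, lo, hi)
            err = bin_majority_error(q_tr, y_train, q_va, y_val, 2 ** bits)
            if err < best_err:  # strict '<' keeps the coarsest quantizer on ties
                best_bits, best_err = bits, err
        chosen.append((best_bits, best_err))
    return chosen
```

Calling `wrapper_discretize(X_train, y_train, X_val, y_val)` returns one `(bits, error)` pair per feature. In this sketch the per-feature error could also serve as a crude relevance score for a subsequent feature selection step, although the abstract does not specify how the two steps are combined.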



Paper Citation


in Harvard Style

Ferreira A. and Figueiredo M. (2012). A DYNAMIC WRAPPER METHOD FOR FEATURE DISCRETIZATION AND SELECTION. In Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-8425-98-0, pages 103-112. DOI: 10.5220/0003788201030112


in Bibtex Style

@conference{icpram12,
author={Artur Ferreira and Mario Figueiredo},
title={A DYNAMIC WRAPPER METHOD FOR FEATURE DISCRETIZATION AND SELECTION},
booktitle={Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2012},
pages={103-112},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003788201030112},
isbn={978-989-8425-98-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - A DYNAMIC WRAPPER METHOD FOR FEATURE DISCRETIZATION AND SELECTION
SN - 978-989-8425-98-0
AU - Ferreira A.
AU - Figueiredo M.
PY - 2012
SP - 103
EP - 112
DO - 10.5220/0003788201030112