Improving a Fuzzy Discretization Process by Bagging

José Manuel Cadenas, María del Carmen Garrido, Raquel Martínez

2013

Abstract

Classification problems in which the number of attributes exceeds the number of examples are increasingly common with rapid technological advances in data collection. In addition, numerical data predominate in real-world applications, yet many supervised learning algorithms are restricted to discrete attributes. Focusing on these issues, we propose an improvement to a fuzzy discretization method by introducing a bagging process into the different phases of the method. The bagging process aims to address problems that can arise with small datasets. We also show the benefits that bagging brings to the method through several experiments, which are validated by means of statistical tests.

Paper Citation


in Harvard Style

Cadenas J. M., Garrido M. C. and Martínez R. (2013). Improving a Fuzzy Discretization Process by Bagging. In Proceedings of the 5th International Joint Conference on Computational Intelligence - Volume 1: FCTA, (IJCCI 2013) ISBN 978-989-8565-77-8, pages 201-212. DOI: 10.5220/0004553402010212


in Bibtex Style

@conference{fcta13,
author={José Manuel Cadenas and María del Carmen Garrido and Raquel Martínez},
title={Improving a Fuzzy Discretization Process by Bagging},
booktitle={Proceedings of the 5th International Joint Conference on Computational Intelligence - Volume 1: FCTA, (IJCCI 2013)},
year={2013},
pages={201-212},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004553402010212},
isbn={978-989-8565-77-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 5th International Joint Conference on Computational Intelligence - Volume 1: FCTA, (IJCCI 2013)
TI - Improving a Fuzzy Discretization Process by Bagging
SN - 978-989-8565-77-8
AU - Cadenas, J. M.
AU - Garrido, M. C.
AU - Martínez, R.
PY - 2013
SP - 201
EP - 212
DO - 10.5220/0004553402010212