Naive Bayes Classifier with Mixtures of Polynomials

J. Luengo, Rafael Rumí

Abstract

We present a methodology for including continuous features in the Naive Bayes classifier by estimating the density function of each continuous variable with the Mixtures of Polynomials model. Three new issues are addressed for this model: i) a classification-oriented parameter estimation procedure, ii) a feature selection procedure, and iii) the definition of a new kind of variable for handling variables that are continuous in theory but whose empirical behavior makes density estimation difficult. These methods are compared against the classical discrete and Gaussian Naive Bayes classifiers, as well as classification trees.
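The idea in the abstract can be illustrated with a minimal sketch: a Naive Bayes classifier whose per-class, per-feature continuous densities are approximated by polynomials fitted by least squares to histogram estimates. The class name `PolyNB`, the histogram-based fit, and the use of a single polynomial piece (rather than a true mixture of polynomial pieces over subintervals) are simplifying assumptions for illustration, not the authors' actual estimation procedure.

```python
import numpy as np

class PolyNB:
    """Illustrative Naive Bayes with polynomial density estimates for
    continuous features (a sketch, not the paper's MoP method)."""

    def __init__(self, degree=3, bins=10):
        self.degree = degree
        self.bins = bins

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.priors_ = {c: np.mean(y == c) for c in self.classes_}
        # (class, feature) -> (poly coefficients, support lo, support hi)
        self.models_ = {}
        for c in self.classes_:
            Xc = X[y == c]
            for j in range(X.shape[1]):
                col = Xc[:, j]
                lo, hi = col.min(), col.max()
                # Histogram density estimate, then least-squares polynomial fit
                hist, edges = np.histogram(col, bins=self.bins,
                                           range=(lo, hi), density=True)
                centers = 0.5 * (edges[:-1] + edges[1:])
                coeffs = np.polyfit(centers, hist, self.degree)
                self.models_[(c, j)] = (coeffs, lo, hi)
        return self

    def _density(self, c, j, x):
        coeffs, lo, hi = self.models_[(c, j)]
        x = np.clip(x, lo, hi)          # evaluate inside the fitted support
        return max(np.polyval(coeffs, x), 1e-12)  # keep strictly positive

    def predict(self, X):
        preds = []
        for row in X:
            # Naive Bayes: log prior plus sum of per-feature log densities
            scores = {c: np.log(self.priors_[c])
                      + sum(np.log(self._density(c, j, x))
                            for j, x in enumerate(row))
                      for c in self.classes_}
            preds.append(max(scores, key=scores.get))
        return np.array(preds)
```

The clipping to the fitted support and the floor at a small positive constant stand in for the truncation and non-negativity constraints that a proper mixture-of-polynomials density must satisfy.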

References

  1. Aguilera, P. A., Fernández, A., Reche, F., and Rumí, R. (2010). Hybrid Bayesian network classifiers: Application to species distribution models. Environmental Modelling & Software, 25:1630-1639.
  2. Alcalá-Fdez, J., Fernandez, A., Luengo, J., Derrac, J., García, S., Sánchez, L., and Herrera, F. (2011). Keel data-mining software tool: Data set repository, integration of algorithms and experimental analysis framework. Journal of Multiple-Valued Logic and Soft Computing, 17:255-287.
  3. Bache, K. and Lichman, M. (2013). UCI machine learning repository. http://archive.ics.uci.edu/ml.
  4. Bouckaert, R. (2004). Naive Bayes classifiers that perform well with continuous variables. In Proc. of the 17th Australian Conference on Artificial Intelligence, pages 1089-1094.
  5. Breiman, L., Friedman, J. H., Olshen, R. A., and Stone, C. J. (1984). Classification and regression trees. Chapman & Hall/CRC.
  6. Cowell, R. G., Dawid, A. P., Lauritzen, S. L., and Spiegelhalter, D. J. (1999). Probabilistic Networks and Expert Systems. Statistics for Engineering and Information Science. Springer.
  7. Dash, M. and Liu, H. (1997). Feature selection for classification. Intelligent Data Analysis, 1(3):131-156.
  8. Domingos, P. and Pazzani, M. (1996). Beyond independence: Conditions for the optimality of the simple Bayesian classifier. In Proceedings of the International Conference on Machine Learning.
  9. Domingos, P. and Pazzani, M. (1997). On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning, 29:103-130.
  10. Dougherty, J., Kohavi, R., and Sahami, M. (1995). Supervised and unsupervised discretization of continuous features. In Prieditis, A. and Russell, S., editors, Machine Learning: Proceedings of the Twelfth International Conference, pages 194-202. Morgan Kaufmann, San Francisco.
  11. Fayyad, U. M. and Irani, K. B. (1993). Multi-interval discretization of continuous-valued attributes for classification learning. In Proceedings of the 13th International Joint Conference on Artificial Intelligence (IJCAI-93), pages 1022-1027.
  12. Fernández, A. and Salmerón, A. (2008). Extension of Bayesian network classifiers to regression problems. In Geffner, H., Prada, R., Alexandre, I. M., and David, N., editors, Advances in Artificial Intelligence - IBERAMIA 2008, volume 5290 of Lecture Notes in Artificial Intelligence, pages 83-92. Springer.
  13. Friedman, N., Geiger, D., and Goldszmidt, M. (1997). Bayesian network classifiers. Machine Learning, 29:131-163.
  14. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., and Witten, I. H. (2009). The WEKA data mining software: an update. SIGKDD Explor. Newsl., 11(1):10-18.
  15. Hollander, M. and Wolfe, D. A. (1999). Nonparametric Statistical Methods. Wiley, 2nd edition.
  16. Jensen, F. V. and Nielsen, T. D. (2007). Bayesian Networks and Decision Graphs. Springer.
  17. John, G. H. and Langley, P. (1995). Estimating continuous distributions in Bayesian classifiers. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, pages 338-345.
  18. Kozlov, D. and Koller, D. (1997). Nonuniform dynamic discretization in hybrid networks. In Geiger, D. and Shenoy, P., editors, Proceedings of the 13th Conference on Uncertainty in Artificial Intelligence, pages 302-313. Morgan & Kaufmann.
  19. Langseth, H., Nielsen, T. D., Pérez-Bernabé, I., and Salmerón, A. (2013). Learning mixtures of truncated basis functions from data. International Journal of Approximate Reasoning.
  20. Lauritzen, S. and Wermuth, N. (1989). Graphical models for associations between variables, some of which are qualitative and some quantitative. The Annals of Statistics, 17:31-57.
  21. López-Cruz, P. L., Bielza, C., and Larrañaga, P. (2013). Learning mixtures of polynomials of multidimensional probability densities from data using B-spline interpolation. International Journal of Approximate Reasoning, In Press.
  22. Lucas, P. J. (2002). Restricted Bayesian network structure learning. In Gámez, J. and Salmerón, A., editors, Proceedings of the 1st European Workshop on Probabilistic Graphical Models (PGM'02), pages 117-126.
  23. Minsky, M. (1963). Steps toward artificial intelligence. Computers and Thought, pages 406-450.
  24. Moral, S., Rumí, R., and Salmerón, A. (2001). Mixtures of Truncated Exponentials in Hybrid Bayesian Networks. In Benferhat, S. and Besnard, P., editors, Symbolic and Quantitative Approaches to Reasoning with Uncertainty, volume 2143 of Lecture Notes in Artificial Intelligence, pages 156-167. Springer.
  25. Morales, M., Rodríguez, C., and Salmerón, A. (2007). Selective naïve Bayes for regression using mixtures of truncated exponentials. International Journal of Uncertainty, Fuzziness and Knowledge Based Systems, 15:697-716.
  26. Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems. Morgan-Kaufmann. San Mateo.
  27. Pérez, A., Larrañaga, P., and Inza, I. (2009). Bayesian classifiers based on kernel density estimation: Flexible classifiers. International Journal of Approximate Reasoning, 50(2):341-362.
  28. R Core Team (2013). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0.
  29. Romero, V., Rumí, R., and Salmerón, A. (2006). Learning hybrid Bayesian networks using mixtures of truncated exponentials. International Journal of Approximate Reasoning, 42:54-68.
  30. Rumí, R., Salmerón, A., and Moral, S. (2006). Estimating mixtures of truncated exponentials in hybrid Bayesian networks. Test, 15:397-421.
  31. Rumí, R., Salmerón, A., and Shenoy, P. P. (2012). Tractable inference in hybrid Bayesian networks with deterministic conditionals using re-approximations. In Proceedings of the Sixth European Workshop on Probabilistic Graphical Models (PGM'2012), pages 275-282.
  32. Sahami, M. (1996). Learning limited dependence Bayesian classifiers. In KDD'96: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, pages 335-338.
  33. Schuster, E. F. (1985). Incorporating support constraints into nonparametric estimators of densities. Communications in Statistics, Part A. Theory and Methods, 14:1123-1136.
  34. Shenoy, P. P. (2011). A re-definition of mixtures of polynomials for inference in hybrid Bayesian networks. In Liu, W., editor, Symbolic and Quantitative Approaches to Reasoning with Uncertainty, Lecture Notes in Artificial Intelligence 6717, pages 98-109. Springer.
  35. Shenoy, P. P., Rumí, R., and Salmerón, A. (2011). Some practical issues in inference in hybrid Bayesian networks with deterministic conditionals. In Proceedings of the International Conference on Intelligent Systems Design and Applications (ISDA).
  36. Shenoy, P. P. and West, J. (2011). Inference in hybrid Bayesian networks using mixtures of polynomials. International Journal of Approximate Reasoning, 52:641-657.
  37. Simonoff, J. (1996). Smoothing methods in Statistics. Springer.


Paper Citation


in Harvard Style

Luengo J. and Rumi R. (2015). Naive Bayes Classifier with Mixtures of Polynomials. In Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-758-076-5, pages 14-24. DOI: 10.5220/0005166000140024


in Bibtex Style

@conference{icpram15,
author={J. Luengo and Rafael Rumi},
title={Naive Bayes Classifier with Mixtures of Polynomials},
booktitle={Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2015},
pages={14-24},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005166000140024},
isbn={978-989-758-076-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Naive Bayes Classifier with Mixtures of Polynomials
SN - 978-989-758-076-5
AU - Luengo J.
AU - Rumi R.
PY - 2015
SP - 14
EP - 24
DO - 10.5220/0005166000140024