Diffusion Ensemble Classifiers

Alon Schclar, Lior Rokach, Amir Amit

Abstract

We present a novel approach to constructing ensemble classifiers based on the Diffusion Maps (DM) dimensionality reduction algorithm. The DM algorithm embeds data into a low-dimensional space according to the connectivity between every pair of points in the ambient space. The ensemble members are trained on dimension-reduced versions of the training set, which are obtained by applying the DM algorithm to the original training set with different values of its input parameter. To classify a test sample, the sample is first embedded into the dimension-reduced space of each individual classifier using the Nyström out-of-sample extension algorithm. Each ensemble member is then applied to the embedded sample, and the classification is obtained according to a voting scheme. A comparison is made with the base classifier, which does not incorporate dimensionality reduction. On average, the results obtained by the proposed algorithm improve on those of the non-ensemble classifier.
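The pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Gaussian kernel scale `eps` plays the role of the varied input parameter, a 1-NN rule stands in for the base classifier, and the data, the `eps_values`, and the embedding dimension `k` are all invented for the example.

```python
import numpy as np
from collections import Counter

def diffusion_map(X, eps, k=2):
    """Embed the training set X with Diffusion Maps at kernel scale eps."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    W = np.exp(-d2 / eps)                                  # Gaussian affinity kernel
    P = W / W.sum(axis=1, keepdims=True)                   # row-stochastic Markov matrix
    lam, V = np.linalg.eig(P)
    order = np.argsort(-lam.real)
    lam, V = lam.real[order], V.real[:, order]
    lam_k, V_k = lam[1:k + 1], V[:, 1:k + 1]               # skip the trivial constant eigenvector
    return lam_k, V_k * lam_k                              # coordinates psi_j = lam_j * phi_j

def nystrom_extend(y, X, eps, lam_k, Psi):
    """Nystrom out-of-sample extension of the embedding to a new sample y."""
    w = np.exp(-((X - y) ** 2).sum(-1) / eps)
    p = w / w.sum()                                        # transition row from y into X
    return p @ (Psi / lam_k)                               # lam_j*phi_j(y) = sum_i p_i phi_j(x_i)

def ensemble_predict(y, X, labels, eps_values, k=2):
    """Majority vote over the ensemble members, one per value of eps."""
    votes = []
    for eps in eps_values:
        lam_k, Psi = diffusion_map(X, eps, k)
        e = nystrom_extend(y, X, eps, lam_k, Psi)
        nn = int(np.argmin(((Psi - e) ** 2).sum(-1)))      # 1-NN stand-in for the base classifier
        votes.append(labels[nn])
    return Counter(votes).most_common(1)[0][0]

# Toy data: two well-separated classes; each eps yields a different embedding,
# and hence a different ensemble member.
X = np.array([[0.0, 0.0], [0.3, 0.1], [0.1, 0.4], [0.2, 0.2],
              [3.0, 3.0], [3.2, 2.9], [2.8, 3.1], [3.1, 3.3]])
labels = [0, 0, 0, 0, 1, 1, 1, 1]
print(ensemble_predict(np.array([0.15, 0.2]), X, labels, [1.0, 2.0, 4.0]))  # → 0
```

Note that the extension is exact on training points: applying `nystrom_extend` to a row of `X` reproduces that row's embedded coordinates, since `P V = V diag(lam)`.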

References

  1. Asuncion, A. and Newman, D. J. (2007). UCI machine learning repository. http://archive.ics.uci.edu/ml/.
  2. Belkin, M. and Niyogi, P. (2003). Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15(6):1373-1396.
  3. Bengio, Y., Delalleau, O., Roux, N. L., Paiement, J. F., Vincent, P., and Ouimet, M. (2004). Learning eigenfunctions links spectral embedding and kernel PCA. Neural Computation, 16(10):2197-2219.
  4. Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2):123-140.
  5. Breiman, L., Friedman, J. H., Olshen, R. A., and Stone, C. J. (1993). Classification and Regression Trees. Chapman & Hall, Inc., New York.
  6. Chung, F. R. K. (1997). Spectral Graph Theory. AMS Regional Conference Series in Mathematics, 92.
  7. Coifman, R. R. and Lafon, S. (2006a). Diffusion maps. Applied and Computational Harmonic Analysis: special issue on Diffusion Maps and Wavelets, 21:5-30.
  8. Coifman, R. R. and Lafon, S. (2006b). Geometric harmonics: a novel tool for multiscale out-of-sample extension of empirical functions. Applied and Computational Harmonic Analysis: special issue on Diffusion Maps and Wavelets, 21:31-52.
  9. Cox, T. and Cox, M. (1994). Multidimensional scaling. Chapman & Hall, London, UK.
  10. Donoho, D. L. and Grimes, C. (2003). Hessian eigenmaps: new locally linear embedding techniques for high-dimensional data. In Proceedings of the National Academy of Sciences, volume 100(10), pages 5591-5596.
  11. Drucker, H. (1997). Improving regressors using boosting techniques. In Fisher, D. H., editor, Proceedings of the 14th International Conference on Machine Learning, pages 107-115. Morgan Kaufmann.
  12. Feher, C., Elovici, Y., Moskovitch, R., Rokach, L., and Schclar, A. (2012). User identity verification via mouse dynamics. Information Sciences, 201:19-36.
  13. Freund, Y. and Schapire, R. (1996). Experiments with a new boosting algorithm. In Machine Learning: Proceedings of the Thirteenth International Conference, pages 148-156, San Francisco. Morgan Kaufmann.
  14. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., and Witten, I. H. (2009). The weka data mining software: An update. SIGKDD Explorations, 11:1.
  16. Hegde, C., Wakin, M., and Baraniuk, R. G. (2007). Random projections for manifold learning. In Neural Information Processing Systems (NIPS).
  17. Hein, M. and Audibert, Y. (2005). Intrinsic dimensionality estimation of submanifolds in Euclidean space. In Proceedings of the 22nd International Conference on Machine Learning, pages 289-296.
  18. Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24:417-441.
  19. Jimenez, L. O. and Landgrebe, D. A. (1998). Supervised classification in high-dimensional space: geometrical, statistical and asymptotical properties of multivariate data. IEEE Transactions on Systems, Man and Cybernetics, Part C: Applications and Reviews, 28(1):39-54.
  20. Kruskal, J. B. (1964). Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika, 29:1-27.
  21. Kuncheva, L. I. (2004). Diversity in multiple classifier systems (editorial). Information Fusion, 6(1):3-4.
  22. Lafon, S., Keller, Y., and Coifman, R. R. (2006). Data fusion and multicue data matching by diffusion maps. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28:1784-1797.
  23. Leigh, W., Purvis, R., and Ragusa, J. M. (2002). Forecasting the NYSE composite index with technical analysis, pattern recognizer, neural networks, and genetic algorithm: a case study in romantic decision support. Decision Support Systems, 32(4):361-377.
  24. Mangiameli, P., West, D., and Rampal, R. (2004). Model selection for medical diagnosis decision support systems. Decision Support Systems, 36(3):247-259.
  25. Nyström, E. J. (1928). Über die praktische Auflösung von linearen Integralgleichungen mit Anwendungen auf Randwertaufgaben der Potentialtheorie. Commentationes Physico-Mathematicae, 4(15):1-52.
  26. Opitz, D. and Maclin, R. (1999). Popular ensemble methods: An empirical study. Journal of Artificial Intelligence Research, 11:169-198.
  27. Plastria, F., Bruyne, S., and Carrizosa, E. (2008). Dimensionality reduction for classification. Advanced Data Mining and Applications, 1:411-418.
  28. Polikar, R. (2006). Ensemble based systems in decision making. IEEE Circuits and Systems Magazine, 6:21-45.
  29. Quinlan, J. R. (1993). C4.5: programs for machine learning. Morgan Kaufmann Publishers Inc.
  30. Rokach, L. (2008). Mining manufacturing data using genetic algorithm-based feature set decomposition. International Journal of Intelligent Systems Technologies and Applications, 4(1/2):57-78.
  31. Roweis, S. T. and Saul, L. K. (2000). Nonlinear dimensionality reduction by locally linear embedding. SCIENCE, 290:2323-2326.
  32. Schclar, A. (2008). A diffusion framework for dimensionality reduction. In Soft Computing for Knowledge Discovery and Data Mining (Editors: O. Maimon and L. Rokach), pages 315-325. Springer.
  33. Schclar, A., Averbuch, A., Rabin, N., Zheludev, V., and Hochman, K. (2010). A diffusion framework for detection of moving vehicles. Digital Signal Processing, 20:111-122.
  34. Schclar, A. and Rokach, L. (2009). Random projection ensemble classifiers. In Lecture Notes in Business Information Processing, Enterprise Information Systems 11th International Conference Proceedings (ICEIS'09), pages 309-316, Milan, Italy.
  35. Schclar, A., Tsikinovsky, A., Rokach, L., Meisels, A., and Antwarg, L. (2009). Ensemble methods for improving the performance of neighborhood-based collaborative filtering. In RecSys, pages 261-264.
  36. Schölkopf, B., Smola, A., and Müller, K.-R. (1998). Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10(5):1299-1319.
  37. Schölkopf, B. and Smola, A. J. (2002). Learning with Kernels. MIT Press, Cambridge, MA.
  38. Solomatine, D. P. and Shrestha, D. L. (2004). Adaboost.rt: A boosting algorithm for regression problems. In Proceedings of the IEEE International Joint Conference on Neural Networks, pages 1163-1168.
  39. Tenenbaum, J. B., de Silva, V., and Langford, J. C. (2000). A global geometric framework for nonlinear dimensionality reduction. Science, 290:2319-2323.
  40. Valentini, G., Muselli, M., and Ruffino, F. (2003). Bagged ensembles of SVMs for gene expression data analysis. In Proceedings of the International Joint Conference on Neural Networks - IJCNN, pages 1844-1849, Portland, OR, USA. Los Alamitos, CA: IEEE Computer Society.
  41. Vapnik, V. N. (1999). The Nature of Statistical Learning Theory (Information Science and Statistics). Springer.
  42. Yang, Z., Nie, X., Xu, W., and Guo, J. (2006). An approach to spam detection by naïve bayes ensemble based on decision induction. In Proceedings of the Sixth International Conference on Intelligent Systems Design and Applications (ISDA'06).


Paper Citation


in Harvard Style

Schclar A., Rokach L. and Amit A. (2012). Diffusion Ensemble Classifiers. In Proceedings of the 4th International Joint Conference on Computational Intelligence - Volume 1: NCTA, (IJCCI 2012) ISBN 978-989-8565-33-4, pages 443-450. DOI: 10.5220/0004102804430450


in Bibtex Style

@conference{ncta12,
author={Alon Schclar and Lior Rokach and Amir Amit},
title={Diffusion Ensemble Classifiers},
booktitle={Proceedings of the 4th International Joint Conference on Computational Intelligence - Volume 1: NCTA, (IJCCI 2012)},
year={2012},
pages={443-450},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004102804430450},
isbn={978-989-8565-33-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 4th International Joint Conference on Computational Intelligence - Volume 1: NCTA, (IJCCI 2012)
TI - Diffusion Ensemble Classifiers
SN - 978-989-8565-33-4
AU - Schclar A.
AU - Rokach L.
AU - Amit A.
PY - 2012
SP - 443
EP - 450
DO - 10.5220/0004102804430450