DATABASES REDUCTION

Roberto Ruiz, José C. Riquelme, Jesús S. Aguilar-Ruiz

Abstract

Progress in digital data acquisition and storage technology has resulted in the growth of huge databases. Nevertheless, the techniques used to analyse these data often have a high computational cost, so it is advisable to apply a preprocessing phase that reduces the time complexity. Preprocessing techniques are fundamentally oriented to one of two goals: horizontal reduction of the database, or feature selection; and vertical reduction, or editing. In this paper we present a new proposal that reduces databases by sequentially applying vertical and horizontal reduction techniques. Both are based on our earlier work and use a projection concept to choose representative examples and features. Results are very satisfactory: the reduced database offers the same performance in the later application of classification techniques while requiring low computational resources.
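To make the idea of projection-based reduction concrete, the following is a minimal sketch and not the algorithms of the paper, which the abstract does not specify. It assumes that "projection" means sorting the examples by the values of a single feature, that a feature is scored by the number of class-label changes along that ordering, and that an example is retained only if it lies next to an example of a different class in at least one retained projection. The function names, the order of the two steps (feature selection first, editing second), the naive handling of tied values, and the toy data are all hypothetical.

```python
from typing import List, Sequence, Tuple


def label_changes(values: Sequence[float], labels: Sequence[str]) -> int:
    """Count class-label changes along one feature after sorting by its values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    return sum(1 for a, b in zip(order, order[1:]) if labels[a] != labels[b])


def select_features(X: Sequence[Sequence[float]], y: Sequence[str], keep: int) -> List[int]:
    """Return the indices of the `keep` features with the fewest label changes."""
    scores = [(label_changes([row[j] for row in X], y), j) for j in range(len(X[0]))]
    return sorted(j for _, j in sorted(scores)[:keep])


def edit_examples(X: Sequence[Sequence[float]], y: Sequence[str],
                  features: Sequence[int]) -> List[int]:
    """Return the indices of examples adjacent to another class in some projection."""
    kept = set()
    for j in features:
        order = sorted(range(len(X)), key=lambda i: X[i][j])
        for pos, i in enumerate(order):
            # Immediate predecessor and successor in the sorted projection.
            neighbours = order[max(pos - 1, 0):pos] + order[pos + 1:pos + 2]
            if any(y[n] != y[i] for n in neighbours):
                kept.add(i)
    return sorted(kept)


def reduce_database(X: Sequence[Sequence[float]], y: Sequence[str],
                    keep: int) -> Tuple[List[List[float]], List[str]]:
    """Apply feature selection, then editing (this order is an assumption)."""
    features = select_features(X, y, keep)
    rows = edit_examples(X, y, features)
    return [[X[i][j] for j in features] for i in rows], [y[i] for i in rows]


# Toy data (hypothetical): feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0], [5.0, 6.0], [6.0, 3.0]]
y = ["a", "a", "a", "b", "b", "b"]

print(reduce_database(X, y, keep=1))  # ([[3.0], [4.0]], ['a', 'b'])
```

In this toy case only feature 0 and the two examples on either side of the class boundary survive, which illustrates how a reduced database can preserve the information a classifier needs at a fraction of the original size.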


Paper Citation


in Harvard Style

Ruiz, R., Riquelme, J. C. and Aguilar-Ruiz, J. S. (2004). DATABASES REDUCTION. In Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 972-8865-00-7, pages 98-103. DOI: 10.5220/0002632300980103


in Bibtex Style

@conference{iceis04,
author={Roberto Ruiz and José C. Riquelme and Jesús S. Aguilar-Ruiz},
title={DATABASES REDUCTION},
booktitle={Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2004},
pages={98-103},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002632300980103},
isbn={972-8865-00-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - DATABASES REDUCTION
SN - 972-8865-00-7
AU - Ruiz, R.
AU - Riquelme, J. C.
AU - Aguilar-Ruiz, J. S.
PY - 2004
SP - 98
EP - 103
DO - 10.5220/0002632300980103