UNSUPERVISED ARTIFICIAL NEURAL NETWORKS FOR CLUSTERING OF DOCUMENT COLLECTIONS

Abdel-Badeeh M. Salem, Mostafa M. Syiam, Ayad F. Ayad

Abstract

The Self-Organizing Map (SOM) has been shown to be a stable neural network model for high-dimensional data analysis. However, its applicability is limited because some prior knowledge about the data is required to define the size of the network. In this paper, the Growing Hierarchical SOM (GHSOM) is proposed. This dynamically growing architecture evolves into a hierarchical structure of self-organizing maps according to the characteristics of the input data. Furthermore, each map is expanded until it represents the corresponding subset of the data at a specific level. We demonstrate the benefits of this novel model using a real-world example from the document-clustering domain. The two models (SOM and GHSOM) are compared to explain their differences and to investigate the benefits of using GHSOM.
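
The idea of expanding a map until it represents its data well enough can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a one-dimensional chain of units instead of a 2-D grid, retrains from scratch each time a unit is added instead of inserting units next to the unit with the highest error, and omits the hierarchical expansion into child maps. The names `train_som`, `grow_until`, and the threshold parameter `tau` are assumptions introduced for illustration only.

```python
import math
import random


def bmu(weights, x):
    """Index of the best-matching unit: the weight vector closest to x."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))


def train_som(data, n_units, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D chain of units (a deliberately simple SOM)."""
    rng = random.Random(seed)
    # Initialise each unit with a copy of a randomly chosen input vector.
    weights = [list(rng.choice(data)) for _ in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs              # decays from 1 towards 0
        lr = lr0 * frac                          # shrinking learning rate
        radius = max(1.0, (n_units / 2) * frac)  # shrinking neighbourhood
        for x in rng.sample(data, len(data)):    # shuffled pass over the data
            b = bmu(weights, x)
            for i, w in enumerate(weights):
                # Gaussian neighbourhood around the best-matching unit.
                h = math.exp(-((i - b) ** 2) / (2 * radius ** 2))
                for d in range(len(w)):
                    w[d] += lr * h * (x[d] - w[d])
    return weights


def mean_quantization_error(weights, data):
    """Mean distance from each input to its best-matching unit."""
    return sum(math.dist(weights[bmu(weights, x)], x) for x in data) / len(data)


def grow_until(data, tau=0.3, max_units=16):
    """Add units until the error falls below tau times the single-unit error."""
    mean = [sum(col) / len(data) for col in zip(*data)]
    # Error of a hypothetical one-unit map placed at the data mean.
    mqe0 = sum(math.dist(mean, x) for x in data) / len(data)
    n_units = 2
    while True:
        weights = train_som(data, n_units)
        mqe = mean_quantization_error(weights, data)
        if mqe <= tau * mqe0 or n_units >= max_units:
            return weights, mqe, mqe0
        n_units += 1  # simplified growth step: one more unit, then retrain
```

Calling `grow_until(vectors, tau=0.3)` on a list of equal-length numeric vectors returns the trained units together with the final and the single-unit error. A smaller `tau` forces a larger map, which is the trade-off that a fixed-size SOM requires the user to settle in advance.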



Paper Citation


in Harvard Style

Salem, A.-B. M., Syiam, M. M. and Ayad, A. F. (2004). UNSUPERVISED ARTIFICIAL NEURAL NETWORKS FOR CLUSTERING OF DOCUMENT COLLECTIONS. In Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 972-8865-00-7, pages 383-392. DOI: 10.5220/0002595203830392


in Bibtex Style

@conference{iceis04,
author={Abdel-Badeeh M. Salem and Mostafa M. Syiam and Ayad F. Ayad},
title={UNSUPERVISED ARTIFICIAL NEURAL NETWORKS FOR CLUSTERING OF DOCUMENT COLLECTIONS},
booktitle={Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2004},
pages={383-392},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002595203830392},
isbn={972-8865-00-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Sixth International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - UNSUPERVISED ARTIFICIAL NEURAL NETWORKS FOR CLUSTERING OF DOCUMENT COLLECTIONS
SN - 972-8865-00-7
AU - Salem, Abdel-Badeeh M.
AU - Syiam, Mostafa M.
AU - Ayad, Ayad F.
PY - 2004
SP - 383
EP - 392
DO - 10.5220/0002595203830392