AUTOLSA: AUTOMATIC DIMENSION REDUCTION OF LSA FOR SINGLE-DOCUMENT SUMMARIZATION
Haidi Badr, Nayer Wanas, Magda Fayek
2010
Abstract
The role of text summarization algorithms is growing in many applications, especially in information retrieval. In this work, we propose a generic single-document summarizer based on Latent Semantic Analysis (LSA). In LSA, the dimension reduction ratio is usually determined experimentally, so the chosen value depends on the data and on the document. We propose a new approach that determines the dimension reduction ratio, DRr, automatically, avoiding the problems of manual determination. The approach is tested on two benchmark datasets, DUC02 and LDC2008T19. The experimental results show that the automatically obtained dimension reduction ratio improves the quality of the text summarization while providing a more suitable value for the DRr.
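To make the role of the DRr concrete, the following is a minimal sketch of a generic LSA-based sentence ranker in which the ratio is passed in by hand. It does not reproduce the automatic selection proposed in the paper, which the abstract does not describe. The function name lsa_summarize, the whitespace tokenizer, the raw term counts, and the scoring rule (length of each sentence vector in the reduced space, weighted by the singular values) are assumptions chosen for illustration.

```python
import numpy as np

def lsa_summarize(sentences, drr=0.5, n_select=1):
    """Rank sentences with LSA, keeping a fraction `drr` of the singular dimensions.

    Hypothetical helper for illustration; `drr` is supplied manually here.
    """
    # Build a term-by-sentence count matrix A (rows: terms, columns: sentences).
    tokenized = [s.lower().split() for s in sentences]
    vocab = sorted({w for toks in tokenized for w in toks})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(sentences)))
    for j, toks in enumerate(tokenized):
        for w in toks:
            A[index[w], j] += 1.0

    # Thin singular value decomposition: A = U @ diag(s) @ Vt.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)

    # Number of retained dimensions implied by the reduction ratio (at least 1).
    r = max(1, int(np.ceil(drr * len(s))))

    # Score each sentence by the length of its vector in the reduced space,
    # with each dimension weighted by its singular value (assumed scoring rule).
    scores = np.sqrt(((Vt[:r, :] ** 2) * (s[:r, None] ** 2)).sum(axis=0))

    # Pick the highest-scoring sentences and return them in document order.
    top = sorted(np.argsort(-scores)[:n_select])
    return [sentences[i] for i in top], r, scores


summary, r, scores = lsa_summarize(
    ["a b", "a b c d", "e"], drr=0.5, n_select=1
)
```

Changing drr changes r and therefore the scores, which is why a manually tuned ratio ends up being data and document dependent.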
Paper Citation
in Harvard Style
Badr H., Wanas N. and Fayek M. (2010). AUTOLSA: AUTOMATIC DIMENSION REDUCTION OF LSA FOR SINGLE-DOCUMENT SUMMARIZATION. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010) ISBN 978-989-8425-28-7, pages 444-448. DOI: 10.5220/0003091904440448
in Bibtex Style
@conference{kdir10,
author={Haidi Badr and Nayer Wanas and Magda Fayek},
title={AUTOLSA: AUTOMATIC DIMENSION REDUCTION OF LSA FOR SINGLE-DOCUMENT SUMMARIZATION},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)},
year={2010},
pages={444-448},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003091904440448},
isbn={978-989-8425-28-7},
}
in EndNote Style
TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)
TI - AUTOLSA: AUTOMATIC DIMENSION REDUCTION OF LSA FOR SINGLE-DOCUMENT SUMMARIZATION
SN - 978-989-8425-28-7
AU - Badr H.
AU - Wanas N.
AU - Fayek M.
PY - 2010
SP - 444
EP - 448
DO - 10.5220/0003091904440448