SUMMARIZING DOCUMENTS USING FRACTAL TECHNIQUES

M. Dolores Ruiz, Antonio B. Bailón

Abstract

Every day we search for new information on the web and find many documents, each containing pages with a great amount of information. There is a high demand for fast and accurate automatic summarization. Many methods have been used for automatic extraction, but most of them do not take the hierarchical structure of documents into account. A novel method that uses the structure of the document was introduced by Yang and Wang in 2004. It is based on a fractal view method for controlling the information displayed. We explain its drawbacks and address them using the new concept of the fractal dimension of a text document, which achieves a better diversification of the extracted sentences and improves the performance of the method.

References

  1. Buyukkokten O., Garcia-Molina H., P. A. (2001). Seeing the whole in parts: Text summarization for web browsing on handheld devices. In 10th International WWW Conference, Hong Kong.
  2. Camastra F., V. A. (2002). Estimating the intrinsic dimension of data with a fractal-based method. IEEE Transactions on Pattern Analysis and Machine Intelligence.
  3. Daume III H., M. D. (2005). Induction of word and phrase alignments for automatic document summarization. Computational Linguistics, 31 (4):505-530.
  4. Edmundson, H. P. (1969). New methods in automatic extracting. Journal of the Association for Computing Machinery, 16 (2):264-285.
  5. Goldstein J., Kantrowitz M., M. V. C. J. (1999). Summarizing text documents: sentence selection and evaluation metrics. pages 121-128.
  6. Grassberger P., P. I. (1983). Measuring the strangeness of strange attractors. pages 189-208.
  7. Guerrini G., Mesiti M., S. I. (2006). An overview of similarity measures for clustering XML documents. Chapter in Athena Vakali and George Pallis (eds.).
  8. Koike, H. (1995). Fractal views: a fractal-based method for controlling information display. ACM Transactions on Information Systems, 13 (3):305-323.
  9. Kraft, R. (1995). Fractals and dimensions. HTTP-Protocol at www.weihenstephan.de.
  10. Liebovitch, L. S., T. T. (1989). A fast algorithm to determine fractal dimensions by box counting. Physics Letters A, 141 (8,9):386-390.
  11. Luhn, H. P. (1958). The automatic creation of literature abstracts. IBM Journal, pages 159-165.
  12. Mandelbrot, B. B. (1986). Self-affine fractal sets. Pietronero L. & Tosatti E. (eds.): Fractals in Physics, Amsterdam.
  13. Morris G., Kasper G. M., A. D. A. (1992). The effect and limitation of automated text condensing on reading comprehension performance. Information Systems Research, pages 17-35.
  14. Ruiz M. D., Bailón A. B. (2006). Fractal dimension of text documents: Application in fractal summarization. In IADIS International Conference WWW/Internet, volume 2, pages 349-353.
  15. Yang C. C., Chen H., H. K. (2003a). Visualization of large category map for internet browsing. Decision Support Systems, 35:89-102.
  16. Yang C. C., Wang F. L. (2003b). Fractal summarization for mobile devices to access large documents on the web. In 12th International WWW Conference, Budapest, Hungary.
  17. Yang C. C., Wang F. L. (2003c). Fractal summarization: Summarization based on fractal theory. In SIGIR 2003, Toronto, Canada.
  18. Yang C. C., Wang F. L. (2004). A relevance feedback model for fractal summarization. Lecture Notes in Computer Science, 3334:368-377.


Paper Citation


in Harvard Style

Ruiz, M. D. and Bailón, A. B. (2007). SUMMARIZING DOCUMENTS USING FRACTAL TECHNIQUES. In Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 4: ICEIS, ISBN 978-972-8865-91-7, pages 26-33. DOI: 10.5220/0002363300260033


in Bibtex Style

@conference{iceis07,
author={M. Dolores Ruiz and Antonio B. Bailón},
title={SUMMARIZING DOCUMENTS USING FRACTAL TECHNIQUES},
booktitle={Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 4: ICEIS},
year={2007},
pages={26-33},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002363300260033},
isbn={978-972-8865-91-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Ninth International Conference on Enterprise Information Systems - Volume 4: ICEIS
TI - SUMMARIZING DOCUMENTS USING FRACTAL TECHNIQUES
SN - 978-972-8865-91-7
AU - Ruiz, M. Dolores
AU - Bailón, Antonio B.
PY - 2007
SP - 26
EP - 33
DO - 10.5220/0002363300260033