Incremental TextRank - Automatic Keyword Extraction for Text Streams

Rui Portocarrero Sarmento, Mário Cordeiro, Pavel Brazdil, João Gama

Abstract

Text Mining and NLP techniques are a hot topic nowadays. Researchers strive to develop new and faster algorithms to cope with larger amounts of data. In particular, interest in text data analysis has been increasing due to the growth of social network media. Given this, the development of new algorithms and/or the upgrade of existing ones is now a crucial task to deal with text mining problems under this new scenario. In this paper, we present an update to TextRank, a well-known algorithm for automatic keyword extraction from text, adapted to deal with streams of text. In addition, we present results for this implementation and compare them with the batch version. The major improvement is lower computation time for processing the same text data in a streaming environment, in both sliding window and incremental setups. The speedups obtained in the experimental results are significant. Therefore, the approach is considered valid and useful to the research community.


Paper Citation


in Harvard Style

Sarmento R., Cordeiro M., Brazdil P. and Gama J. (2018). Incremental TextRank - Automatic Keyword Extraction for Text Streams. In Proceedings of the 20th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-298-1, pages 363-370. DOI: 10.5220/0006639703630370


in Bibtex Style

@conference{iceis18,
author={Rui Portocarrero Sarmento and Mário Cordeiro and Pavel Brazdil and João Gama},
title={Incremental TextRank - Automatic Keyword Extraction for Text Streams},
booktitle={Proceedings of the 20th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2018},
pages={363-370},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006639703630370},
isbn={978-989-758-298-1},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 20th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Incremental TextRank - Automatic Keyword Extraction for Text Streams
SN - 978-989-758-298-1
AU - Sarmento R.
AU - Cordeiro M.
AU - Brazdil P.
AU - Gama J.
PY - 2018
SP - 363
EP - 370
DO - 10.5220/0006639703630370