A Keyphrase Extraction Approach for Social Tagging Systems

Felice Ferrara, Carlo Tasso

Abstract

Social tagging systems allow people to classify resources using freely chosen terms called tags. However, shifting the classification task from a set of experts to a larger, untrained set of people reduces the accuracy of the resulting classifications. The lack of control and guidelines generates noisy tags (i.e., tags without clear semantics), which degrade the precision of user-generated classifications. To address this limitation, several tools have been proposed in the literature that suggest to users tags which properly describe a given resource. In this paper we propose to suggest n-grams (named keyphrases), following the idea that sequences of two or three terms can better resolve potential ambiguities. More specifically, we identify a set of features that characterize n-grams able to describe meaningful aspects reported in Web pages. Using these features, we developed a mechanism that supports people in manually classifying Web pages by automatically suggesting meaningful keyphrases expressed in English.
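
The abstract does not list the features the authors use, so the following is only a minimal sketch of the general idea of suggesting n-grams of up to three terms as candidate keyphrases. The stopword list, the function names `candidate_ngrams` and `suggest_keyphrases`, and the score (frequency multiplied by phrase length) are all assumptions made for illustration, not the method described in the paper.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption of this sketch).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "for", "to", "and", "or",
             "by", "is", "are", "which", "with"}


def candidate_ngrams(text, max_n=3):
    """Return lowercase n-grams (1..max_n terms) that neither start nor end with a stopword."""
    candidates = []
    # Split on punctuation so that n-grams do not cross clause boundaries.
    for segment in re.split(r"[.,;:!?()\n]+", text.lower()):
        tokens = re.findall(r"[a-z][a-z0-9-]*", segment)
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                candidates.append(" ".join(gram))
    return candidates


def suggest_keyphrases(text, top_k=5, max_n=3):
    """Rank candidates by an assumed score: frequency times number of terms."""
    counts = Counter(candidate_ngrams(text, max_n))
    # Weighting by length lets a repeated multi-term phrase outrank its parts.
    scored = {phrase: count * len(phrase.split()) for phrase, count in counts.items()}
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scored, key=lambda p: (-scored[p], p))[:top_k]
```

For a page whose text repeats the phrase "social tagging systems", `suggest_keyphrases(page_text, top_k=1)` would favor the full three-term phrase over the single terms "social" or "systems", which mirrors the abstract's argument that longer sequences are less ambiguous than isolated tags.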

Paper Citation


in Harvard Style

Ferrara F. and Tasso C. (2012). A Keyphrase Extraction Approach for Social Tagging Systems. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2012) ISBN 978-989-8565-29-7, pages 362-365. DOI: 10.5220/0004144203620365


in Bibtex Style

@conference{kdir12,
author={Felice Ferrara and Carlo Tasso},
title={A Keyphrase Extraction Approach for Social Tagging Systems},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2012)},
year={2012},
pages={362-365},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004144203620365},
isbn={978-989-8565-29-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2012)
TI - A Keyphrase Extraction Approach for Social Tagging Systems
SN - 978-989-8565-29-7
AU - Ferrara F.
AU - Tasso C.
PY - 2012
SP - 362
EP - 365
DO - 10.5220/0004144203620365