TOPIC DETECTION IN BIBLIOGRAPHIC DATABASES

Maria Biryukov

Abstract

Detection of research topics in scientific publications has attracted considerable attention in the past few years. In this paper we introduce and compare several topic-ranking metrics that make it possible to distinguish between general and focused topic terms. We use DBLP as a testbed for our experiments.


Paper Citation


in Harvard Style

Biryukov M. (2009). TOPIC DETECTION IN BIBLIOGRAPHIC DATABASES. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009), ISBN 978-989-674-011-5, pages 236-242. DOI: 10.5220/0002332702360242


in Bibtex Style

@conference{kdir09,
author={Maria Biryukov},
title={TOPIC DETECTION IN BIBLIOGRAPHIC DATABASES},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009)},
year={2009},
pages={236-242},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002332702360242},
isbn={978-989-674-011-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009)
TI - TOPIC DETECTION IN BIBLIOGRAPHIC DATABASES
SN - 978-989-674-011-5
AU - Biryukov M.
PY - 2009
SP - 236
EP - 242
DO - 10.5220/0002332702360242