Development of Domains and Keyphrases along Years

Yaakov HaCohen-Kerner, Meir Hassan

Abstract

This paper presents a methodology (including a detailed algorithm, various development concepts and measures, and stopword lists) for measuring the development of domains and keyphrases over the years. The examined corpus contains 1020 articles that were accepted for full presentation at PACLIC over the last 18 years. The experimental results for 5 chosen domains (digital humanities, language resources, machine translation, sentiment analysis and opinion mining, and social media) suggest that development trends of domains and keyphrases can be measured efficiently. Top bigrams and trigrams were found to be efficient at identifying general trends in NLP domains.


Paper Citation


in Harvard Style

HaCohen-Kerner Y. and Hassan M. (2016). Development of Domains and Keyphrases along Years. In Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016) ISBN 978-989-758-203-5, pages 375-383. DOI: 10.5220/0006077303750383


in Bibtex Style

@conference{kdir16,
author={Yaakov HaCohen-Kerner and Meir Hassan},
title={Development of Domains and Keyphrases along Years},
booktitle={Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016)},
year={2016},
pages={375-383},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006077303750383},
isbn={978-989-758-203-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016)
TI - Development of Domains and Keyphrases along Years
SN - 978-989-758-203-5
AU - HaCohen-Kerner Y.
AU - Hassan M.
PY - 2016
SP - 375
EP - 383
DO - 10.5220/0006077303750383