proposed in previous studies to search for papers
relevant to a given starting scientific paper. However,
none of these could effectively help users find
papers that, for example, employ the same theoretical
model or use a different experimental data set.
In this paper, we focus on citation-reason
analysis, which has previously been conducted
for other purposes, and propose a method
for classifying and organizing papers based on
citation-reasons. Specifically, we established a paper
database from a real paper corpus, predicted the
citation-reasons between papers using a
machine-learning-based classifier trained on an
extensive hand-annotated set of citations, and finally
visualized the resulting information in a practical
system that helps users find the papers that best
match their needs more efficiently.
With our system, users can expect more accurate
search results for context-specific demands,
such as those raised in Section 1, as long
as they follow the appropriate citation-reasons
between papers.
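To illustrate what following citation-reasons between papers could look like, the minimal sketch below indexes citation links that already carry a predicted reason and returns the papers connected to a starting paper by one chosen reason. The label names, the paper identifiers, and the helper functions `build_index` and `related_papers` are hypothetical and chosen only for illustration; they are not the label set, data, or implementation of the system described in this paper.

```python
from collections import defaultdict

# Hypothetical citation-reason labels (assumed for illustration only).
SAME_MODEL = "same_theoretical_model"
DIFFERENT_DATA = "different_data_set"
BACKGROUND = "background"

# Toy data: (citing paper, cited paper, predicted citation-reason).
CITATIONS = [
    ("P2", "P1", SAME_MODEL),
    ("P3", "P1", DIFFERENT_DATA),
    ("P4", "P1", BACKGROUND),
    ("P1", "P0", SAME_MODEL),
]


def build_index(citations):
    """Index labeled citation links by paper, in both directions."""
    index = defaultdict(list)
    for citing, cited, reason in citations:
        index[citing].append((cited, reason))
        index[cited].append((citing, reason))
    return index


def related_papers(index, start, reason):
    """Return the papers linked to `start` by a citation with `reason`."""
    return sorted({other for other, r in index.get(start, []) if r == reason})


index = build_index(CITATIONS)
print(related_papers(index, "P1", SAME_MODEL))      # ['P0', 'P2']
print(related_papers(index, "P1", DIFFERENT_DATA))  # ['P3']
```

Treating the links as undirected here is a simplifying assumption: a user looking for papers that employ the same model is presumably interested in both the papers that cite the starting paper and the papers it cites.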
To the best of our knowledge, this study is the
first attempt to employ citation-reason analysis in
paper acquisition. In addition, compared with previous
studies of citation-reason analysis, our approach
defines different features for machine learning and uses
a more flexible contextual scope and a much larger
training data set. Because we target the practical use
of our method in paper acquisition rather than a
sociological study, we have also built a database that
is larger than those in previous work, covers a longer
time span, and draws on an open-access data source.
Evaluation results using the practical system
have shown the effectiveness of our approach in
paper acquisition. However, we believe that
improvements could be made with more powerful
machine-learning approaches, such as Support
Vector Machines or Conditional Random Fields.
In addition, more context-specific experiments should be
conducted to show exactly how effective our idea is
in helping to focus the search for relevant papers.