Overview of the 1st international competition on
plagiarism detection. In 3rd PAN Workshop.
Uncovering Plagiarism, Authorship and Social
Software Misuse (p. 1).
Forsyth, R. S., & Sharoff, S. (2014). Document
dissimilarity within and across languages: A
benchmarking study. Literary and Linguistic
Computing, 29(1), 6-22.
Gollub, T., Potthast, M., Beyer, A., Busse, M., Rangel, F.,
Rosso, P., ... & Stein, B. (2013). Recent trends in digital
text forensics and its evaluation. In Information Access
Evaluation. Multilinguality, Multimodality, and
Visualization (pp. 282-302). Springer Berlin
Heidelberg.
Gomaa, W. H., & Fahmy, A. A. (2013). A survey of text
similarity approaches. International Journal of
Computer Applications, 68(13), 13-18.
Hiemstra, D., & De Vries, A. P. (2000). Relating the new
language models of information retrieval to the
traditional retrieval models.
Hoad, T. C., & Zobel, J. (2003). Methods for identifying
versioned and plagiarized documents. Journal of the
American Society for Information Science and
Technology, 54(3), 203-215.
Huang, A. (2008, April). Similarity measures for text
document clustering. In Proceedings of the sixth New
Zealand computer science research student conference
(NZCSRSC2008), Christchurch, New Zealand (pp. 49-56).
Johnson, R., & Zhang, T. (2014). Effective use of word
order for text categorization with convolutional neural
networks. arXiv preprint arXiv:1412.1058.
Jones, W. P., & Furnas, G. W. (1987). Pictures of
relevance: A geometric analysis of similarity
measures. Journal of the American Society for
Information Science, 38(6), 420-442.
Ljubešić, N., Boras, D., Bakarić, N., & Njavro, J. (2008,
June). Comparing measures of semantic similarity. In
Information Technology Interfaces, 2008. ITI 2008.
30th International Conference on (pp. 675-682). IEEE.
Manning, C. D., Raghavan, P., & Schütze, H. (2008).
Introduction to information retrieval (Vol. 1, p. 496).
Cambridge: Cambridge University Press.
Mihalcea, R., Corley, C., & Strapparava, C. (2006, July).
Corpus-based and knowledge-based measures of text
semantic similarity. In AAAI (Vol. 6, pp. 775-780).
Oakes, M. P. (2014). Literary Detective Work on the
Computer (Vol. 12). John Benjamins Publishing
Company.
Polettini, N. (2004). The vector space model in
information retrieval-term weighting problem.
Entropy, 1-9.
Ponte, J. M., & Croft, W. B. (1998, August). A language
modeling approach to information retrieval. In
Proceedings of the 21st annual international ACM
SIGIR conference on Research and development in
information retrieval (pp. 275-281). ACM.
Robertson, S. E. (1977). The probability ranking principle
in IR. Journal of documentation, 33(4), 294-304.
Robertson, S. (2004). Understanding inverse document
frequency: on theoretical arguments for IDF. Journal
of documentation, 60(5), 503-520.
Salton, G., Wong, A., & Yang, C. S. (1975). A vector
space model for automatic indexing. Communications
of the ACM, 18(11), 613-620.
Salton, G., & Buckley, C. (1988). Term-weighting
approaches in automatic text retrieval. Information
processing & management, 24(5), 513-523.
Singhal, A., Salton, G., Mitra, M., & Buckley, C. (1996A).
Document length normalization. Information
Processing & Management, 32(5), 619-633.
Singhal, A., Buckley, C., & Mitra, M. (1996B). Pivoted
document length normalization. In Proceedings of the
19th annual international ACM SIGIR conference on
Research and development in information
retrieval (pp. 21-29). ACM.
Sparck Jones, K. (1972). A statistical interpretation of
term specificity and its application in retrieval. Journal
of documentation, 28(1), 11-21.
Strehl, A., Ghosh, J., & Mooney, R. (2000, July). Impact
of similarity measures on web-page clustering.
In Workshop on Artificial Intelligence for Web Search
(AAAI 2000) (pp. 58-64).
Turney, P. (2001). Mining the web for synonyms: PMI-IR
versus LSA on TOEFL.
Turney, P., & Pantel, P. (2010). From frequency to
meaning: Vector space models of semantics. Journal
of artificial intelligence research, 37(1), 141-188.
White, R. W., & Jose, J. M. (2004, July). A study of topic
similarity measures. In Proceedings of the 27th annual
international ACM SIGIR conference on Research and
development in information retrieval (pp. 520-521).
ACM.
Zhang, J., & Korfhage, R. R. (1999). A distance and angle
similarity measure method. Journal of the American
Society for Information Science, 50(9), 772-778.