ACKNOWLEDGEMENTS
The author would like to thank the members of the
Semantic Engineering staff at Agilex Technologies
who participated in the reported testing. He would
also like to thank the anonymous referees, who made
suggestions that significantly improved the quality
of the paper.
REFERENCES
Bradford, R., 2009. Comparability of LSI and human
judgment in text analysis tasks. Proceedings, Applied
Computing Conference, Athens, Greece, 359-366.
Bradford, R., 2011. Implementation techniques for large-
scale latent semantic indexing applications.
Proceedings, ACM Conference on Information and
Knowledge Management, Glasgow, Scotland,
October, 2011.
Broschart, A., Berberich, K., Schenkel, R., 2010.
Evaluating the potential of explicit phrases for
retrieval quality. Proceedings, ECIR 2010, 623-626.
Dumais, S., 2004. Latent semantic analysis. Annual
Review of Information Science and Technology
(ARIST), vol. 38, Chapter 4.
Dumais, S., et al, 1988. Using latent semantic analysis to
improve access to textual information. Proceedings,
CHI 88, June 15-19, 1988, Washington, DC, 281-285.
Fagan, J., 1989. The effectiveness of a nonsyntactic
approach to automatic phrase indexing for document
retrieval. JASIS, 40(2), 115-132.
Furnas, G., et al, 1988. Information retrieval using a
singular value decomposition model of latent semantic
structure. Proceedings 11th
SIGIR, 465-480.
Grönqvist, L., 2005A. An evaluation of bi- and tri-gram
enriched latent semantic vector models. ELECTRA
Workshop, Methodologies and Evaluation of Lexical
Cohesion Techniques in Real-world Applications,
Salvador, Brazil, 19 August, 2005, 57–62.
Grönqvist, L., 2005B. Evaluating latent semantic vector
models with synonym tests and document retrieval.
ELECTRA Workshop, Methodologies and Evaluation
of Lexical Cohesion Techniques in Real-world
Applications, Salvador, Brazil, 19 August, 2005, 86–88.
Grönqvist, L., 2006. Exploring Latent Semantic Vector
Models Enriched With N-grams. PhD Thesis, Växjö
University, Sweden.
Harman, D., 2005. The TREC ad hoc experiments. In
TREC: Experiment and Evaluation in Information
Retrieval, Voorhees and Harman, eds, MIT Press.
Hulth, A., 2004. Combining Machine Learning and
Natural Language Processing for Automatic Keyword
Extraction. Thesis, Stockholm University, April,
2004.
Jiang, M., et al, 2004. Choosing the right bigrams for
information retrieval. Proceedings of the Meeting of
the International Federation of Classification
Societies, 2004, 531-540.
Kim, H-R., and Chan, P., 2004. Identifying variable-
length meaningful phrases with correlation functions.
Proceedings, ICTAI, 2004, 16th IEEE International
Conference on Tools with Artificial Intelligence, 30-38.
Kraaij, W., and Pohlmann, R., 1998. Comparing the
effects of syntactic vs. statistical phrase indexing
strategies for Dutch. Proceedings, ECDL 98, LNCS
1513, 605-617.
Lizza, M., and Sartoretto, F., 2001. A comparative
analysis of LSI strategies. In Computational
Information Retrieval, M. Berry ed., SIAM, 171-181.
Manning, C., Raghavan, P., and Schütze, H., 2008.
Introduction to Information Retrieval, Cambridge
University Press, 36.
Metzler, D., Strohman, T., Croft, W., 2006. Indri at TREC
2006: lessons learned from three terabyte tracks.
Proceedings, Fifteenth Text REtrieval Conference,
NIST Special Publication SP 500-272.
Mitra, M., et al, 1997. An analysis of statistical and
syntactic phrases. Proceedings of RIAO 97, Montreal,
Canada, 200-214.
Nakov, P., Valchanova, E., and Angelova, G., 2003.
Towards deeper understanding of the LSA
performance. In Proceedings, Recent Advances in
Natural Language Processing, 2003, 311-318.
Ogawa, Y., et al, 2000. Structuring and expanding queries
in the probabilistic model. Proceedings, Ninth Text
REtrieval Conference (TREC-9), NIST Special
Publication 500-249, 427-435.
Olney, A., 2009. Generalizing latent semantic analysis. In
Proceedings, 2009 IEEE International Conference on
Semantic Computing, 40-46.
Salton, G., Yang, C., Yu, T., 1975. A theory of term
importance in automatic text analysis. JASIS, 26(1),
33-44.
Turpin, A., and Moffat, A., 1999. Statistical phrases for
vector-space information retrieval. Proceedings,
SIGIR 99, Berkeley, CA, August 1999, 309-310.
Wiemer-Hastings, P., 2000. Adding syntactic information
to LSA. In Proceedings of the 22nd Annual Meeting
of the Cognitive Science Society.
Wu, H., and Gunopulos, D., 2002. Evaluating the utility
of statistical phrases and latent semantic indexing for
text classification. Proceedings ICDM, 713-716.
Zhai, C., et al, 1996. Evaluation of syntactic phrase
indexing – CLARIT NLP track report. In
Proceedings, Fifth Text REtrieval Conference, NIST
Special Publication 500-238, 347-358.
KDIR 2014 - International Conference on Knowledge Discovery and Information Retrieval