Figure 4: Unsupervised clustering results using the $R_1$ and $R_2$ representations. The vertical axis is accuracy; the horizontal axis is % reduction of dimensions.
For the various words, the 10% reduction level corresponds to a dimensionality of 856 (hard), 494 (interest), 1297 (line) and 1304 (serve). From these reduced SVDs, the thereby defined $R_1$ and $R_2$ versions of the context vectors were then used. Figure 4 gives the results (the 60-40 split was made randomly and repeated 4 times, with the figure summarising the outcomes over these splits).
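As an illustration only, the following minimal Python sketch mirrors the shape of this experiment: truncate an SVD at a given reduction level, form the two candidate representations, and score a clustering of the context vectors. It assumes, as one common convention and not necessarily what equations (1)-(4) specify, that $R_1$ keeps the singular-value weighting while $R_2$ drops it; the data, the three-sense setup, the majority-vote scoring and the helper name cluster_accuracy are all hypothetical stand-ins (in particular, the repeated 60-40 splitting is not reproduced here).

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_accuracy(X, senses, n_clusters, seed=0):
    """Cluster the rows of X and score against gold senses by mapping
    each cluster to its majority sense (a simple stand-in for whatever
    scoring rule the experiment actually used)."""
    pred = KMeans(n_clusters=n_clusters, n_init=10,
                  random_state=seed).fit_predict(X)
    hits = 0
    for c in range(n_clusters):
        members = senses[pred == c]
        if len(members):
            hits += np.bincount(members).max()
    return hits / len(senses)

# Hypothetical stand-in for one ambiguous word: a context-by-feature
# matrix A and gold sense labels (random here, purely illustrative).
rng = np.random.default_rng(0)
A = rng.random((200, 50))
senses = rng.integers(0, 3, size=200)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
for pct in (10, 30, 50):                 # % of dimensions removed
    k = max(1, round(A.shape[1] * (1 - pct / 100)))
    R1 = U[:, :k] * s[:k]   # assumed R1: singular-value-weighted rows
    R2 = U[:, :k]           # assumed R2: unweighted rows
    print(pct, cluster_accuracy(R1, senses, 3),
          cluster_accuracy(R2, senses, 3))
```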
This confirms the indications from the tiny 2-dimensional HCI/Graph example, namely that the outcomes under the $R_1$ and $R_2$ representations are not identical. In this word-clustering setting, the outcomes with the $R_1$ and $R_2$ representations differ at every level of reduction, and there is a persistent pattern of the $R_1$ representation giving better outcomes than the $R_2$ representation.
5 CONCLUSIONS
We have shown that there is a discrepancy amongst researchers concerning the precise dimensionality-reduction technique to which they give the name 'LSA'. The $R_1$ representation is defined by equations (1) and (3), whilst the $R_2$ representation is defined by (2) and (4); these alternatives give a different geometry to the space of reduced representations, manifesting itself in different nearest-neighbour sets. We showed that, unsurprisingly, this can lead to different system outcomes according to which representation, $R_1$ or $R_2$, is adopted in a given system.
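To make the geometric point concrete, here is a minimal numpy sketch, again under the assumption (not drawn from equations (1)-(4), which lie outside this excerpt) that $R_1$ retains the $S_k$ weighting and $R_2$ does not: both yield the same rank-$k$ reconstruction $\hat{A}$, yet inter-document distances, and hence nearest-neighbour sets, can disagree.

```python
import numpy as np

# Toy term-document matrix (terms x documents); values illustrative only.
rng = np.random.default_rng(1)
A = rng.random((8, 5))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
U_k, S_k, Vt_k = U[:, :k], np.diag(s[:k]), Vt[:k, :]

R1 = S_k @ Vt_k   # assumed R1: documents as columns of S_k V_k'
R2 = Vt_k         # assumed R2: documents as columns of V_k'

# The rank-k reconstruction is the same under either convention ...
A_hat = U_k @ S_k @ Vt_k

# ... but distances between document vectors differ, so nearest
# neighbours may too.
def nearest(X, j):
    d = np.linalg.norm(X - X[:, [j]], axis=0)
    d[j] = np.inf                 # exclude the document itself
    return int(np.argmin(d))

print([nearest(R1, j) for j in range(A.shape[1])])
print([nearest(R2, j) for j in range(A.shape[1])])
```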
We have not argued for one of these representations over the other. Whilst Theorem 2 establishes that $\hat{A} = U_k \times S_k \times V_k'$ is the optimum rank-$k$ approximation of $A$ in the sense of minimising the sum of squared differences between corresponding matrix positions, there is a good deal of conceptual clear water between this and any consequent 'optimality' of a particular SVD-based reduction of document vectors in a particular system. This is testified to by the range of attempts that have been made to give a theoretical justification for an observed system 'optimality' of a given deployed SVD-based reduction. Therefore the $R_1$ and $R_2$ alternatives are as theoretically motivated (or unmotivated) as each other, at least at first glance, and there is some merit in putting both to the test empirically. What is beyond doubt, though, is that the $R_1$ and $R_2$ alternatives are genuinely different and will not always give the same empirical outcomes.
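As a quick numerical check of the Frobenius-norm sense of Theorem 2 (the Eckart-Young result), the sketch below confirms that the truncated SVD's squared error equals the sum of the squared discarded singular values, and that randomly generated rank-$k$ competitors never do better; the matrix and seeds are of course arbitrary.

```python
import numpy as np

# Numerical illustration of Theorem 2 (Eckart-Young): the truncated SVD
# A_hat = U_k S_k V_k' minimises the sum of squared entry-wise
# differences over all rank-k matrices.
rng = np.random.default_rng(2)
A = rng.random((6, 4))
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
A_hat = (U[:, :k] * s[:k]) @ Vt[:k, :]
best = np.sum((A - A_hat) ** 2)

# The optimal squared error equals the sum of squared discarded
# singular values.
assert np.isclose(best, np.sum(s[k:] ** 2))

# No randomly generated rank-k competitor beats it.
for _ in range(1000):
    B = rng.normal(size=(6, k)) @ rng.normal(size=(k, 4))
    assert np.sum((A - B) ** 2) >= best
```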