same word type in different topics often have different senses, while instances in the same topic often refer to the same thing. Since CO-SLDA can jointly infer topics and word senses, instances of the same word in the same topic are more likely to be assigned the same sense, while instances in different topics are likely to be assigned different senses. As a result, word senses are better identified. (2) Using topics as pseudo feedback helps induce topic-specific senses for the target words. For example, the word election has only one sense in general. However, in the TDT4 data set, topics are labeled at a finer granularity. For example, the following two sentences are labeled as belonging to two different topics because the elections take place in different countries: Ilyescu Wins Romanian Elections and Ghana Gets New Democratically Elected President. With the joint inference of topic and sense, we can induce two senses for the word election, i.e., election#1 and election#2, related to the electing processes in Romania and Ghana respectively. By incorporating these topic-specific
senses, election with context word Romania is identi-
fied as election#1 and more likely to be assigned topic
z
1
while election with context word Ghana is identi-
fied as election#2 and more likely to be assigned z
2
.
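To make this collaborative effect concrete, the following is a minimal, hypothetical sketch (in Python with NumPy, using invented toy counts; it is not the actual CO-SLDA sampler) of how jointly scoring (topic, sense) pairs lets a context word such as Romania or Ghana pull an occurrence of election toward the matching sense and topic.

# Minimal sketch, NOT the authors' CO-SLDA sampler: toy counts and a single
# joint (topic, sense) draw, illustrating how the context word pulls the
# sense and the sense in turn pulls the topic.
import numpy as np

rng = np.random.default_rng(0)
topics = ["z_1 (Romania)", "z_2 (Ghana)"]
senses = ["election#1", "election#2"]

# Hypothetical counts assumed to have been accumulated during inference.
sense_given_topic = np.array([[8.0, 1.0],   # z_1 mostly uses election#1
                              [1.0, 9.0]])  # z_2 mostly uses election#2
context_given_sense = {"Romania": np.array([10.0, 1.0]),
                       "Ghana":   np.array([1.0, 12.0])}
topic_given_doc = np.array([0.5, 0.5])      # the document is undecided so far

def joint_assignment(context_word, alpha=0.1):
    # Score every (topic, sense) pair as
    # p(topic | doc) * p(sense | topic) * p(context word | sense), then sample.
    ctx = context_given_sense[context_word] + alpha
    scores = topic_given_doc[:, None] * (sense_given_topic + alpha) * ctx[None, :]
    probs = scores / scores.sum()
    t, s = np.unravel_index(rng.choice(probs.size, p=probs.ravel()), probs.shape)
    return topics[t], senses[s]

print(joint_assignment("Romania"))   # usually ('z_1 (Romania)', 'election#1')
print(joint_assignment("Ghana"))     # usually ('z_2 (Ghana)', 'election#2')

With these assumed counts, the Romania context makes election#1, and hence z_1, dominate the joint score, mirroring the behaviour described above.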
4 RELATED WORK
In the Vector Space Model (VSM), terms are assumed to be independent of each other, and the semantic relations between terms are ignored.
Recently, models have been proposed to represent documents in a semantic concept space using lexical ontologies such as WordNet or Wikipedia (Hotho et al., 2003; Gabrilovich and Markovitch, 2007; Huang and Kuo, 2010). However, such lexical ontologies are difficult to construct and their coverage can be limited. In contrast, topic models are used as an alternative for discovering a latent semantic space in corpora based on the per-topic word distributions. LDA (Blei et al., 2003), a classic topic model, identifies the topics of documents by exploiting word co-occurrences.
Various topic models based on the LDA framework have been developed (Wang et al., 2007). However, those models all employ the surface word as the basic unit of a document, and therefore lack a word sense interpretation of topics. Some work attempts to integrate word semantics from lexical resources into topic models (Boyd-Graber et al., 2007; Chemudugunta et al., 2008; Guo and Diab, 2011). In contrast, our models are fully unsupervised and do not rely on any external semantic resources, which makes them especially applicable to resource-poor languages and domains.
5 CONCLUSIONS
In this paper, we propose to represent topics with distributions over word senses. To achieve this in a fully unsupervised manner without relying on any external resources, we model the word sense as a latent variable and induce it from corpora via WSI. We design several models for this purpose.
Empirical results verify that the word senses induced from corpora can facilitate the LDA model in document clustering. Specifically, we find that the joint inference model (i.e., CO-SLDA) outperforms the standalone model (SA-SLDA) because the estimation of senses and topics can be collaboratively improved.
In future work, we will extend the proposed topic models to cross-lingual information retrieval tasks. We believe that word senses induced from multilingual documents will be helpful for cross-lingual topic modeling.
ACKNOWLEDGEMENTS
This work is supported by NSFC (61272233). We thank the reviewers for their valuable comments.
REFERENCES
Blei, D. M., Ng, A. Y., and Jordan, M. I. (2003). Latent Dirichlet allocation. J. Mach. Learn. Res., 3:993–1022.
Boyd-Graber, J. L., Blei, D. M., and Zhu, X. (2007). A topic
model for word sense disambiguation. In EMNLP-
CoNLL, pages 1024–1033. ACL.
Chemudugunta, C., Smyth, P., and Steyvers, M. (2008).
Combining concept hierarchies and statistical topic
models. In Proceedings of the 17th ACM conference
on Information and knowledge management, CIKM
’08, pages 1469–1470, New York, NY, USA. ACM.
Dietz, L., Bickel, S., and Scheffer, T. (2007). Unsupervised prediction of citation influences. In Proceedings of the 24th International Conference on Machine Learning, pages 233–240.
Gabrilovich, E. and Markovitch, S. (2007). Computing semantic relatedness using Wikipedia-based explicit semantic analysis. In Proceedings of the 20th International Joint Conference on Artificial Intelligence, IJCAI'07, pages 1606–1611, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.
Griffiths, T. L. and Steyvers, M. (2004). Finding scientific
topics. PNAS, 101(suppl. 1):5228–5235.
Guo, W. and Diab, M. (2011). Semantic topic models:
combining word distributional statistics and dictio-
nary definitions. In Proceedings of the Conference on
Empirical Methods in Natural Language Processing,