more, text pre-processing poses many challenges for
languages of a different typology and with richer
morphology than English. For these reasons, our future
work will further investigate the relation between text
pre-processing and clustering results, and how to make
the whole process as unsupervised and
language-independent as possible.
KDIR 2015 - 7th International Conference on Knowledge Discovery and Information Retrieval