of a target word based on its context. This idea is inspired by the fact that most sentences use a single sense per word, and that humans can determine the meaning of a word in a given context by referring to the already-known meanings of that word. For this reason, we construct a new sense of a word from its previously learned sense while taking its previous context into account. Our overall process allows us to achieve encouraging results.
4 CONCLUSIONS
In this paper, we presented a new deep model, named Bi-RAE (Bi-Recursive Auto-Encoders), to learn word meaning embeddings from scratch. The model aims to construct a dense, informative representation of a word's meaning using the sentences it occurs in as context. More precisely, it treats each sentence containing the target word as two sub-contexts (the left and right contexts around the target). Our model is based on the idea of dynamically learning an evolving semantic embedding of a word, relying on the words in its sentential context and on its latest semantic representation. This makes it possible to create semantic embeddings that capture, as accurately as possible, the meaning a word conveys in its contexts.
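To make this update step concrete, the following minimal sketch illustrates the idea under simplifying assumptions of ours: the recursive auto-encoder is collapsed to a single tied encoding step applied greedily over each sub-context, and the word's previous embedding is combined with the two context codes by a simple convex interpolation. The function names, the averaging of the left and right codes, and the weight alpha are illustrative choices, not the exact Bi-RAE formulation.

```python
# Sketch of the Bi-RAE update idea (assumptions: single tied encoder step,
# convex interpolation with the previous embedding; not the paper's exact model).
import numpy as np

rng = np.random.default_rng(0)
DIM = 50                                            # embedding size (assumed)
W = rng.normal(scale=0.1, size=(DIM, 2 * DIM))      # shared encoder weights
b = np.zeros(DIM)

def encode_context(vectors):
    """Greedily fold a list of word vectors into one context code."""
    if not vectors:
        return np.zeros(DIM)
    code = vectors[0]
    for v in vectors[1:]:
        code = np.tanh(W @ np.concatenate([code, v]) + b)
    return code

def update_meaning(prev_embedding, left_vectors, right_vectors, alpha=0.5):
    """Evolve the target word's embedding from its previous value and the
    codes of the left and right sub-contexts of the current sentence."""
    left_code = encode_context(left_vectors)
    right_code = encode_context(right_vectors)
    context_code = 0.5 * (left_code + right_code)
    return alpha * prev_embedding + (1.0 - alpha) * context_code

# Toy usage: three random context words on each side of the target word.
prev = rng.normal(size=DIM)
left = [rng.normal(size=DIM) for _ in range(3)]
right = [rng.normal(size=DIM) for _ in range(3)]
new_embedding = update_meaning(prev, left, right)
```

In this reading, each sentence moves the target word's embedding toward a code built from its left and right contexts, so the representation evolves as more sentential contexts of the word are processed.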
We conducted experiments on a very challenging NLP task, semantic similarity. The experimental results showed the effectiveness of our unsupervised model compared to well-known methods that model word semantic embeddings with either single or multiple prototypes. In future work, we plan to couple our proposed model with an attention mechanism to further improve the learned embeddings.