according to the Law of Conformity, those with the
least semantic evolution. One could thus wonder
whether the words for which the model correctly
predicts the semantic evolution are not simply those
which display little or no semantic change. The examples
given in section 4.2 show that this is at least not
always the case, but a more systematic investigation
of individual cases is needed to get a clear picture.
Another way to answer this question would be to
explore more finely the effect that both polysemy
and word frequency may have on our results,
especially on the word representation part of our
model. These two factors have been shown to play
an important role in semantic change, and their
effects need to be studied and formalized more
explicitly. Exploring more advanced, semantically
oriented word embedding techniques, known as
sense embeddings, such as SENSEMBED (Iacobacci
et al., 2015), could help make the model less
sensitive to those factors.
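As a minimal sketch of the systematic check discussed above, one could compare the average semantic displacement of correctly and incorrectly predicted words. The code below assumes that each word comes with a sequence of embedding vectors (one per time step) and a flag stating whether the prediction was judged correct. The function names, the choice of cosine distance between the first and last vector as a displacement measure, and the toy data are illustrative assumptions, not part of the model described in this paper.

```python
import math
from statistics import mean


def cosine_distance(u, v):
    """Return 1 - cos(u, v) for two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def displacement(trajectory):
    """Cosine distance between the first and last embedding of a word."""
    return cosine_distance(trajectory[0], trajectory[-1])


def mean_displacement_by_outcome(trajectories, correct):
    """Average displacement for correctly vs. incorrectly predicted words.

    trajectories: dict mapping word -> list of vectors, one per time step
    correct: dict mapping word -> bool (hypothetical correctness flag)
    """
    groups = {True: [], False: []}
    for word, trajectory in trajectories.items():
        groups[correct[word]].append(displacement(trajectory))
    # An empty group yields None rather than raising an error.
    return {flag: (mean(vals) if vals else None) for flag, vals in groups.items()}


# Toy data, invented purely for illustration.
toy_trajectories = {
    "stable": [[1.0, 0.0], [1.0, 0.0]],
    "shifted": [[1.0, 0.0], [0.0, 1.0]],
}
toy_correct = {"stable": True, "shifted": False}
result = mean_displacement_by_outcome(toy_trajectories, toy_correct)
```

If the mean displacement of the correctly predicted group were close to zero while that of the other group were markedly higher, this would suggest that the model mainly succeeds on words that barely change. The same grouping could be repeated per frequency band or per polysemy level to examine those two factors.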
5 CONCLUSIONS
In conclusion, we describe in this paper a method
that accurately models the evolution of language and
that can to some extent predict future evolution,
based on past observations. Although our
experiment is still in its preliminary stages, we
believe it can provide linguists with a fresh
perspective on linguistic evolution. Our method makes it
possible to observe large-scale evolution in general
and semantic change in particular. It thus nicely
complements existing methods and reinforces a
falsifiability approach to linguistics. Based on the
current study, we have identified several future
research directions from a technical point of view.
The RNN model that we propose to use is rather
standard and simplistic compared to the complexity
of semantic change. We therefore intend to explore
deeper networks and to put more time and effort into
fine-tuning the model's hyper-parameters.
ACKNOWLEDGEMENTS
This work is supported by the project 2016-147
ANR OPLADYN TAP-DD2016. Our thanks go to
the anonymous reviewers for their constructive
comments.
REFERENCES
Bailey, C.-J. N. (1973) ‘Variation and linguistic theory.’
ERIC.
Bengio, Y. (1999) ‘Markovian models for sequential data’,
Neural computing surveys, 2(199), pp. 129–162.
Blei, D. M., Ng, A. Y. and Jordan, M. I. (2003) ‘Latent
Dirichlet allocation’, Journal of machine learning
research, 3(Jan), pp. 993–1022.
Bollack, L. (1903) La langue française en l’an 2003...
Bureau de La Revue.
Deerwester, S. et al. (1990) ‘Indexing by latent semantic
analysis’, Journal of the American society for
information science. Wiley Online Library, 41(6), pp.
391–407.
Dictionary, O. E. (1989) ‘Oxford English dictionary’,
Simpson, JA & Weiner, ESC.
Dubossarsky, H. et al. (2015) ‘A bottom up approach to
category mapping and meaning change.’, in
NetWordS, pp. 66–70.
Hamilton, W. L., Leskovec, J. and Jurafsky, D. (2016a)
‘Cultural shift or linguistic drift? Comparing two
computational measures of semantic change’, in
Proceedings of the Conference on Empirical Methods
in Natural Language Processing. Conference on
Empirical Methods in Natural Language Processing,
p. 2116.
Hamilton, W. L., Leskovec, J. and Jurafsky, D. (2016b)
‘Diachronic word embeddings reveal statistical laws of
semantic change’, in Proceedings of the 54th Annual
Meeting of the Association for Computational
Linguistics (Volume 1: Long Papers), pp. 1489–1501.
Hochreiter, S. and Schmidhuber, J. (1997) ‘Long short-
term memory’, Neural computation. MIT Press, 9(8),
pp. 1735–1780.
Iacobacci, I., Pilehvar, M. T. and Navigli, R. (2015)
‘Sensembed: Learning sense embeddings for word and
relational similarity’, in Proceedings of the 53rd
Annual Meeting of the Association for Computational
Linguistics and the 7th International Joint Conference
on Natural Language Processing (Volume 1: Long
Papers), pp. 95–105.
Kim, Y. et al. (2014) ‘Temporal analysis of language
through neural language models’, in Proceedings of
the ACL 2014 Workshop on Language Technologies
and Computational Social Science, pp. 61–65.
Kroch, A. S. (1989) ‘Reflexes of grammar in patterns of
language change’, Language variation and change.
Cambridge University Press, 1(3), pp. 199–244.
Kutuzov, A., Velldal, E. and Øvrelid, L. (2017) ‘Temporal
dynamics of semantic relations in word embeddings:
an application to predicting armed conflict
participants’, in Proceedings of the 2017 Conference
on Empirical Methods in Natural Language
Processing, pp. 1824–1829.
Lafferty, J., McCallum, A. and Pereira, F. C. N. (2001)
‘Conditional random fields: Probabilistic models for
segmenting and labeling sequence data’.
NLPinAI 2019 - Special Session on Natural Language Processing in Artificial Intelligence