Bernardy, J.-P. and Chatzikyriakidis, S. (2017). A type-theoretical system for the FraCaS test suite: Grammatical Framework meets Coq. In IWCS 2017 - 12th International Conference on Computational Semantics - Long papers.
Bernardy, J.-P. and Chatzikyriakidis, S. (2018). A cor-
pus of precise natural textual entailment problems.
https://arxiv.org/abs/1812.05813.
Bernardy, J.-P. and Lappin, S. (2018). The influence of con-
text on sentence acceptability judgements.
Bordes, A., Boureau, Y.-L., and Weston, J. (2016). Learn-
ing end-to-end goal-oriented dialog. arXiv preprint
arXiv:1605.07683.
Bowman, S. R., Angeli, G., Potts, C., and Manning, C. D.
(2015). A large annotated corpus for learning natural
language inference. arXiv preprint arXiv:1508.05326.
Bowman, S. R., Potts, C., and Manning, C. D. (2014). Re-
cursive neural networks can learn logical semantics.
arXiv preprint arXiv:1406.1827.
Cer, D., Diab, M., Agirre, E., Lopez-Gazpio, I., and Specia, L. (2017). SemEval-2017 Task 1: Semantic textual similarity multilingual and cross-lingual focused evaluation. arXiv preprint arXiv:1708.00055.
Chatzikyriakidis, S., Cooper, R., Dobnik, S., and Larsson,
S. (2017a). An overview of natural language inference
data collection: The way forward? In Proceedings
of the Computing Natural Language Inference Work-
shop.
Chatzikyriakidis, S., Lafourcade, M., Ramadier, L., and
Zarrouk, M. (2017b). Type theories and lexical net-
works: Using serious games as the basis for multi-
sorted typed systems. Journal of Language Modelling,
5(2):229–272.
Chen, Q., Zhu, X., Ling, Z., Inkpen, D., and Wei, S.
(2017a). Natural language inference with external
knowledge. CoRR, abs/1711.04289.
Chen, Q., Zhu, X., Ling, Z.-H., Wei, S., Jiang, H., and
Inkpen, D. (2017b). Enhanced LSTM for natural lan-
guage inference. In Proceedings of the 55th Annual
Meeting of the Association for Computational Lin-
guistics (Volume 1: Long Papers), pages 1657–1668.
Association for Computational Linguistics.
Chen, Z., Zhang, H., Zhang, X., and Zhao, L. Quora ques-
tion pairs.
Conneau, A., Rinott, R., Lample, G., Williams, A., Bow-
man, S. R., Schwenk, H., and Stoyanov, V. (2018).
XNLI: Evaluating cross-lingual sentence representa-
tions. In Proceedings of the 2018 Conference on Em-
pirical Methods in Natural Language Processing. As-
sociation for Computational Linguistics.
Cooper, R., Crouch, D., Van Eijck, J., Fox, C., Van Gen-
abith, J., Jaspars, J., Kamp, H., Milward, D., Pinkal,
M., Poesio, M., et al. (1996). Using the framework.
Technical report.
Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K.
(2018). BERT: Pre-training of deep bidirectional trans-
formers for language understanding. arXiv preprint
arXiv:1810.04805.
Evans, R., Saxton, D., Amos, D., Kohli, P., and Grefen-
stette, E. (2018). Can neural networks understand log-
ical entailment? arXiv preprint arXiv:1802.08535.
Pavlick, E., Rastogi, P., Ganitkevitch, J., Van Durme, B.,
and Callison-Burch, C. (2015). PPDB 2.0: Better
paraphrase ranking, fine-grained entailment relations,
word embeddings, and style classification. In Pro-
ceedings of the 53rd Annual Meeting of the Associ-
ation for Computational Linguistics and the 7th In-
ternational Joint Conference on Natural Language
Processing (Short Papers), pages 425–430, Beijing,
China. Association for Computational Linguistics.
Glockner, M., Shwartz, V., and Goldberg, Y. (2018). Breaking NLI systems with sentences that require simple lexical inferences. arXiv preprint arXiv:1805.02266.
Gururangan, S., Swayamdipta, S., Levy, O., Schwartz, R.,
Bowman, S. R., and Smith, N. A. (2018). Annota-
tion artifacts in natural language inference data. arXiv
preprint arXiv:1803.02324.
Huang, G., Liu, Z., van der Maaten, L., and Weinberger, K. Q. (2017). Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Jurczyk, T., Zhai, M., and Choi, J. D. (2016). SelQA: A
new benchmark for selection-based question answer-
ing. In Tools with Artificial Intelligence (ICTAI), 2016
IEEE 28th International Conference on, pages 820–
827. IEEE.
Kim, S., Hong, J.-H., Kang, I., and Kwak, N. (2018). Se-
mantic sentence matching with densely-connected re-
current and co-attentive information. arXiv preprint
arXiv:1805.11360.
Lafourcade, M. and Joubert, A. (2008). JeuxDeMots: un prototype ludique pour l'émergence de relations entre termes [JeuxDeMots: a playful prototype for the emergence of relations between terms]. In JADT'08: Journées internationales d'Analyse statistiques des Données Textuelles, pages 657–666.
Lafourcade, M., Joubert, A., and Le Brun, N. (2015).
Games with a Purpose (GWAPS). John Wiley & Sons.
Lake, B. M. and Baroni, M. (2017). Still not system-
atic after all these years: On the compositional skills
of sequence-to-sequence recurrent networks. arXiv
preprint arXiv:1711.00350.
Linzen, T., Dupoux, E., and Goldberg, Y. (2016). Assessing the ability of LSTMs to learn syntax-sensitive dependencies. Transactions of the Association for Computational Linguistics, 4:521–535.
Marelli, M., Menini, S., Baroni, M., Bentivogli, L.,
Bernardi, R., and Zamparelli, R. (2014). A SICK cure
for the evaluation of compositional distributional se-
mantic models. In LREC, pages 216–223.
Miller, G. A. (1995). WordNet: A lexical database for English. Communications of the ACM, 38(11):39–41.
Shalyminov, I., Eshghi, A., and Lemon, O. (2017). Chal-
lenging neural dialogue models with natural data:
Memory networks fail on incremental phenomena.
arXiv preprint arXiv:1709.07840.
Talman, A. and Chatzikyriakidis, S. (2018). Testing the
generalization power of neural network models across
NLI benchmarks. arXiv preprint arXiv:1810.09774.
Talman, A., Yli-Jyrä, A., and Tiedemann, J. (2018). Natural language inference with hierarchical BiLSTM max pooling architecture. arXiv preprint arXiv:1808.08762.
Tay, Y., Tuan, L. A., and Hui, S. C. (2017). A
compare-propagate architecture with alignment fac-
NLPinAI 2019 - Special Session on Natural Language Processing in Artificial Intelligence