lems. In Empirical Methods in Natural Language Pro-
cessing (EMNLP), pages 805–814, Copenhagen, Den-
mark.
Huang, D., Shi, S., Lin, C.-Y., Yin, J., and Ma, W.-Y. (2016). How well do computers solve math word problems? Large-scale dataset construction and evaluation. In Annual Meeting of the Association for Computational Linguistics (ACL), pages 887–896, Berlin, Germany.
Huang, D., Yao, J.-G., Lin, C.-Y., Zhou, Q., and Yin, J.
(2018). Using intermediate representations to solve
math word problems. In Annual Meeting of the As-
sociation for Computational Linguistics (ACL), pages
419–428, Melbourne, Australia.
Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. CoRR, abs/1412.6980. http://arxiv.org/abs/1412.6980.
Kushman, N., Artzi, Y., Zettlemoyer, L., and Barzilay,
R. (2014). Learning to automatically solve algebra
word problems. In Annual Meeting of the Association
for Computational Linguistics (ACL), pages 271–281,
Baltimore, Maryland.
Leszczynski, M. and Moreira, J. (2016). Machine solver for
physics word problems. In Neural Information Pro-
cessing Systems (NIPS) Intuitive Physics Workshop,
Barcelona, Spain.
Ling, W., Yogatama, D., Dyer, C., and Blunsom, P. (2017).
Program induction by rationale generation: Learn-
ing to solve and explain algebraic word problems.
In Annual Meeting of the Association for Computa-
tional Linguistics (ACL), pages 158–167, Vancouver,
Canada.
Matsuzaki, T., Ito, T., Iwane, H., Anai, H., and Arai, N. H. (2017). Semantic parsing of pre-university math problems. In Annual Meeting of the Association for Computational Linguistics (ACL), pages 2131–2141, Vancouver, Canada.
Papineni, K., Roukos, S., Ward, T., and Zhu, W.-J. (2002).
BLEU: a method for automatic evaluation of machine
translation. In Annual Meeting of the Association
for Computational Linguistics (ACL), pages 311–318,
Philadelphia, Pennsylvania.
Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., and Lerer, A. (2017). Automatic differentiation in PyTorch. In Workshop on Autodiff, Advances in Neural Information Processing Systems (NIPS), Long Beach, California.
Pennington, J., Socher, R., and Manning, C. (2014). GloVe: Global vectors for word representation. In Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543, Doha, Qatar.
R Core Team (2013). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/.
Roy, S. and Roth, D. (2015). Solving general arithmetic
word problems. In Empirical Methods in Natural Lan-
guage Processing (EMNLP), pages 1743–1752, Lis-
bon, Portugal.
Rush, A. (2018). The annotated transformer. In Proceedings
of Workshop for NLP Open Source Software (NLP-
OSS), pages 52–60, Melbourne, Australia.
See, A., Liu, P. J., and Manning, C. D. (2017). Get to
the point: Summarization with pointer-generator net-
works. In Annual Meeting of the Association for Com-
putational Linguistics (ACL), pages 1073–1083, Van-
couver, Canada.
Shi, S., Wang, Y., Lin, C.-Y., Liu, X., and Rui, Y. (2015).
Automatically solving number word problems by se-
mantic parsing and reasoning. In Empirical Meth-
ods in Natural Language Processing (EMNLP), pages
1132–1142, Lisbon, Portugal.
Suppes, P., Böttner, M., and Liang, L. (1998). Machine learning of physics word problems: A preliminary report. In Computing Natural Language, pages 141–154. Stanford University, California, USA.
Sutskever, I., Vinyals, O., and Le, Q. V. (2014). Se-
quence to sequence learning with neural networks. In
Advances in Neural Information Processing Systems
(NIPS), pages 3104–3112. Curran Associates, Inc.,
Red Hook, NY.
Tieleman, T. and Hinton, G. E. (2012). Lecture 6.5-RMSProp: Divide the gradient by a running average of its recent magnitude. Technical report, COURSERA: Neural Networks for Machine Learning, 4, 26–30.
van der Maaten, L. and Hinton, G. E. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research (JMLR), 9(Nov):2579–2605.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems (NIPS), pages 5998–6008. Curran Associates, Inc., Red Hook, NY.
Vilenius-Tuohimaa, P. M., Aunola, K., and Nurmi, J.
(2008). The association between mathematical word
problems and reading comprehension. Educational
Psychology, 28(4):409–426.
Vinyals, O., Fortunato, M., and Jaitly, N. (2015). Pointer
networks. In Advances in Neural Information Pro-
cessing Systems (NIPS), pages 2692–2700. Curran
Associates, Inc.
Wang, L., Wang, Y., Cai, D., Zhang, D., and Liu, X. (2018). Translating a math word problem to an expression tree. In Empirical Methods in Natural Language Processing (EMNLP), pages 1064–1069, Brussels, Belgium.
Wang, Y., Liu, X., and Shi, S. (2017). Deep neural solver for
math word problems. In Empirical Methods in Natu-
ral Language Processing (EMNLP), pages 845–854,
Copenhagen, Denmark.
Wong, R. (2018). Solving math word problems. Technical report, Stanford University, Palo Alto, CA. https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1184/reports/6866023.pdf.
Zhou, L., Dai, S., and Chen, L. (2015). Learn to solve alge-
bra word problems using quadratic programming. In
Empirical Methods in Natural Language Processing
(EMNLP), pages 817–822, Lisbon, Portugal.
Zhu, X., Vondrick, C., Fowlkes, C. C., and Ramanan, D.
(2015). Do we need more training data? CoRR,
abs/1503.01508. http://arxiv.org/abs/1503.01508.
NCTA 2019 - 11th International Conference on Neural Computation Theory and Applications