a character-level skip-gram model or a Continuous
Bag-of-Characters model inspired by the Continuous
Bag-of-Words (CBoW) model.
ACKNOWLEDGEMENTS
This work was supported by JSPS KAKENHI Grant
Numbers 26330081, 26870201, and 16K12411. We use
the Rakuten dataset, which is provided by the National
Institute of Informatics (NII) under the contract
between NII and Rakuten, Inc. We would like
to thank NII and Rakuten, Inc.
We use the Python library Keras (Chollet, 2015) to
build the neural networks, with the Python library
Theano (Theano Development Team, 2016) as the
backend engine of Keras. We would also like to thank
the developers of Theano and Keras.
REFERENCES
Agrawal, P., Girshick, R., and Malik, J. (2014). Analyzing
the Performance of Multilayer Neural Networks for
Object Recognition. In the Proceedings of the 13th
European Conference on Computer Vision (ECCV),
ECCV ’14, pages 329–344.
Bengio, Y., Boulanger-Lewandowski, N., and Pascanu, R.
(2013). Advances in Optimizing Recurrent Networks.
In the Proceedings of the 2013 IEEE International
Conference on Acoustics, Speech and Signal Process-
ing (ICASSP), ICASSP ’13.
Chollet, F. (2015). Keras.
Del Corso, G. M., Gullí, A., and Romani, F. (2005).
Ranking a Stream of News. In the Proceedings of the
14th International Conference on World Wide Web
(WWW), WWW ’05, pages 97–106.
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-
Fei, L. (2009). ImageNet: A Large-Scale Hierarchical
Image Database. In the Proceedings of the 2009 IEEE
Conference on Computer Vision and Pattern Recogni-
tion (CVPR), CVPR ’09.
dos Santos, C. and Gatti, M. (2014). Deep Convolutional
Neural Networks for Sentiment Analysis of Short
Texts. In the Proceedings of the 25th International
Conference on Computational Linguistics (COLING),
COLING ’14, pages 69–78.
dos Santos, C. N., Xiang, B., and Zhou, B. (2015). Classi-
fying Relations by Ranking with Convolutional Neu-
ral Networks. In the Proceedings of the 53rd Annual
Meeting of the Association for Computational Lin-
guistics (ACL), ACL ’15, pages 626–634.
Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014).
Rich Feature Hierarchies for Accurate Object Detec-
tion and Semantic Segmentation. In the Proceedings
of the 2014 IEEE Conference on Computer Vision and
Pattern Recognition (CVPR), CVPR ’14.
Glorot, X. and Bengio, Y. (2010). Understanding the Diffi-
culty of Training Deep Feedforward Neural Networks.
In the Proceedings of the 13th International Conference
on Artificial Intelligence and Statistics (AISTATS),
AISTATS ’10.
Glorot, X., Bordes, A., and Bengio, Y. (2011). Domain
Adaptation for Large-scale Sentiment Classification:
A Deep Learning Approach. In the Proceedings of the
28th International Conference on Machine Learning
(ICML), ICML ’11.
Gulli, A. (2005). The Anatomy of a News Search Engine. In
International Conference on World Wide Web (WWW)
Special interest tracks and posters, WWW ’05, pages
880–881.
Kim, Y. (2014). Convolutional Neural Networks for
Sentence Classification. In the Proceedings of the
2014 Conference on Empirical Methods in Natural
Language Processing (EMNLP), EMNLP ’14, pages
1746–1751.
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012).
ImageNet Classification with Deep Convolutional Neural
Networks. In the Proceedings of the 26th Annual
Conference on Neural Information Processing Sys-
tems (NIPS), NIPS ’12, pages 1097–1105.
Kudo, T., Yamamoto, K., and Matsumoto, Y. (2004). Ap-
plying Conditional Random Fields to Japanese Mor-
phological Analysis. In the Proceedings of the 2004
Conference on Empirical Methods in Natural Lan-
guage Processing (EMNLP), EMNLP ’04, pages 230–
237.
Maas, A. L., Daly, R. E., Pham, P. T., Huang, D., Ng,
A. Y., and Potts, C. (2011). Learning Word Vectors
for Sentiment Analysis. In the Proceedings of the
49th Annual Meeting of the Association for Compu-
tational Linguistics: Human Language Technologies
(ACL HLT), ACL-HLT ’11, pages 142–150.
McAuley, J., Pandey, R., and Leskovec, J. (2015a).
Inferring Networks of Substitutable and Complementary
Products. In the Proceedings of the 21st ACM
SIGKDD International Conference on Knowledge
Discovery and Data Mining (KDD), KDD ’15, pages
785–794.
McAuley, J., Targett, C., Shi, Q., and van den Hengel, A.
(2015b). Image-Based Recommendations on Styles
and Substitutes. In the Proceedings of the 38th In-
ternational ACM SIGIR Conference on Research and
Development in Information Retrieval (SIGIR), SIGIR
’15, pages 43–52.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., and
Dean, J. (2013a). Distributed Representations of
Words and Phrases and their Compositionality. In the
Proceedings of the 27th Annual Conference on Neu-
ral Information Processing Systems (NIPS), NIPS ’13,
pages 3111–3119.
Mikolov, T., Yih, W.-t., and Zweig, G. (2013b). Lin-
guistic Regularities in Continuous Space Word Rep-
resentations. In the Proceedings of the 2013 Confer-
ence of the North American Chapter of the Associa-
tion for Computational Linguistics: Human Language
Technologies (NAACL HLT), NAACL-HLT ’13, pages
746–751.
Japanese Text Classification by Character-level Deep ConvNets and Transfer Learning