An Approach to Refine Translation Candidates for Emotion Estimation in Japanese-English Language

Kazuyuki Matsumoto; Minoru Yoshida; Kenji Kita; Fuji Ren

doi:10.5220/0005602200740083

2015

Abstract

Research on emotion estimation from text mostly relies on machine learning. Because machine learning requires a large amount of example data, acquiring high-quality training data has been discussed as one of its major problems. Existing language resources include emotion corpora; however, they cannot be used directly when the target language is different. Constructing a bilingual corpus manually is also costly. We propose a method to convert training data into a different language using an existing Japanese-English parallel emotion corpus. With a bilingual dictionary, translation candidates are extracted for every word of each sentence in the corpus. The extracted translation candidates are then narrowed down to a set of words that contribute highly to emotion estimation, and this set of words is used as training data. In an evaluation experiment using the training data created by the proposed method, the accuracy of emotion estimation increased up to 66.7% with Naive Bayes.

1 INTRODUCTION

Recently, there has been much research on emotion estimation from text in the field of sentiment analysis or opinion mining (Ren, 2009), (Ren and Quan, 2015), (Ren and Wu, 2013), (Quan and Ren, 2010), (Quan and Ren, 2014), (Ren and Matsumoto, 2015), and much of it adopts machine learning methods that use words as features. When the type of sentence targeted for emotion estimation differs from the type of sentence prepared as training data, the tendency with which emotion words appear also differs, much as terminology differs in the problem of domain adaptation for document classification. This causes accuracy to fluctuate. On the other hand, when words are used as features for emotion estimation, sentence structure does not have to be considered. As a result, the method is easy to apply to other languages.
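Because the features are plain words, the pipeline outlined in the abstract (dictionary lookup of translation candidates, refinement of the candidates, Naive Bayes training) can be sketched in a few lines of Python. This is only an illustrative sketch: the dictionary, the tagged sentences, and the emotion labels are hypothetical toy data, and the refinement score (the share of a word's occurrences that fall under its most frequent emotion label, with a threshold of 0.75) is an assumption, since this excerpt does not state the selection criterion the authors used.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy bilingual dictionary: Japanese word -> English candidates.
BILINGUAL_DICT = {
    "嬉しい": ["happy", "glad", "pleased"],
    "悲しい": ["sad", "sorrowful"],
    "とても": ["very", "quite"],
    "今日": ["today"],
}

# Hypothetical emotion-tagged Japanese sentences, already tokenized.
CORPUS = [
    (["今日", "とても", "嬉しい"], "joy"),
    (["とても", "嬉しい"], "joy"),
    (["今日", "悲しい"], "sorrow"),
    (["とても", "悲しい"], "sorrow"),
]


def expand(tokens):
    """Collect every dictionary translation candidate for every token."""
    return [en for ja in tokens for en in BILINGUAL_DICT.get(ja, [])]


def refine(expanded, threshold=0.75):
    """Keep candidates whose occurrences concentrate in one emotion label.

    Assumed score: max over labels of count(word, label) / count(word).
    """
    by_word = defaultdict(Counter)
    for words, label in expanded:
        for word in words:
            by_word[word][label] += 1
    return {
        word
        for word, counts in by_word.items()
        if max(counts.values()) / sum(counts.values()) >= threshold
    }


def train_nb(expanded, vocab):
    """Count labels and in-vocabulary words per label (multinomial NB)."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    for words, label in expanded:
        label_counts[label] += 1
        word_counts[label].update(w for w in words if w in vocab)
    return label_counts, word_counts


def classify(tokens, model, vocab):
    """Return the label with the highest log posterior (Laplace smoothing)."""
    label_counts, word_counts = model
    total = sum(label_counts.values())
    best_label, best_logp = None, -math.inf
    for label in sorted(label_counts):
        logp = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokens:
            if word in vocab:  # words outside the refined set are ignored
                logp += math.log((word_counts[label][word] + 1) / denom)
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


EXPANDED = [(expand(tokens), label) for tokens, label in CORPUS]
VOCAB = refine(EXPANDED)
MODEL = train_nb(EXPANDED, VOCAB)

print(sorted(VOCAB))
print(classify(["i", "am", "glad", "today"], MODEL, VOCAB))
```

In this toy setting, candidates such as "today" and "very" appear under both labels and fall below the threshold, so only the emotion-bearing candidates remain as training features.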
If a large corpus annotated with an emotion tag on each sentence is available, emotion can be estimated easily with machine learning. Because machine learning does not require rules to be defined manually, the cost of applying the method to other languages can be reduced. However, just like the problem in the domain, depending on the

References

Balahur, A. and Turchi, M. (2012). Multilingual sentiment analysis using machine translation? In the 3rd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis, pages 52-60.
Banerjee, S. (2005). Meteor: an automatic metric for MT evaluation with improved correlation with human judgments. In ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, pages 65-72.
Brill, E. (1994). Some advances in transformation-based part of speech tagging. In National Conference on Artificial Intelligence, pages 722-727.
Buckley, B., S. G. A. J. and Singhal, A. A. (1995). Automatic query expansion using SMART: TREC-3. In the Third Text REtrieval Conference (TREC-3), pages 500-238.
Echizen-ya, H. and Araki, K. (2007). Automatic evaluation of machine translation based on recursive acquisition of an intuitive common parts continuum. In the Eleventh Machine Translation Summit (MT SUMMIT XI), pages 151-158.
Enozawa, K. and Guruensutain, D. (1999). Kaiwa de oboeru kanjouhyougen Wa-Ei jiten -English merry-go-round series- (in Japanese). Natsumesha.
Hiejima, I. (1999). Tokyodo Shuppan.
Inui, T. and Yamamoto, M. (2011). Usage of different language translated data on classification of evaluation document (in Japanese). pages 119-122.
Kang, X., Ren, F., and Wu, Y. (2010). Bottom up: exploring word emotions for Chinese sentence chief sentiment classification. In IEEE International Conference on Natural Language Processing and Knowledge Engineering, pages 422-426.
Lavie, A. and Agarwal, A. (2007). Meteor: an automatic metric for MT evaluation with high levels of correlation with human judgments. In ACL Second Workshop on Statistical Machine Translation, pages 228-231.
Matsumoto, K. and Ren, F. (2011). Estimation of word emotions based on part of speech and positional information. Computers in Human Behavior, 2011(27):1553-1564.
Minato, J., M. K. R. F. and Kuroiwa, S. (2007). Corpus-based analysis of Japanese-English of emotional expressions. In IEEE International Conference on Natural Language Processing and Knowledge Engineering, pages 413-418.
Minato, J., M. K. R. F. T. S. and Kuroiwa, S. (2008). Evaluation of emotion estimation methods based on statistic features of emotion tagged corpus. International Journal of Innovative Computing, Information and Control, 4(8):1931-1941.
Papineni, K., R. S. W. T. and Zhu, W. (2002). Bleu: a method for automatic evaluation of machine translation. In the 40th Annual Meeting on Association for Computational Linguistics (ACL'02), pages 311-318.
Quan, C. and Ren, F. (2010). A blog emotion corpus for emotional expression analysis in Chinese. Computer Speech and Language, 24(1):726-749.
Quan, C. and Ren, F. (2011). Recognition of word emotion state in sentences. IEEJ Transactions on Electrical and Electronic Engineering, 6:34-41.
Quan, C. and Ren, F. (2014). Unsupervised product feature extraction for feature-oriented opinion determination. Information Sciences, 272(2014):16-28.
Ren, F. (2009). Affective information processing and recognizing human emotion. Electronic Notes in Theoretical Computer Science, 225:39-50.
Ren, F. and Matsumoto, K. (2015). Semi-automatic creation of youth slang corpus and its application to affective computing. IEEE Transactions on Affective Computing.
Ren, F., K. X. and Quan, C. (2015). Examining accumulated emotional traits in suicide blogs with an emotion topic model. IEEE Journal of Biomedical and Health Informatics.
Ren, F. and Wu, Y. (2013). Predicting user-topic opinions in Twitter with social and topical context. IEEE Transactions on Affective Computing, 4(4):412-424.
Saiki, Y., T. H. and Okumura, M. (2008). Domain adaptation in sentiment classification by instance weighting (in Japanese). In IPSJ SIG Notes, volume 2008, pages 61-67.
Takamura, H., I. T. and Okumura, M. (2005). Extracting semantic orientations of words using spin model. In the 43rd Annual Meeting on Association for Computational Linguistics, pages 133-140.
Takamura, H., I. T. and Okumura, M. (2006). Latent variable models for semantic orientations of phrases (in Japanese). Transactions of Information Processing Society of Japan, 47(11):3021-3031.
Wan, X. (2009). classification.
Wu, Y., K. K. and Matsumoto, K. (2014). Three predictions are better than one: sentence multi-emotion analysis from different perspectives. IEEJ Transactions on Electrical and Electronic Engineering (TEEE), 9(6):642-649.

Paper Citation

in Harvard Style

Matsumoto K., Yoshida M., Kita K. and Ren F. (2015). An Approach to Refine Translation Candidates for Emotion Estimation in Japanese-English Language. In Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 2: KEOD, (IC3K 2015) ISBN 978-989-758-158-8, pages 74-83. DOI: 10.5220/0005602200740083

in Bibtex Style

@conference{keod15,
author={Kazuyuki Matsumoto and Minoru Yoshida and Kenji Kita and Fuji Ren},
title={An Approach to Refine Translation Candidates for Emotion Estimation in Japanese-English Language},
booktitle={Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 2: KEOD, (IC3K 2015)},
year={2015},
pages={74-83},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005602200740083},
isbn={978-989-758-158-8},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 2: KEOD, (IC3K 2015)
TI - An Approach to Refine Translation Candidates for Emotion Estimation in Japanese-English Language
SN - 978-989-758-158-8
AU - Matsumoto K.
AU - Yoshida M.
AU - Kita K.
AU - Ren F.
PY - 2015
SP - 74
EP - 83
DO - 10.5220/0005602200740083