of this is the assumption that the proposed approach
uses more effective data pre-processing methods and
machine learning algorithms for off-talk detection.
Another possible reason is a conceptual contradiction
in the baseline approach: relying on prosodic
features for off-talk detection presumes that users
change their normal manner of speech once they
start talking to a computer. Such an approach can
work at present, since modern dialogue systems are
still far from perfect and users have to adapt their
behaviour when talking to them. However, it runs
counter to the main goal of automatic dialogue
system development, which is to make the
interaction between a user and a system as natural
as possible.
Any additional data processing step (speech
recognition, text pre-processing, etc.) causes some
information loss. Deep neural networks have
properties that could improve classification
effectiveness: because they can work with entities at
different levels of abstraction, they do not require
such additional processing and could make the
system more effective and flexible. Moreover,
previous work on off-talk detection (Shriberg et al.,
2012; Batliner et al., 2006) states that using more
than one group of features significantly improves
classification effectiveness. The choice of relevant
features can also be delegated to a system based on
deep neural networks. Therefore, as a future
direction, we propose investigating a deep neural
network-based approach to off-talk detection.
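As an illustration of what using more than one group of features could look like in practice, the sketch below concatenates a lexical feature group with a prosodic one into a single vector (early fusion) that a classifier could consume. It is a minimal sketch and not the method evaluated in this paper: the function names, the vocabulary, and the prosodic values are assumptions made purely for illustration.

```python
from collections import Counter


def lexical_features(utterance, vocabulary):
    """Bag-of-words counts over a fixed vocabulary (one feature group)."""
    counts = Counter(utterance.lower().split())
    return [float(counts[term]) for term in vocabulary]


def combine_feature_groups(*groups):
    """Early fusion: concatenate several feature groups into one vector."""
    combined = []
    for group in groups:
        combined.extend(group)
    return combined


# Hypothetical vocabulary and prosodic measurements, for illustration only.
vocabulary = ["please", "show", "wait", "what"]
utterance = "Please show me the next train"
prosodic = [182.5, 0.31]  # assumed values, e.g. mean pitch (Hz) and pause ratio

features = combine_feature_groups(
    lexical_features(utterance, vocabulary),
    prosodic,
)
print(features)
```

A deep neural network-based system, as proposed above, would ideally learn such a joint representation itself rather than depend on hand-built concatenation.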
REFERENCES
Sebastiani, F. 2002. Machine learning in automated text
categorization. ACM computing surveys (CSUR),
34(1):1-47.
Salton, G. and Buckley, C. 1988. Term-weighting
approaches in automatic text retrieval. Information
processing & management, 24(5):513-523.
Debole, F. and Sebastiani, F. 2004. Supervised term
weighting for automated text categorization. Text
mining and its applications:81-97. Springer Berlin
Heidelberg.
Soucy, P. and Mineau, G. W. 2005. Beyond TFIDF
Weighting for Text Categorization in the Vector Space
Model. Proceedings of the 19th International Joint
Conference on Artificial Intelligence (IJCAI
2005):1130-1135.
Xu, H. and Li, C. 2007. A novel term weighting scheme for
automated text categorization. Intelligent Systems
Design and Applications, 2007. ISDA 2007. Seventh
International Conference on:759-764. IEEE.
Lan, M., Tan, C. L., Su, J., and Lu, Y. 2009. Supervised and
traditional term weighting methods for automatic text
categorization. Pattern Analysis and Machine
Intelligence, IEEE Transactions on, 31(4):721-735.
Ko, Y. 2012. A study of term weighting schemes using class
information for text classification. Proceedings of the
35th international ACM SIGIR conference on Research
and development in information retrieval:1029-1030.
ACM.
Gasanova, T., Sergienko, R., Akhmedova, S., Semenkin,
E., and Minker, W. 2014. Opinion Mining and Topic
Categorization with Novel Term Weighting.
Proceedings of the 5th Workshop on Computational
Approaches to Subjectivity, Sentiment and Social
Media Analysis, Association for Computational
Linguistics, Baltimore, Maryland, USA, 84–89.
Fan, R. E., Chang, K. W., Hsieh, C. J., Wang, X. R., and Lin,
C. J. 2008. LIBLINEAR: A library for large linear
classification. The Journal of Machine Learning
Research, 9, 1871–1874.
Yang, Y., and Pedersen, J. O. 1997. A comparative study
on feature selection in text categorization. ICML, vol.
9:412-420.
Batliner, A., Hacker, C., and Nöth, E. 2006. To Talk or not
to Talk with a Computer: On-Talk vs. Off-Talk. How
People Talk to Computers, Robots, and Other Artificial
Communication Partners, 79-100.
Batliner, A., Fischer, K., Huber, R., Spilker, J., and Nöth, E.
2003. How to Find Trouble in Communication. Speech
Communication, 40, 117–143.
Batliner, A., Nutt, M., Warnke, V., Nöth, E., Buckow, J.,
Huber, R., and Niemann, H. 1999. Automatic Annotation
and Classification of Phrase Accents in Spontaneous
Speech. Proc. of Eurospeech99, 519–522.
Zhou, Y., Li, Y., and Xia, S. 2009. An improved KNN text
classification algorithm based on clustering. Journal of
computers, 4(3), 230-237.
Sergienko, R., Muhammad, S., and Minker, W. 2016. A
comparative study of text preprocessing approaches for
topic detection of user utterances. In Proceedings of the
10th edition of the Language Resources and Evaluation
Conference (LREC 2016).
Baeza-Yates, R. and Ribeiro-Neto, B. 1999. Modern
Information Retrieval. New York, NY: ACM Press,
Addison-Wesley, 75.
Shriberg, E., Stolcke, A., Hakkani-Tür, D., and Heck, L. 2012.
Learning When to Listen: Detecting System-Addressed
Speech in Human-Human-Computer Dialog.
Proceedings of Interspeech 2012, 334-337.
Shafait, F., Reif, M., Kofler, C., and Breuel, T. M. 2010.
Pattern recognition engineering. In: RapidMiner
Community Meeting and Conference, Citeseer, vol 9.
An Approach to Off-talk Detection based on Text Classification within an Automatic Spoken Dialogue System