Deep Level Situation Understanding and its Application to Casual Communication between Robots and Humans

Yongkang Tang, Fangyan Dong, Mina Yuhki, Yoichi Yamazaki, Takanori Shibata, Kaoru Hirota

Abstract

The concept of Deep Level Situation Understanding is proposed to realize human-like natural communication among agents (e.g., humans and robots/machines). It consists of surface level understanding (such as gesture/posture recognition, facial expression recognition, and speech/voice recognition), emotion understanding, intention understanding, and atmosphere understanding, achieved by applying customised knowledge of each agent and by taking careful attentiveness into consideration. It aims to avoid imposing a burden on humans in human-machine communication, to realize harmonious communication by eliminating unnecessary trouble and misunderstanding among agents, and ultimately to create a peaceful, happy, and prosperous human-robot society. A scenario is established to demonstrate several communication activities over one day between a businessman and a secretary robot, a human boss, a waitress robot, a human partner, and a therapy robot (PARO).



Paper Citation


in Harvard Style

Tang Y., Dong F., Yuhki M., Yamazaki Y., Shibata T. and Hirota K. (2013). Deep Level Situation Understanding and its Application to Casual Communication between Robots and Humans. In Proceedings of the 10th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO, ISBN 978-989-8565-71-6, pages 292-299. DOI: 10.5220/0004480402920299


in Bibtex Style

@conference{icinco13,
author={Yongkang Tang and Fangyan Dong and Mina Yuhki and Yoichi Yamazaki and Takanori Shibata and Kaoru Hirota},
title={Deep Level Situation Understanding and its Application to Casual Communication between Robots and Humans},
booktitle={Proceedings of the 10th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO},
year={2013},
pages={292-299},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004480402920299},
isbn={978-989-8565-71-6},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 10th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO
TI - Deep Level Situation Understanding and its Application to Casual Communication between Robots and Humans
SN - 978-989-8565-71-6
AU - Tang Y.
AU - Dong F.
AU - Yuhki M.
AU - Yamazaki Y.
AU - Shibata T.
AU - Hirota K.
PY - 2013
SP - 292
EP - 299
DO - 10.5220/0004480402920299