Integrating a Model for Visual Attention into a System for Natural Language Parsing

Christopher Baumgärtner, Wolfgang Menzel

Abstract

We present a system that integrates knowledge about complex visual scenes into the process of natural language comprehension. The implemented system selects a scene of reference for a natural language sentence from a large set of scene descriptions. This scene is then used to influence the analysis of the sentence produced by a broad-coverage language parser. In addition, the objects and actions referred to by the sentence are visualized in a saliency map. The map is derived from the bi-directional influence of top-down and bottom-up information on a model of visual attention, and it highlights the regions most likely to attract an observer's attention.
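
The interplay of top-down and bottom-up information can be illustrated with a small sketch. It is not the authors' model: the weighted-sum blending, the grid representation, and the names `normalize`, `combine_saliency`, `most_salient`, and `weight` are all assumptions made for illustration. The bottom-up map stands in for image-driven conspicuity, and the top-down map marks the cell of an object mentioned in the sentence.

```python
from typing import List, Tuple

Grid = List[List[float]]  # hypothetical representation: one value per image region


def normalize(grid: Grid) -> Grid:
    """Scale a map so its values sum to 1 (uniform map if all values are zero)."""
    total = sum(sum(row) for row in grid)
    cells = sum(len(row) for row in grid)
    if total == 0:
        return [[1.0 / cells for _ in row] for row in grid]
    return [[value / total for value in row] for row in grid]


def combine_saliency(bottom_up: Grid, top_down: Grid, weight: float = 0.5) -> Grid:
    """Blend both maps with a weighted sum (an assumed combination rule).

    weight = 0 uses only bottom-up information, weight = 1 only top-down.
    """
    bu = normalize(bottom_up)
    td = normalize(top_down)
    mixed = [
        [(1.0 - weight) * b + weight * t for b, t in zip(bu_row, td_row)]
        for bu_row, td_row in zip(bu, td)
    ]
    return normalize(mixed)


def most_salient(grid: Grid) -> Tuple[int, int]:
    """Return the (row, column) of the region with the highest saliency."""
    cells = [(r, c) for r, row in enumerate(grid) for c in range(len(row))]
    return max(cells, key=lambda rc: grid[rc[0]][rc[1]])


# Image-driven conspicuity: the top-right region stands out visually.
bottom_up = [[0.1, 0.6],
             [0.2, 0.1]]
# Language-driven bias: the sentence refers to an object in the bottom-left region.
top_down = [[0.0, 0.0],
            [1.0, 0.0]]

combined = combine_saliency(bottom_up, top_down, weight=0.5)
print(most_salient(normalize(bottom_up)))  # region favoured by bottom-up cues alone
print(most_salient(combined))              # region favoured once language is considered
```

In this toy setting the linguistic reference shifts the predicted focus of attention away from the visually most conspicuous region and towards the mentioned object, which is the kind of effect the saliency map is meant to visualize.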


Paper Citation


in Harvard Style

Baumgärtner C. and Menzel W. (2012). Integrating a Model for Visual Attention into a System for Natural Language Parsing. In Proceedings of the 9th International Workshop on Natural Language Processing and Cognitive Science - Volume 1: NLPCS, (ICEIS 2012) ISBN 978-989-8565-16-7, pages 24-33. DOI: 10.5220/0004088100240033


in Bibtex Style

@conference{nlpcs12,
author={Christopher Baumgärtner and Wolfgang Menzel},
title={Integrating a Model for Visual Attention into a System for Natural Language Parsing},
booktitle={Proceedings of the 9th International Workshop on Natural Language Processing and Cognitive Science - Volume 1: NLPCS, (ICEIS 2012)},
year={2012},
pages={24-33},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004088100240033},
isbn={978-989-8565-16-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 9th International Workshop on Natural Language Processing and Cognitive Science - Volume 1: NLPCS, (ICEIS 2012)
TI - Integrating a Model for Visual Attention into a System for Natural Language Parsing
SN - 978-989-8565-16-7
AU - Baumgärtner C.
AU - Menzel W.
PY - 2012
SP - 24
EP - 33
DO - 10.5220/0004088100240033