Integrating a Model for Visual Attention into a System for Natural Language Parsing
Christopher Baumgärtner, Wolfgang Menzel
2012
Abstract
We present a system that integrates knowledge about complex visual scenes into the process of natural language comprehension. The implemented system selects a scene of reference for a natural language sentence from a large set of scene descriptions. This scene is then used to influence the analysis that a broad-coverage language parser produces for the sentence. In addition, the objects and actions referred to by the sentence are visualized in a saliency map. The map is derived from the bi-directional influence of top-down and bottom-up information on a model of visual attention, and it highlights the regions most likely to attract an observer's attention.
Paper Citation
in Harvard Style
Baumgärtner C. and Menzel W. (2012). Integrating a Model for Visual Attention into a System for Natural Language Parsing. In Proceedings of the 9th International Workshop on Natural Language Processing and Cognitive Science - Volume 1: NLPCS, (ICEIS 2012) ISBN 978-989-8565-16-7, pages 24-33. DOI: 10.5220/0004088100240033
in Bibtex Style
@conference{nlpcs12,
author={Christopher Baumgärtner and Wolfgang Menzel},
title={Integrating a Model for Visual Attention into a System for Natural Language Parsing},
booktitle={Proceedings of the 9th International Workshop on Natural Language Processing and Cognitive Science - Volume 1: NLPCS, (ICEIS 2012)},
year={2012},
pages={24-33},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004088100240033},
isbn={978-989-8565-16-7},
}
in EndNote Style
TY  - CONF 
JO  - Proceedings of the 9th International Workshop on Natural Language Processing and Cognitive Science - Volume 1: NLPCS, (ICEIS 2012)
TI  - Integrating a Model for Visual Attention into a System for Natural Language Parsing
SN  - 978-989-8565-16-7
AU  - Baumgärtner C. 
AU  - Menzel W. 
PY  - 2012
SP  - 24
EP  - 33
DO  - 10.5220/0004088100240033