Defining and Classifying Space Builders for Information Extraction

Barbara Gawronska, Björn Erlendsson, Niklas Torstensson

2004

Abstract

The paper addresses the question of Information Extraction aimed at multilingual text generation, or text re-writing. This method provides an alternative to traditional Machine Translation, but is also related to text summarization. Given a source text, a re-writing system selects and structures the textual information in order to generate a “content report”. The present approach is inspired by recent IE research, classical speech act theory, and Cognitive Semantics, especially the Theory of Mental Spaces, and is employed in an experimental system for understanding news reports. The authors focus on the problem of identifying and interpreting ‘space builders’, i.e. linguistic signals for establishing mental spaces.



Paper Citation


in Harvard Style

Gawronska B., Erlendsson B. and Torstensson N. (2004). Defining and Classifying Space Builders for Information Extraction. In Proceedings of the 1st International Workshop on Natural Language Understanding and Cognitive Science - Volume 1: NLUCS, (ICEIS 2004) ISBN 972-8865-05-8, pages 15-28. DOI: 10.5220/0002667500150028


in Bibtex Style

@conference{nlucs04,
author={Barbara Gawronska and Björn Erlendsson and Niklas Torstensson},
title={Defining and Classifying Space Builders for Information Extraction},
booktitle={Proceedings of the 1st International Workshop on Natural Language Understanding and Cognitive Science - Volume 1: NLUCS, (ICEIS 2004)},
year={2004},
pages={15-28},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002667500150028},
isbn={972-8865-05-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 1st International Workshop on Natural Language Understanding and Cognitive Science - Volume 1: NLUCS, (ICEIS 2004)
TI - Defining and Classifying Space Builders for Information Extraction
SN - 972-8865-05-8
AU - Gawronska B.
AU - Erlendsson B.
AU - Torstensson N.
PY - 2004
SP - 15
EP - 28
DO - 10.5220/0002667500150028