Understanding Long Sentences

Svetlana Sheremetyeva

Abstract

This paper describes a natural language understanding (NLU) component for parsing long sentences. The component includes a generation module, so that the results of understanding can be displayed to the user in natural language and corrected interactively before the final parse is passed to a subsequent module of a particular application. Parsing proper is divided into two levels: the phrase level and the level of the individual clauses that make up a sentence. The output of the parser is an interlingual representation that captures the content of the whole sentence. The task of detecting the hierarchy of clauses in the sentence is shifted to the generator. The methodology is universal in the sense that it can be used for different domains, languages and applications. We illustrate it by parsing a patent claim, an extreme case of a long sentence.



Paper Citation


in Harvard Style

Sheremetyeva S. (2006). Understanding Long Sentences. In Proceedings of the 3rd International Workshop on Natural Language Understanding and Cognitive Science - Volume 1: NLUCS, (ICEIS 2006), ISBN 978-972-8865-50-4, pages 120-129. DOI: 10.5220/0002505101200129


in Bibtex Style

@conference{nlucs06,
author={Svetlana Sheremetyeva},
title={Understanding Long Sentences},
booktitle={Proceedings of the 3rd International Workshop on Natural Language Understanding and Cognitive Science - Volume 1: NLUCS, (ICEIS 2006)},
year={2006},
pages={120-129},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002505101200129},
isbn={978-972-8865-50-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 3rd International Workshop on Natural Language Understanding and Cognitive Science - Volume 1: NLUCS, (ICEIS 2006)
TI - Understanding Long Sentences
SN - 978-972-8865-50-4
AU - Sheremetyeva S.
PY - 2006
SP - 120
EP - 129
DO - 10.5220/0002505101200129