A Linguistically-based Approach to Discourse Relations Recognition

Rodolfo Delmonte, Gabriel Nicolae, Sanda Harabagiu, Cristina Nicolae

Abstract

We present an unsupervised, linguistically based approach to discourse relation recognition. It draws on publicly available resources, namely manually annotated corpora (Discourse Graphbank, Penn Discourse Treebank, RST-DT) and empirically derived data from “causally” annotated lexica such as LCS, to produce a rule-based algorithm. We adopt the subdivision of discourse relations into four subsets (CONTRAST, CAUSE, CONDITION, ELABORATION) proposed in [7], which reports results from a similar experiment carried out with a machine-learning approach, and we compare our results against those. Our approach is fully symbolic and is partially derived from GETARUNS, a system for text understanding, adapted to a specific task: the recognition of causality relations in free text. We show that achieving better accuracy, in both the general task and the specific one, requires semantic information in addition to syntactic structural information. Our approach outperforms the results reported in previous work [9].
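To make the four-way classification concrete, the following is a minimal sketch of a rule-based recognizer over the classes CONTRAST, CAUSE, CONDITION and ELABORATION. It is not the authors' algorithm and is not derived from GETARUNS: the cue-phrase lists and the function name `recognize_relation` are hypothetical, chosen only for illustration. It relies on surface cues alone, so it carries none of the semantic information that the abstract argues is needed for better accuracy.

```python
import re

# Hypothetical cue-phrase lists, for illustration only; they are assumptions
# and are not taken from the paper, its corpora, or GETARUNS.
CUE_PHRASES = {
    "CONTRAST": ["but", "however", "although"],
    "CAUSE": ["because", "therefore", "as a result"],
    "CONDITION": ["if", "unless"],
    "ELABORATION": ["for example", "in particular"],
}


def recognize_relation(text):
    """Return the first relation class whose cue phrase occurs in text.

    Classes are checked in the order they are listed in CUE_PHRASES.
    Returns None when no cue phrase is found.
    """
    lowered = text.lower()
    for relation, cues in CUE_PHRASES.items():
        for cue in cues:
            # Word boundaries keep "if" from matching inside "fruits".
            if re.search(r"\b" + re.escape(cue) + r"\b", lowered):
                return relation
    return None


if __name__ == "__main__":
    for sentence in [
        "The match was cancelled because it rained.",
        "If it rains, we stay home.",
        "She smiled.",
    ]:
        print(sentence, "->", recognize_relation(sentence))
```

A sentence with no explicit cue, such as "She smiled.", yields no relation at all under these rules, which is the kind of case where purely surface-level information runs out.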

References

  1. Carlson L., Marcu D. and Okurowski M. 2002. RST Discourse Treebank. Philadelphia, PA: Linguistic Data Consortium.
  2. Delmonte R. 2003. Parsing Spontaneous Speech. In Proceedings of EUROSPEECH2003.
  3. Delmonte R. 2005. Deep & Shallow Linguistically Based Parsing. In A.M. Di Sciullo (ed.), UG and External Systems, John Benjamins, Amsterdam/Philadelphia, pp. 335-374.
  4. Delmonte R. 2007 (to be published). Computational Linguistic Text Processing. Nova Publishers, New York.
  5. Dorr B. J. and Olsen M. B. 1997. Deriving Verbal and Compositional Lexical Aspect for NLP Applications. In Proceedings of the 35th ACL, pp. 151-158.
  6. Marcu D. 2000. The Theory and Practice of Discourse Parsing and Summarization. Cambridge, MA: MIT Press.
  7. Marcu D. and Echihabi A. 2002. An Unsupervised Approach to Recognizing Discourse Relations. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL'02), Philadelphia, PA, pp. 368-375.
  8. Miltsakaki E., Prasad R., Joshi A. and Webber B. 2004. The Penn Discourse Treebank. In Proceedings of the 4th International Conference on Language Resources and Evaluation, Lisbon, Portugal.
  9. Soricut R. and Marcu D. 2003. Sentence Level Discourse Parsing Using Syntactic and Lexical Information. In Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1, pp. 149-156.
  10. Wolf F., Gibson E., Fisher A. and Knight M. 2005. Discourse Graphbank. Philadelphia, PA: Linguistic Data Consortium.


Paper Citation


in Harvard Style

Delmonte R., Nicolae G., Harabagiu S. and Nicolae C. (2007). A Linguistically-based Approach to Discourse Relations Recognition. In Proceedings of the 4th International Workshop on Natural Language Processing and Cognitive Science - Volume 1: NLPCS, (ICEIS 2007) ISBN 978-972-8865-97-9, pages 81-91. DOI: 10.5220/0002421300810091


in Bibtex Style

@conference{nlpcs07,
author={Rodolfo Delmonte and Gabriel Nicolae and Sanda Harabagiu and Cristina Nicolae},
title={A Linguistically-based Approach to Discourse Relations Recognition},
booktitle={Proceedings of the 4th International Workshop on Natural Language Processing and Cognitive Science - Volume 1: NLPCS, (ICEIS 2007)},
year={2007},
pages={81-91},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002421300810091},
isbn={978-972-8865-97-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 4th International Workshop on Natural Language Processing and Cognitive Science - Volume 1: NLPCS, (ICEIS 2007)
TI - A Linguistically-based Approach to Discourse Relations Recognition
SN - 978-972-8865-97-9
AU - Delmonte R.
AU - Nicolae G.
AU - Harabagiu S.
AU - Nicolae C.
PY - 2007
SP - 81
EP - 91
DO - 10.5220/0002421300810091