GIVING SHAPE TO AN N–VERSION DEPENDENCY PARSER - Improving Dependency Parsing Accuracy for Spanish using Maltparser

Miguel Ballesteros, Jesús Herrera, Virginia Francisco, Pablo Gervás

2010

Abstract

Maltparser is a contemporary, machine learning-based dependency parsing system that achieves high accuracy. However, a Labelled Attachment Score (LAS) of 90% seems to be a de facto limit for parsers of this kind. In this paper we present an n-version dependency parser that works as follows. We found that a small set of words is parsed incorrectly more often than the rest, so the n-version dependency parser consists of n different parsers, each trained specifically to parse those difficult words. An algorithm will send each word to the appropriate parser; combined with the action of a general parser, this is intended to achieve better overall accuracy. This work has been developed specifically for Spanish using Maltparser.
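
The combination scheme described above can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use the Maltparser API: the `Parser` interface, the function names, the stub parsers and the toy annotation are all assumptions made for illustration. The sketch also assumes one particular reading of the abstract, in which the general parser analyses the whole sentence and a specialist's arc replaces the general one only for the difficult word it was trained on. LAS is computed as the fraction of tokens whose head and label both match the gold standard.

```python
from typing import Callable, Dict, List, Tuple

# One decision per token: (head index, dependency label); head 0 means root.
Arc = Tuple[int, str]
# Hypothetical parser interface: tokenized sentence in, one arc per token out.
Parser = Callable[[List[str]], List[Arc]]


def combine_parses(
    sentence: List[str],
    general_parser: Parser,
    specialists: Dict[str, Parser],
) -> List[Arc]:
    """Start from the general parse and override arcs of difficult words."""
    arcs = list(general_parser(sentence))
    cache: Dict[str, List[Arc]] = {}  # run each specialist at most once
    for i, word in enumerate(sentence):
        key = word.lower()
        if key in specialists:
            if key not in cache:
                cache[key] = specialists[key](sentence)
            arcs[i] = cache[key][i]
    return arcs


def labelled_attachment_score(predicted: List[Arc], gold: List[Arc]) -> float:
    """Fraction of tokens with both the correct head and the correct label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(gold)


# Toy data: the annotation below is invented for the example.
sentence = ["Juan", "come", "y", "bebe"]
gold = [(2, "suj"), (0, "ROOT"), (4, "coord"), (2, "conj")]


def general_parser(tokens: List[str]) -> List[Arc]:
    # Stub standing in for a general model; it misattaches the word "y".
    return [(2, "suj"), (0, "ROOT"), (2, "coord"), (2, "conj")]


def specialist_y(tokens: List[str]) -> List[Arc]:
    # Stub standing in for a model trained only to attach "y" correctly.
    return [(4, "coord") if t.lower() == "y" else (0, "_") for t in tokens]


combined = combine_parses(sentence, general_parser, {"y": specialist_y})
print(labelled_attachment_score(general_parser(sentence), gold))  # 0.75
print(labelled_attachment_score(combined, gold))                  # 1.0
```

Keeping the general parse as the default means a word without a specialist is never left unparsed; other combination strategies, such as voting among parsers, would need a different `combine_parses`.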


Paper Citation


in Harvard Style

Ballesteros M., Herrera J., Francisco V. and Gervás P. (2010). GIVING SHAPE TO AN N–VERSION DEPENDENCY PARSER - Improving Dependency Parsing Accuracy for Spanish using Maltparser. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010) ISBN 978-989-8425-28-7, pages 336-341. DOI: 10.5220/0003116803360341


in Bibtex Style

@conference{kdir10,
author={Miguel Ballesteros and Jesús Herrera and Virginia Francisco and Pablo Gervás},
title={GIVING SHAPE TO AN N–VERSION DEPENDENCY PARSER - Improving Dependency Parsing Accuracy for Spanish using Maltparser},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)},
year={2010},
pages={336-341},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003116803360341},
isbn={978-989-8425-28-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)
TI - GIVING SHAPE TO AN N–VERSION DEPENDENCY PARSER - Improving Dependency Parsing Accuracy for Spanish using Maltparser
SN - 978-989-8425-28-7
AU - Ballesteros M.
AU - Herrera J.
AU - Francisco V.
AU - Gervás P.
PY - 2010
SP - 336
EP - 341
DO - 10.5220/0003116803360341