GIVING SHAPE TO AN N–VERSION DEPENDENCY PARSER - Improving Dependency Parsing Accuracy for Spanish using Maltparser
Miguel Ballesteros, Jesús Herrera, Virginia Francisco, Pablo Gervás
2010
Abstract
Maltparser is a contemporary machine learning–based dependency parsing system that shows great accuracy. However, a Labelled Attachment Score (LAS) of 90% seems to be a de facto limit for parsers of this kind. In this paper we present an n-version dependency parser that will work as follows: having found that a small set of words is parsed incorrectly more frequently than the rest, we build the n-version dependency parser from n different parsers trained specifically to parse those difficult words. An algorithm will send each word to the specific parsers, and by combining their output with that of a general parser we expect to achieve better overall accuracy. This work has been developed specifically for Spanish using Maltparser.
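The combination step described in the abstract can be pictured with a small sketch. This is not the authors' algorithm, only one plausible reading of it: a general parser analyses the whole sentence, and for each token whose word form belongs to the difficult set, the arc proposed by that word's specialised parser replaces the general one. The names `Arc`, `combine`, `las`, the toy parsers, the dependency labels, and the choice of "y" as a difficult word are all assumptions made for illustration; a real system would call trained Maltparser models instead of the stub functions.

```python
from typing import Callable, Dict, List, NamedTuple


class Arc(NamedTuple):
    head: int   # 1-based index of the head token; 0 is the artificial root
    label: str  # dependency label (hypothetical label set)


# A parser maps a tokenised sentence to one arc per token.
Parser = Callable[[List[str]], List[Arc]]


def combine(sentence: List[str], general: Parser,
            specialists: Dict[str, Parser]) -> List[Arc]:
    """Start from the general parse and overwrite the arcs of difficult words."""
    arcs = list(general(sentence))
    cache: Dict[str, List[Arc]] = {}
    for i, word in enumerate(sentence):
        key = word.lower()
        if key in specialists:
            if key not in cache:  # run each specialist at most once per sentence
                cache[key] = specialists[key](sentence)
            arcs[i] = cache[key][i]
    return arcs


def las(predicted: List[Arc], gold: List[Arc]) -> float:
    """Labelled Attachment Score: share of tokens with correct head AND label."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Toy data: the gold tree and both parsers are invented for this example.
sentence = ["Ella", "come", "y", "bebe"]
gold = [Arc(2, "SUBJ"), Arc(0, "ROOT"), Arc(4, "COORD"), Arc(2, "CONJ")]


def general_parser(tokens: List[str]) -> List[Arc]:
    # Stub general parser: attaches the conjunction "y" to the wrong head.
    return [Arc(2, "SUBJ"), Arc(0, "ROOT"), Arc(2, "COORD"), Arc(2, "CONJ")]


def specialist_y(tokens: List[str]) -> List[Arc]:
    # Stub specialist: only its arc for "y" is meant to be trusted.
    return [Arc(0, "ROOT"), Arc(0, "ROOT"), Arc(4, "COORD"), Arc(0, "ROOT")]


combined = combine(sentence, general_parser, {"y": specialist_y})
print(las(general_parser(sentence), gold))  # 0.75
print(las(combined, gold))                  # 1.0
```

Note that arcs taken from different parsers are not guaranteed to form a well-formed tree (cycles or multiple roots are possible), so a practical combiner would need an additional consistency check; the sketch leaves that out.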
Paper Citation
in Harvard Style
Ballesteros M., Herrera J., Francisco V. and Gervás P. (2010). GIVING SHAPE TO AN N–VERSION DEPENDENCY PARSER - Improving Dependency Parsing Accuracy for Spanish using Maltparser. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010), ISBN 978-989-8425-28-7, pages 336-341. DOI: 10.5220/0003116803360341
in Bibtex Style
@conference{kdir10,
author={Miguel Ballesteros and Jesús Herrera and Virginia Francisco and Pablo Gervás},
title={GIVING SHAPE TO AN N–VERSION DEPENDENCY PARSER - Improving Dependency Parsing Accuracy for Spanish using Maltparser},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)},
year={2010},
pages={336-341},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003116803360341},
isbn={978-989-8425-28-7},
}
in EndNote Style
TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)
TI - GIVING SHAPE TO AN N–VERSION DEPENDENCY PARSER - Improving Dependency Parsing Accuracy for Spanish using Maltparser
SN - 978-989-8425-28-7
AU - Ballesteros M.
AU - Herrera J.
AU - Francisco V.
AU - Gervás P.
PY - 2010
SP - 336
EP - 341
DO - 10.5220/0003116803360341