Miguel Ballesteros, Susana Bautista, Pablo Gervás


In this paper we investigate the task of text simplification for Spanish. Our aim is a rule-based system that simplifies text using dependency parsing. Our main motivation is the need for text simplification to make information more accessible to poor readers and to people with cognitive disabilities. This study is the first step towards building Spanish text simplification systems that help create easy-to-read texts.



Paper Citation

in Harvard Style

Ballesteros M., Bautista S. and Gervás P. (2010). TEXT SIMPLIFICATION USING DEPENDENCY PARSING FOR SPANISH. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010), ISBN 978-989-8425-28-7, pages 330-335. DOI: 10.5220/0003115803300335

in Bibtex Style

author={Miguel Ballesteros and Susana Bautista and Pablo Gervás},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)},

in EndNote Style

JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)
SN - 978-989-8425-28-7
AU - Ballesteros M.
AU - Bautista S.
AU - Gervás P.
PY - 2010
SP - 330
EP - 335
DO - 10.5220/0003115803300335