MALAPROPISMS DETECTION AND CORRECTION USING A PARONYMS DICTIONARY, A SEARCH ENGINE AND WORDNET

Valentin Cojocaru, Costin-Gabriel Chiru, Stefan Trausan-Matu, Traian Rebedea

Abstract

This paper presents a method for automatically detecting and correcting malapropisms in documents using the WordNet lexical database, a search engine (Google) and a paronyms dictionary. Detection is based on evaluating the cohesion of the local context with the search engine, while correction relies on the cohesion of the whole text, evaluated in terms of lexical chains built using the linguistic ontology. The correction candidates, taken from the paronyms dictionary, are evaluated against both the local and the whole-text cohesion, and the best candidate is chosen as the replacement. The methods used to test the application are presented, along with the results obtained.
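As a rough illustration of the pipeline outlined in the abstract, the following Python sketch flags a word when its local context has few hits, then ranks its paronyms by whole-text relatedness and local hit count. It is a minimal sketch under stated assumptions, not the authors' implementation: the `PARONYMS`, `HIT_COUNTS` and `RELATED` tables are hypothetical in-memory stand-ins for the paronyms dictionary, the search engine and WordNet, and the `min_hits` threshold, the trigram window and the scoring order are illustrative choices that the abstract does not specify.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical toy data. A real system would consult a paronyms dictionary,
# a web search engine and WordNet instead of these in-memory tables.
PARONYMS: Dict[str, List[str]] = {
    "pineapple": ["pinnacle"],
    "pinnacle": ["pineapple"],
}
HIT_COUNTS: Dict[str, int] = {
    "the pineapple of": 3,
    "the pinnacle of": 900,
}
RELATED: Set[FrozenSet[str]] = {
    frozenset({"pinnacle", "career"}),
}


def hits(phrase: str) -> int:
    """Stubbed search-engine hit count for an exact phrase."""
    return HIT_COUNTS.get(phrase, 0)


def related(a: str, b: str) -> bool:
    """Stubbed relatedness test (stand-in for lexical-chain membership)."""
    return frozenset({a, b}) in RELATED


def local_cohesion(tokens: List[str], i: int, word: str) -> int:
    """Hit count of the window around position i with `word` substituted."""
    left = tokens[i - 1] if i > 0 else ""
    right = tokens[i + 1] if i + 1 < len(tokens) else ""
    return hits(" ".join(t for t in (left, word, right) if t))


def chain_support(tokens: List[str], i: int, word: str) -> int:
    """Number of other tokens in the text that are related to `word`."""
    return sum(related(word, t) for j, t in enumerate(tokens) if j != i)


def correct(tokens: List[str], min_hits: int = 10) -> List[str]:
    """Replace suspected malapropisms with their best-scoring paronym."""
    out = list(tokens)
    for i, word in enumerate(tokens):
        candidates = PARONYMS.get(word, [])
        # Detection: skip words without paronyms or with a cohesive context.
        if not candidates or local_cohesion(tokens, i, word) >= min_hits:
            continue

        def score(w: str) -> Tuple[int, int]:
            # Whole-text support first, local hit count as tie-breaker.
            return (chain_support(tokens, i, w), local_cohesion(tokens, i, w))

        best = max(candidates, key=score)
        # Correction: replace only if the candidate beats the original word.
        if score(best) > score(word):
            out[i] = best
    return out


if __name__ == "__main__":
    print(" ".join(correct("she reached the pineapple of her career".split())))
```

With the toy tables above, "pineapple" is flagged because "the pineapple of" has few hits, and "pinnacle" is preferred because it is related to "career" and fits the local context better. A word whose context is already cohesive is left untouched.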



Paper Citation


in Harvard Style

Cojocaru V., Chiru C., Trausan-Matu S. and Rebedea T. (2010). MALAPROPISMS DETECTION AND CORRECTION USING A PARONYMS DICTIONARY, A SEARCH ENGINE AND WORDNET. In Proceedings of the 5th International Conference on Software and Data Technologies - Volume 2: ICSOFT, ISBN 978-989-8425-23-2, pages 364-373. DOI: 10.5220/0002931803640373


in Bibtex Style

@conference{icsoft10,
author={Valentin Cojocaru and Costin-Gabriel Chiru and Stefan Trausan-Matu and Traian Rebedea},
title={MALAPROPISMS DETECTION AND CORRECTION USING A PARONYMS DICTIONARY, A SEARCH ENGINE AND WORDNET},
booktitle={Proceedings of the 5th International Conference on Software and Data Technologies - Volume 2: ICSOFT},
year={2010},
pages={364-373},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002931803640373},
isbn={978-989-8425-23-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 5th International Conference on Software and Data Technologies - Volume 2: ICSOFT
TI - MALAPROPISMS DETECTION AND CORRECTION USING A PARONYMS DICTIONARY, A SEARCH ENGINE AND WORDNET
SN - 978-989-8425-23-2
AU - Cojocaru V.
AU - Chiru C.
AU - Trausan-Matu S.
AU - Rebedea T.
PY - 2010
SP - 364
EP - 373
DO - 10.5220/0002931803640373