full-fledged POS patterns
• to apply constraint rules through direct lexical (word-string) matching of an n-gram
component against position-referenced lexicons (avoiding tagging and syntactic parsing)
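As an illustration of the last point, the following is a minimal sketch of filtering candidate n-grams by direct word-string match against position-referenced lexicons, with no tagging or parsing. It is not the implementation evaluated in this work: the lexicon contents, the name FORBIDDEN, and the functions ngrams, passes, and extract are assumptions introduced only for this example.

```python
from typing import Dict, List, Sequence, Set, Tuple

# Hypothetical position-referenced lexicons (illustrative content only):
# each key is a position inside a candidate n-gram (0 = first, -1 = last),
# each value is the set of word strings not allowed at that position.
FORBIDDEN: Dict[int, Set[str]] = {
    0: {"the", "a", "an", "of", "in", "and", "is"},
    -1: {"the", "a", "an", "of", "in", "and", "is"},
}


def ngrams(tokens: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams of the token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def passes(ngram: Sequence[str], forbidden: Dict[int, Set[str]]) -> bool:
    """Check each constrained position by plain (lowercased) string match."""
    for position, lexicon in forbidden.items():
        if ngram[position].lower() in lexicon:
            return False
    return True


def extract(tokens: Sequence[str], n: int,
            forbidden: Dict[int, Set[str]] = FORBIDDEN) -> List[Tuple[str, ...]]:
    """Keep only the n-grams that violate no positional constraint."""
    return [g for g in ngrams(tokens, n) if passes(g, forbidden)]


tokens = "the extraction of multiword units is robust".split()
print(extract(tokens, 2))  # [('multiword', 'units')]
```

Because each check is a set lookup on the raw word string, the cost per candidate is constant in the lexicon size, and moving to another language in this sketch only requires supplying different lexicons.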
The methodology features intelligent output and computationally attractive properties.
Overall testing and evaluation showed that the methodology remains robust for
English, French, and Spanish, as well as for Russian. The basic extraction procedure
outputs all inflected forms of the relevant types of lexical units.
A range of applications can benefit from the techniques proposed here, from
knowledge acquisition for cognitive modeling to indexing, monolingual and
multilingual information retrieval and extraction, summarization, machine
translation, and language learning and teaching.