The labels are going to be introduced into the Polish Inflection Dictionary. Once this
process is finished, it should be possible to assess the performance of the expanded
dictionary. If the dictionary is also connected to SSJP, it should become possible to process
Polish text using rich semantic information for the most common words and the labels for
less frequently used words and proper names. For example, given the sentence “Błąd
pilota cessny był główną przyczyną katastrofy w Balicach” (The Cessna pilot's error was the
main cause of the disaster in Balice), a text processing algorithm could use the semantic
labels to determine that “cessna” is a plane and “Balice” is an airport. It could then use
SSJP to find the relations between plane, airport and disaster, and finally decide that the
sentence contains information about a plane crash. This kind of processing looks very
promising and motivates further research on this topic.
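The processing steps in this example can be illustrated with a minimal sketch. It is not the planned implementation: all tables below are hypothetical toy data standing in for the Polish Inflection Dictionary (lemmas), the semantic labels, and SSJP (concepts of common words and relations between concepts), and the function names are chosen only for illustration.

```python
# Toy stand-in for the inflection dictionary: inflected form -> lemma.
# (Hypothetical data, not taken from the actual dictionary.)
LEMMAS = {
    "cessny": "cessna",
    "balicach": "balice",
    "katastrofy": "katastrofa",
}

# Toy semantic labels for rare words and proper names (hypothetical).
SEMANTIC_LABELS = {
    "cessna": "plane",
    "balice": "airport",
}

# Toy stand-in for SSJP: concepts of common words and a relation that
# links a set of concepts to an event type (hypothetical).
COMMON_WORD_CONCEPTS = {"katastrofa": "disaster"}
RELATED_EVENTS = {
    frozenset({"plane", "airport", "disaster"}): "plane crash",
}


def concepts_in(sentence):
    """Collect the concepts recognised in a sentence."""
    concepts = set()
    for token in sentence.lower().split():
        token = token.strip(".,;:!?\"'")
        lemma = LEMMAS.get(token, token)
        # Prefer the rich semantic entry, fall back to the label.
        concept = COMMON_WORD_CONCEPTS.get(lemma) or SEMANTIC_LABELS.get(lemma)
        if concept:
            concepts.add(concept)
    return concepts


def classify(sentence):
    """Return the event whose required concepts all occur, or None."""
    found = concepts_in(sentence)
    for required, event in RELATED_EVENTS.items():
        if required <= found:
            return event
    return None


print(classify("Błąd pilota cessny był główną przyczyną katastrofy w Balicach"))
```

With these toy tables, the example sentence yields the concepts plane, airport and disaster, so it is classified as a plane crash. A sentence that mentions only the plane and the airport matches no relation and returns `None`.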
References
1. De Vel, O., Anderson, A., Corney, M., and Mohay, G. (2001). Mining e-mail content for
author identification forensics. SIGMOD Rec., 30(4):55–64.
2. Edmonds, P. and Kilgarriff, A. (2002). Introduction to the special issue on evaluating word
sense disambiguation systems. Nat. Lang. Eng., 8(4):279–291.
3. Fellbaum, C., editor (1998). WordNet: an electronic lexical database. MIT Press.
4. Gajęcki, M. (2009). Słownik fleksyjny jako biblioteka języka C. In Słowniki komputerowe
i automatyczna ekstrakcja informacji z tekstu. Wydawnictwa AGH, Kraków.
5. Kazama, J. and Torisawa, K. (2007). Exploiting Wikipedia as external knowledge for named
entity recognition. In EMNLP-CoNLL, pages 698–707. ACL.
6. Kuta, M., Chrząszcz, P., and Kitowski, J. (2007). A case study of algorithms for
morphosyntactic tagging of Polish language. Computing and Informatics, 26(6):627–647.
7. Kuta, M., Kitowski, J., Wójcik, W., and Wrzeszcz, M. (2010). Application of weighted
voting taggers to languages described with large tagsets. Computing and Informatics,
29(2):203–225.
8. Lubaszewski, W., Wróbel, H., Gajęcki, M., Moskal, B., Orzechowska, A., Pietras, P., Pisarek,
P., and Rokicka, T. (2001). Słownik Fleksyjny Języka Polskiego. Lexis Nexis, Kraków.
9. Medelyan, O., Milne, D., Legg, C., and Witten, I. H. (2009). Mining meaning from
Wikipedia. Int. J. Hum.-Comput. Stud., 67(9):716–754.
10. Milne, D. and Witten, I. H. (2008). Learning to link with Wikipedia. In Proceedings of
the 17th ACM conference on Information and knowledge management, CIKM ’08, pages
509–518, New York, NY, USA. ACM.
11. Pietras, P. (2009). Ekstrakcja leksykalna. In Słowniki komputerowe i automatyczna
ekstrakcja informacji z tekstu. Wydawnictwa AGH, Kraków.
12. Pohl, A. (2009). Słownik semantyczny języka polskiego. In Słowniki komputerowe
i automatyczna ekstrakcja informacji z tekstu. Wydawnictwa AGH, Kraków.
13. Suchanek, F. M., Kasneci, G., and Weikum, G. (2008). YAGO: A large ontology from
Wikipedia and WordNet. Web Semant., 6(3):203–217.
14. Toral, A. and Muñoz, R. (2006). A proposal to automatically build and maintain gazetteers
for named entity recognition by using Wikipedia. In NEW TEXT - Wikis and blogs and
other dynamic text sources, Trento.
15. Voorhees, E. M. (1999). Natural language processing and information retrieval. In Informa-
tion Extraction: Towards Scalable, Adaptable Systems, pages 32–48. Springer, New York.
16. Woliński, M. (2006). Morfeusz - a practical tool for the morphological analysis of Polish.
Advances in Soft Computing, 26(6):503–512.