from geographical data on the media. In 17th East-
European Conference on Advances in Databases and
Information Systems (ADBIS 2013), Genoa (Italy),
volume 8133, pages 56–69. Springer.
Garrido, A. L., Gomez, O., Ilarri, S., and Mena, E. (2011).
NASS: News Annotation Semantic System. In 23rd
IEEE International Conference on Tools with Artifi-
cial Intelligence (ICTAI 2011), Boca Raton, Florida
(USA), pages 904–905. IEEE Computer Society.
Garrido, A. L., Gomez, O., Ilarri, S., and Mena, E. (2012).
An experience developing a semantic annotation sys-
tem in a media group. In Proceedings of the 17th
International Conference on Applications of Natural
Language Processing and Information Systems, pages
333–338. Springer.
Gilchrist, A. (2003). Thesauri, taxonomies and ontologies
- an etymological note. Journal of Documentation,
59(1):7–18.
Goodchild, M. F. and Hill, L. (2008). Introduction to dig-
ital gazetteer research. International Journal of Geo-
graphical Information Science, 22(10):1039–1044.
Gruber, T. R. (1993). A translation approach to
portable ontology specifications. Knowledge Acqui-
sition, 5(2):199–220.
Hovy, E., Marcus, M., Palmer, M., Ramshaw, L., and
Weischedel, R. (2006). OntoNotes: The 90% solu-
tion. In Human Language Technology Conference of
the NAACL, Companion Volume: Short Papers, pages
57–60. Association for Computational Linguistics.
Joachims, T. (1998). Text categorization with support vec-
tor machines: Learning with many relevant features.
In Tenth European Conference on Machine Learning
(ECML’98), pages 137–142. Springer.
Joachims, T. (2004). SVM Light Version: 6.01.
http://svmlight.joachims.org/.
Lee, S. O. K. and Chun, A. H. W. (2007). Automatic
tag recommendation for the web 2.0 blogosphere us-
ing collaborative tagging and hybrid and semantic
structures. Sixth Conference on WSEAS International
Conference on Applied Computer Science (ACOS’07),
World Scientific and Engineering Academy and Soci-
ety (WSEAS), 7:88–93.
Leopold, E. and Kindermann, J. (2002). Text categorization
with support vector machines. How to represent texts
in input space? Machine Learning, 46:423–444.
Li, H., Srihari, R. K., Niu, C., and Li, W. (2002). Location
normalization for information extraction. In Proceed-
ings of the 19th international conference on Compu-
tational linguistics-Volume 1, pages 1–7. Association
for Computational Linguistics.
Manning, C. D., Raghavan, P., and Schütze, H. (2008).
Introduction to Information Retrieval, volume 1.
Cambridge University Press.
Maynard, D., Peters, W., and Li, Y. (2006). Metrics for
evaluation of ontology-based information extraction.
In Workshop on Evaluation of Ontologies for the Web
(EON) at the International World Wide Web Confer-
ence (WWW’06).
McGuinness, D. L., Van Harmelen, F., et al. (2004). OWL
web ontology language overview. W3C recommenda-
tion 10 February 2004.
Miller, G. A. (1995). WordNet: a lexical database for
English. Communications of the ACM, 38(11):39–41.
Mishra, R. B. and Kumar, S. (2011). Semantic web rea-
soners and languages. Artificial Intelligence Review,
35(4):339–368.
Navigli, R. (2009). Word sense disambiguation: A survey.
ACM Computing Surveys, 41(2):10:1–10:69.
Quercini, G., Samet, H., Sankaranarayanan, J., and Lieber-
man, M. D. (2010). Determining the spatial reader
scopes of news sources using local lexicons. In Pro-
ceedings of the 18th SIGSPATIAL International Con-
ference on Advances in Geographic Information Sys-
tems, pages 43–52. ACM.
Rauch, E., Bukatin, M., and Baker, K. (2003). A
confidence-based framework for disambiguating ge-
ographic terms. In HLT-NAACL 2003 Workshop on
Analysis of Geographic References, vol. 1, pages 50–
54. Association for Computational Linguistics.
Resnik, P. (1999). Disambiguating noun groupings with
respect to WordNet senses. In Natural Language
Processing Using Very Large Corpora, pages 77–98.
Springer.
Salton, G. and Buckley, C. (1988). Term-weighting ap-
proaches in automatic text retrieval. Information Pro-
cessing and Management, 24(5):513–523.
Scharkow, M. (2013). Thematic content analysis using su-
pervised machine learning: An empirical evaluation
using German online news. Quality and Quantity,
47(2):761–773.
Sebastiani, F. (2002). Machine learning in automated text
categorization. ACM Computing Surveys, 34(1):1–47.
Sekine, S. and Ranchhod, E. (2009). Named Entities:
Recognition, Classification and Use. John Benjamins.
Shen, D., Sun, J.-T., Yang, Q., and Chen, Z. (2006). A
comparison of implicit and explicit links for web page
classification. In Proceedings of the 15th Interna-
tional Conference on World Wide Web, WWW ’06,
pages 643–650. ACM.
Silveira, S. B. and Branco, A. (2012). Extracting multi-
document summaries with a double clustering ap-
proach. In Natural Language Processing and Infor-
mation Systems, pages 70–81. Springer.
Siolas, G. and d’Alché Buc, F. (2000). Support vector
machines based on a semantic kernel for text
categorization. In IEEE-INNS-ENNS International Joint
Conference on Neural Networks (IJCNN 2000), volume 5,
pages 205–209. IEEE.
Smeaton, A. F. (1999). Using NLP or NLP Resources for
Information Retrieval Tasks. Natural Language Infor-
mation Retrieval. Kluwer Academic Publishers.
Trillo, R., Gracia, J., Espinoza, M., and Mena, E. (2007).
Discovering the semantics of user keywords. Journal
of Universal Computer Science, 13(12):1908–1935.
Vossen, P. (1998). EuroWordNet: A multilingual database
with lexical semantic networks. Kluwer Academic
Boston.
Wilbur, W. J. and Sirotkin, K. (1992). The automatic iden-
tification of stop words. Journal of Information Sci-
ence, 18(1):45–55.