Classification and Indexing of Web Content Based on a Model of Semantic Social Bookmarking

Antonello Angius, Giulio Concas, Dino Manca, Filippo Eros Pani, Georgia Sanna

2014

Abstract

One of the key challenges in Information Technology is organizing the knowledge present on the Web. This challenge has driven years of research on information integration, the Semantic Web, and related technologies. Information search and retrieval on the Web relies on content disambiguation: search engines use algorithms and software agents to meet the needs of users and advertising buyers. Ex-post analytical agents and tools are becoming so pervasive that they raise growing privacy problems. Our work proposes an approach to content disambiguation that reverses the ex-post semantic analysis of contents: classification is performed ex ante, along two axes, a vertical one (hierarchical and taxonomic) and a horizontal one (folksonomic, through tags or keywords). This method, which is based on the logic of social bookmarking and focuses on semantic tagging, represents a new direction in information architecture because it introduces a way of classifying content in which people assign keywords that carry a specific lexical and semantic value. This approach will allow people to build a knowledge base of Web contents characterized by a precise semantic definition.
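
The two-axis classification described in the abstract can be pictured with a small data model. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the names `SemanticTag`, `Bookmark`, and `search`, the sense identifiers such as `jaguar#animal`, and the example URLs are all hypothetical. It shows a bookmark carrying one taxonomy path (vertical axis) and a set of sense-bound tags (horizontal axis), so that the same keyword can be told apart by its meaning.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class SemanticTag:
    """A keyword bound to one specific sense (hypothetical structure)."""
    lemma: str      # surface keyword typed by the user, e.g. "jaguar"
    sense_id: str   # made-up sense identifier, e.g. "jaguar#animal"


@dataclass
class Bookmark:
    url: str
    # Vertical axis: root-to-leaf path in a taxonomy (assumed to be a tuple).
    category_path: Tuple[str, ...]
    # Horizontal axis: folksonomic tags, each tied to a single sense.
    tags: FrozenSet[SemanticTag] = field(default_factory=frozenset)


def search(
    bookmarks: Iterable[Bookmark],
    category_prefix: Tuple[str, ...] = (),
    sense_id: Optional[str] = None,
) -> List[Bookmark]:
    """Return bookmarks under a taxonomy prefix that carry the given sense."""
    results = []
    for b in bookmarks:
        # Vertical filter: the bookmark must sit below the requested category.
        if b.category_path[:len(category_prefix)] != tuple(category_prefix):
            continue
        # Horizontal filter: at least one tag must have the requested sense.
        if sense_id is not None and all(t.sense_id != sense_id for t in b.tags):
            continue
        results.append(b)
    return results


# Illustrative data: the same lemma "jaguar" with two different senses.
cat_page = Bookmark(
    "https://example.org/jaguar-cat",
    ("Science", "Zoology"),
    frozenset({SemanticTag("jaguar", "jaguar#animal")}),
)
car_page = Bookmark(
    "https://example.org/jaguar-car",
    ("Technology", "Automotive"),
    frozenset({SemanticTag("jaguar", "jaguar#car")}),
)
bookmarks = [cat_page, car_page]
```

Because each tag is stored with its sense at bookmarking time, a query for `jaguar#animal` needs no after-the-fact analysis of page text; that is the ex-ante idea the abstract contrasts with ex-post disambiguation.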


Paper Citation


in Harvard Style

Angius A., Concas G., Manca D., Pani F. E. and Sanna G. (2014). Classification and Indexing of Web Content Based on a Model of Semantic Social Bookmarking. In Proceedings of the International Conference on Knowledge Management and Information Sharing - Volume 1: KMIS, (IC3K 2014) ISBN 978-989-758-050-5, pages 313-318. DOI: 10.5220/0005139303130318


in Bibtex Style

@conference{kmis14,
author={Antonello Angius and Giulio Concas and Dino Manca and Filippo Eros Pani and Georgia Sanna},
title={Classification and Indexing of Web Content Based on a Model of Semantic Social Bookmarking},
booktitle={Proceedings of the International Conference on Knowledge Management and Information Sharing - Volume 1: KMIS, (IC3K 2014)},
year={2014},
pages={313-318},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005139303130318},
isbn={978-989-758-050-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Management and Information Sharing - Volume 1: KMIS, (IC3K 2014)
TI - Classification and Indexing of Web Content Based on a Model of Semantic Social Bookmarking
SN - 978-989-758-050-5
AU - Angius A.
AU - Concas G.
AU - Manca D.
AU - Pani F. E.
AU - Sanna G.
PY - 2014
SP - 313
EP - 318
DO - 10.5220/0005139303130318