Semantic XML Filtering on Peer-to-Peer Networks using Distributed Bloom Filters

Panagiotis Antonellis, Stavros Kontopoulos, Christos Makris, Yannis Plegas, Nikos Tsirakis

Abstract

Information filtering systems are a critical component of modern information-seeking applications. As the number of users and the volume of available information grow, scalable and efficient representation and filtering techniques become imperative. When documents are represented in XML, user profiles are typically expressed in the XPath query language, and efficient heuristic techniques are employed to constrain the complexity of the filtering mechanism. However, as the number of XML documents exchanged daily grows rapidly, distributed management is becoming vital. In this paper we introduce Distributed Bloom Filters and propose a new distributed XML filtering system for peer-to-peer (P2P) networks. The major advantages of Distributed Bloom Filters over the classical structure are their space efficiency and improved performance. The proposed system efficiently filters incoming XML documents using a virtual index created on top of the network. In addition, it supports semantic disambiguation of both the stored user profiles and the XML documents, thus providing better matching results.
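For readers unfamiliar with the classical structure that the abstract compares against, the following is a minimal sketch of an ordinary, single-node Bloom filter holding XPath profile strings. It is not the authors' Distributed Bloom Filter, whose design the abstract does not describe; the class name, the bit-array size, the number of hash functions, the SHA-256-based hashing, and the example XPath expressions are all assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Classical Bloom filter: a bit array probed by k hash functions.

    Illustrative only; sizes and hashing scheme are assumptions.
    """

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions by salting one hash with the probe index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        # Set every bit selected by the k hash functions.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Hypothetical user profiles expressed as XPath strings.
profiles = BloomFilter()
profiles.add("/catalog/book/title")
profiles.add("/catalog/book/author")
```

A membership test such as `"/catalog/book/title" in profiles` then answers "possibly stored" or "definitely not stored" using only the fixed-size bit array, which is the space saving that makes the structure attractive for indexing profiles.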



Paper Citation


in Harvard Style

Antonellis P., Kontopoulos S., Makris C., Plegas Y. and Tsirakis N. (2013). Semantic XML Filtering on Peer-to-Peer Networks using Distributed Bloom Filters. In Proceedings of the 9th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-989-8565-54-9, pages 133-136. DOI: 10.5220/0004363301330136


in Bibtex Style

@conference{webist13,
author={Panagiotis Antonellis and Stavros Kontopoulos and Christos Makris and Yannis Plegas and Nikos Tsirakis},
title={Semantic XML Filtering on Peer-to-Peer Networks using Distributed Bloom Filters},
booktitle={Proceedings of the 9th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2013},
pages={133-136},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004363301330136},
isbn={978-989-8565-54-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 9th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - Semantic XML Filtering on Peer-to-Peer Networks using Distributed Bloom Filters
SN - 978-989-8565-54-9
AU - Antonellis P.
AU - Kontopoulos S.
AU - Makris C.
AU - Plegas Y.
AU - Tsirakis N.
PY - 2013
SP - 133
EP - 136
DO - 10.5220/0004363301330136