DISTRIBUTED BLOOM FILTER FOR LOCATING XML TEXTUAL RESOURCES IN A P2P NETWORK

Clément Jamard, Laurent Yeh, Georges Gardarin

2007

Abstract

P2P information systems are now regarded as large-scale distributed databases in which every peer can provide and query data. The main challenge remains locating relevant resources. For XML documents, both keywords and structure must be indexed. However, the major cost of maintaining indexes over large textual XML documents arises when peers connect and disconnect: indexing a large number of keys requires many messages to transit the network. To reduce this cost, we adapt the Bloom Filter principle to summarize peer content. Our Bloom Filter summarizes both the structure and the values of XML documents and is used to locate resources in a P2P network. Our original contribution is a set of techniques for distributing the Bloom Filter by splitting it into segments across a DHT network. The system is scalable and drastically reduces the number of network messages needed for indexing data, maintaining the index, and locating resources.
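
The idea of summarizing a peer's content in a Bloom Filter and splitting that filter into segments can be illustrated with a small sketch. The abstract does not give the paper's actual key format, hash functions, segment size, or DHT protocol, so everything below is an assumption made for illustration: the filter parameters `M`, `K`, and `SEGMENT_BITS`, the "path:keyword" key strings, the helper names, and the plain dictionary standing in for a DHT are all hypothetical.

```python
import hashlib

# All parameters are illustrative assumptions, not values from the paper.
M = 1024            # total filter size in bits
K = 3               # number of hash functions
SEGMENT_BITS = 256  # bits per segment; M must be a multiple of this


def positions(key):
    """Derive K bit positions for a key by salting SHA-256 with the hash index."""
    result = []
    for i in range(K):
        digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
        result.append(int.from_bytes(digest[:8], "big") % M)
    return result


def build_filter(keys):
    """Build one peer's Bloom Filter as a list of M booleans."""
    bits = [False] * M
    for key in keys:
        for p in positions(key):
            bits[p] = True
    return bits


def publish(dht, peer_id, keys):
    """Split the peer's filter into segments and store each under its segment index.

    `dht` is a plain dict standing in for a real DHT: in a real network each
    segment index would be hashed to the peer responsible for it.
    """
    bits = build_filter(keys)
    for s in range(M // SEGMENT_BITS):
        segment = bits[s * SEGMENT_BITS:(s + 1) * SEGMENT_BITS]
        dht.setdefault(s, {})[peer_id] = segment


def locate(dht, key):
    """Return peers whose filter has every bit for `key` set (may include false positives)."""
    candidates = None
    for p in positions(key):
        s, offset = divmod(p, SEGMENT_BITS)
        hits = {peer for peer, seg in dht.get(s, {}).items() if seg[offset]}
        candidates = hits if candidates is None else candidates & hits
    return candidates or set()


# Hypothetical "path:keyword" keys combining XML structure and text value.
dht = {}
publish(dht, "peerA", ["/article/title:bloom", "/article/author:yeh"])
publish(dht, "peerB", ["/book/title:xml"])
print(locate(dht, "/article/title:bloom"))
```

In this sketch a lookup touches at most `K` segments instead of contacting every peer, and a peer joining or leaving only writes or removes one entry per segment rather than one per indexed key. A Bloom Filter never produces false negatives, so a peer that published a key is always among the candidates, but other peers may occasionally appear as false positives.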



Paper Citation


in Harvard Style

Jamard C., Yeh L. and Gardarin G. (2007). DISTRIBUTED BLOOM FILTER FOR LOCATING XML TEXTUAL RESOURCES IN A P2P NETWORK. In Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-972-8865-77-1, pages 261-266. DOI: 10.5220/0001286002610266


in Bibtex Style

@conference{webist07,
author={Clément Jamard and Laurent Yeh and Georges Gardarin},
title={DISTRIBUTED BLOOM FILTER FOR LOCATING XML TEXTUAL RESOURCES IN A P2P NETWORK},
booktitle={Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2007},
pages={261-266},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001286002610266},
isbn={978-972-8865-77-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - DISTRIBUTED BLOOM FILTER FOR LOCATING XML TEXTUAL RESOURCES IN A P2P NETWORK
SN - 978-972-8865-77-1
AU - Jamard C.
AU - Yeh L.
AU - Gardarin G.
PY - 2007
SP - 261
EP - 266
DO - 10.5220/0001286002610266