Authors: Clément Jamard; Laurent Yeh and Georges Gardarin
Affiliation: PRiSM Laboratory, University of Versailles, France
Keyword(s): XML, XQuery Text, P2P Network, Database System, Bloom Filter, Indexation.
Related Ontology Subjects/Areas/Topics: Databases and Datawarehouses; Distributed and Parallel Applications; Internet Technology; Network Systems, Proxies and Servers; Web Information Systems and Technologies; XML and Data Management
Abstract:
Nowadays, P2P information systems are regarded as large-scale distributed databases in which every peer can provide and query data. The main challenge remains locating relevant resources. For XML documents, both keywords and structure must be indexed. However, the major problem in maintaining indexes of huge textual XML documents is the cost of connecting and disconnecting peers: indexing a large number of keys requires many messages to transit through the network. To reduce this cost, we adapt the Bloom Filter principle to summarize peer content. Our Bloom Filter summarizes both the structure and the values of XML documents and is used to locate resources in a P2P network. Our original contribution is a set of techniques for distributing the Bloom Filter by splitting it into segments using a DHT network. The system is scalable and drastically reduces the number of network messages needed for indexing data, maintaining the index, and locating resources.
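The general idea of summarizing peer content in a Bloom Filter and spreading that filter over a DHT in segments can be illustrated with a minimal sketch. It is not the authors' implementation: the filter size `M`, the hash count `K`, the segment count `SEGMENTS`, the `path=keyword` key encoding, and the function names are all assumptions made for illustration, and a plain dictionary stands in for the DHT (in a real network each segment would live on the node responsible for that segment's key).

```python
import hashlib

M = 1024                  # filter size in bits (assumed)
K = 3                     # number of hash functions (assumed)
SEGMENTS = 8              # number of segments the filter is split into (assumed)
SEG_BITS = M // SEGMENTS  # bits covered by each segment


def positions(key):
    """Return the K bit positions of a key, derived from salted SHA-1 digests."""
    return [
        int.from_bytes(hashlib.sha1(f"{i}:{key}".encode()).digest()[:8], "big") % M
        for i in range(K)
    ]


def summarize(entries):
    """Build a peer's filter, kept as the set of bit positions that are set to 1.

    Each entry is a (path, keyword) pair, so both structure and value are hashed.
    """
    bits = set()
    for path, word in entries:
        bits.update(positions(f"{path}={word}"))
    return bits


def publish(dht, peer, bits):
    """Register the peer in every segment that holds one of its set bits."""
    for pos in bits:
        segment = pos // SEG_BITS
        dht.setdefault(segment, {}).setdefault(pos, set()).add(peer)


def locate(dht, path, word):
    """Return peers whose filter has all K bits of the key set.

    As with any Bloom Filter, false positives are possible; false negatives are not.
    """
    candidates = None
    for pos in positions(f"{path}={word}"):
        peers = dht.get(pos // SEG_BITS, {}).get(pos, set())
        candidates = set(peers) if candidates is None else candidates & peers
    return candidates or set()


dht = {}  # hypothetical stand-in for the DHT: segment id -> {bit position -> peers}
publish(dht, "peerA", summarize([("/book/title", "xml")]))
publish(dht, "peerB", summarize([("/article/author", "bloom")]))
result = locate(dht, "/book/title", "xml")
```

In this sketch a peer that joins sends at most one message per segment containing one of its set bits, rather than one message per indexed key, and a lookup contacts at most `K` segments. That is the kind of message saving the abstract describes, under the stated assumptions.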