differ only in some details, e.g., a threshold or some
additional constraints. Thus, our implementation
is based on modern software engineering methods,
namely Aspect-oriented Programming (AOP) (Kiczales
et al., 1997) and mixin layers (Smaragdakis and
Batory, 2002). Furthermore, we discuss their
advantages and disadvantages in terms of reusability,
configurability and extensibility, especially for the
domain of DHTs. We evaluate our protocol
experimentally on top of a CAN implementation. The
experiments compare different dissemination
strategies by way of example. Our results verify that
the software engineering approach makes it possible
to build variants of the protocol which meet the
requirements of many different applications.
The remainder of this paper is structured as follows:
After investigating the characteristics of meta-data
in Section 2, we describe our piggyback meta-data
propagation protocol in Section 3. Section 4
presents our implementation based on mixin layers
and Aspect-oriented Programming, and Section 5
features an experimental evaluation. Section 6
concludes.
2 CHARACTERISTICS OF META-DATA IN DHTs
DHTs provide the data management operations of a
hash table, in particular put and get. Each operation
is addressed by a key. Each peer is responsible for a
zone of the key space and knows some other peers,
its contacts. Variants of DHTs differ primarily in the
topology of the key space and in the contact
selection. Each peer forwards any operation it cannot
perform on its own zone to a contact whose zone is
closer to the key. The way a node determines a closer
peer depends on the topology of the key space. DHTs
similar to Chord (Stoica et al., 2001) or CAN
(Ratnasamy et al., 2001a) organize the key space as a
circular space in one or multiple dimensions, which
is distributed among all nodes. Here, a peer forwards
a message depending on the Euclidean distance of the
zones to the key of the operation. In contrast, DHTs
based on search trees like P-Grid (Aberer, 2001) map
each node to a certain subtree-ID, and query
forwarding follows subtrees with corresponding
ID-prefixes. Usually, a DHT operation involves on the
order of log(n) peers, where n is the number of peers.
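The forwarding rule described above can be illustrated with a minimal sketch. It assumes a one-dimensional circular key space with equally sized zones, and it gives each peer only its two ring neighbours as contacts. The names Peer, route, build_ring and KEY_SPACE are hypothetical and not taken from any DHT implementation. With such a sparse contact set the number of hops grows linearly; the DHTs cited above select additional contacts to shorten the paths.

```python
from dataclasses import dataclass, field

KEY_SPACE = 2 ** 16  # size of the hypothetical circular key space


def ring_distance(a: int, b: int) -> int:
    """Shortest distance between two points on the circle."""
    d = abs(a - b) % KEY_SPACE
    return min(d, KEY_SPACE - d)


@dataclass
class Peer:
    name: str
    zone_start: int  # inclusive
    zone_end: int    # exclusive; zones do not wrap around in this sketch
    contacts: list["Peer"] = field(default_factory=list, repr=False, compare=False)

    def owns(self, key: int) -> bool:
        return self.zone_start <= key < self.zone_end

    def distance_to(self, key: int) -> int:
        """Distance from this peer's zone to the key (0 if the key is inside)."""
        if self.owns(key):
            return 0
        return min(ring_distance(self.zone_start, key),
                   ring_distance(self.zone_end - 1, key))


def route(start: Peer, key: int) -> list[str]:
    """Forward greedily towards the key; return the names of the visited peers."""
    path = [start.name]
    current = start
    while not current.owns(key):
        # Pick the contact whose zone is closest to the key.
        nxt = min(current.contacts, key=lambda p: p.distance_to(key))
        if nxt.distance_to(key) >= current.distance_to(key):
            raise RuntimeError("no contact is closer to the key")
        current = nxt
        path.append(current.name)
    return path


def build_ring(n: int) -> list[Peer]:
    """Create n equally sized zones; each peer knows its two ring neighbours."""
    width = KEY_SPACE // n
    peers = [Peer(f"p{i}", i * width, (i + 1) * width) for i in range(n)]
    for i, p in enumerate(peers):
        p.contacts = [peers[(i - 1) % n], peers[(i + 1) % n]]
    return peers
```

For example, with peers = build_ring(8), the call route(peers[0], 3 * 8192 + 5) visits p0, p1, p2 and p3, because each hop hands the operation to the neighbour whose zone lies closer to the key.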
The absence of global knowledge makes it challenging
to design DHT applications. There are two
straightforward ways to bypass the problem: (1)
setting up all necessary information at startup time,
and (2) disseminating information by flooding the
DHT. But while (1) limits the range of applications,
(2) limits the performance of the DHT. Before
presenting our dissemination scheme, we investigate
common characteristics¹ of meta-data items used in
DHTs. Meta-data are used on three levels:
Meta-data of the DHT itself. DHTs forward messages
between peers responsible for certain zones.
Therefore, each peer needs to know a proper set of
contacts, which changes whenever peers join or leave.
Thus, status information regarding zone boundaries
and contact information has to be updated. The
source of information is the new peer or the peer that
takes over the zone of a retiring peer. The addressees
can be neighbors of that peer (CAN, Chord), nodes
which follow a certain distribution (cf. Kleinberg,
2000) or other peers (Chord, P-Grid). Status updates
need to reach the addressees in time, but they are
generated infrequently and irregularly.
Data related to enhancements of the DHT. There
are many different enhancements to the standard
DHTs which require different meta-data. This
information is not needed in time, and has various
creators and receivers. For example, load balancing
(Rao et al., 2003) may take place in DHTs by
forwarding messages along idle paths, reorganizing
the key space or changing the number of replicas of
under- or overloaded zones. Here, each peer requires
information about the average load (which may be
rather old) and more recent data about the load of all
of its contacts. Load information is disseminated at a
steady rate.
Application-specific meta-data. The application
on top of the DHT may use the meta-data
distribution scheme as well. For example, in a
distributed web crawler application the peers notify
others about the partitions of the WWW which they
have already crawled. Further examples are
keep-alive messages in distributed groupware
scenarios.
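The three levels differ in who creates a meta-data item, who needs it, whether it must arrive in time, and whether it is generated at a steady rate. As a rough illustration, these properties could be recorded per kind of item as sketched below. The class and field names are hypothetical, the two profiles only restate the characteristics given in the text for status updates and load information, and the deferrable helper is an assumption about how a dissemination protocol might use them, not part of the protocol described in this paper.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    DHT = "meta-data of the DHT itself"
    ENHANCEMENT = "data related to enhancements of the DHT"
    APPLICATION = "application-specific meta-data"


@dataclass(frozen=True)
class MetaDataProfile:
    """Hypothetical record of the dissemination requirements of one item kind."""
    level: Level
    origin: str     # who creates the item
    addressee: str  # who needs to receive it
    timely: bool    # must the item reach the addressee in time?
    regular: bool   # is the item generated at a steady rate?


# Status updates: needed in time, but generated infrequently and irregularly.
STATUS_UPDATE = MetaDataProfile(
    level=Level.DHT,
    origin="new peer or peer taking over a zone",
    addressee="neighbors or other contacts",
    timely=True,
    regular=False,
)

# Load information: not needed in time, disseminated at a steady rate.
LOAD_REPORT = MetaDataProfile(
    level=Level.ENHANCEMENT,
    origin="various peers",
    addressee="various peers, e.g. the contacts of a peer",
    timely=False,
    regular=True,
)


def deferrable(profiles: list[MetaDataProfile]) -> list[MetaDataProfile]:
    """Return the item kinds that tolerate delayed delivery (an assumed policy)."""
    return [p for p in profiles if not p.timely]
```

Calling deferrable([STATUS_UPDATE, LOAD_REPORT]) returns only the load report, which reflects the statement that load information, unlike status updates, is not needed in time.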
Summing up, nearly every application comes with
different requirements for a meta-data dissemination
protocol. The differences concern the origin of the
data item, the addressee and the frequency at which
data items are generated. Some applications require
meta-data in time while others do not. This leads to
two important insights: (1) There is a strong demand
for a cheap protocol that satisfies the needs of many
different applications and works at lower cost than
the traditional flooding protocols. By cost we mean
the amount of network traffic, execution time and
memory consumption. (2) The number of required
protocol variants is very large. However, the variants
¹ Each application comes with its very own set of
requirements. Thus, we cannot come up with a
comprehensive list of features.
PIGGYBACK META-DATA PROPAGATION IN DISTRIBUTED HASH TABLES