locations in centralised (e.g., Napster), hybrid (e.g.,
Kazaa) and decentralised (e.g., Chord). Centralised
P2P networks are subject to the same drawbacks for
which the traditional server-client model was origi-
nally abandoned (network failures due to central peer
failure, impaired scalability, joining/leaving of peers
not easily handled, possible undesirable dominion of
controllers). For these reasons, we focus on decen-
tralised unstructured P2P networks, which overcome
the aforementioned drawbacks. An unstructured network
was selected because it imposes only loose control over
data location: each peer can share its own documents
without having to host documents of other peers because
of locality constraints.²
Music representation can primarily be separated into
two classes: the symbolic representation (MIDI format)
and the acoustic representation (audio formats such as
wav and mp3). The focus of this work is on acoustic
data, so a musical piece can be considered as a time
series of signal intensity. To measure the similarity
of two musical pieces, we utilise the Dynamic Time
Warping (DTW) method. The main advantage of DTW is its
ability to withstand distortion of the compared series
along the time axis. Accordingly, it allows two time
series that are locally out of phase, but nevertheless
similar, to be aligned in a non-linear manner. Since
different performances of the same musical piece may
differ locally in tempo, DTW seems a natural choice for
this problem (Large and Palmer, 2002). For this reason,
it has recently been proposed for MIR in centralised
environments (Zhu and Shasha, 2003; Mazzoni and
Danneberg, 2001; Jang et al., 2001; Adams et al., 2004).
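To make the non-linear alignment concrete, the following is a minimal sketch of the textbook DTW recurrence. It is not the implementation used in this paper: the function name `dtw_distance`, the absolute-difference local cost and the absence of any warping-window constraint are assumptions made for illustration only.

```python
def dtw_distance(a, b):
    """Textbook DTW between two numeric sequences (illustrative sketch).

    Assumptions: local cost is |x - y| and no warping window is applied.
    Runs in O(len(a) * len(b)) time and space.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells:
            # advance in a only, advance in b only, or advance in both.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


# The second series holds its first value for one extra sample, as a
# performance with a locally slower tempo might.
original = [0, 1, 2, 3, 2, 1, 0]
stretched = [0, 0, 1, 2, 3, 2, 1, 0]
print(dtw_distance(original, stretched))
```

Because the warping path may map one sample of `original` to two samples of `stretched`, the two series align at zero cost even though their lengths differ, which a point-by-point distance could not express.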
In this paper, we focus on the problem of search-
ing, based on DTW, for similar acoustic data over un-
structured decentralised P2P networks. The technical
contributions of this paper are summarised as follows:
• The development of a novel algorithm that
efficiently retrieves audio data similar to an audio
query in a decentralised unstructured P2P network.
• The proposed algorithm takes advantage of the
absence of overhead in unstructured P2P networks
and minimises the traffic required for all
operations by applying an intelligent sampling
scheme to the lower and upper bounds used. The
algorithm is designed so that no false negatives
occur.
• Detailed experimental results that show the
efficiency of the proposed algorithm and its
performance gains over an existing baseline
algorithm.
² We note that the framework examined here refers to
applications that support content sharing for legal
subscribers (e.g., iTunes). Moreover, the proposed approach
could be adopted as a means of identifying illegal sharing,
by finding sites that share unregistered content.
The rest of the paper is organised as follows. Sec-
tion 2 describes related work. Section 3 provides a
complete account of the algorithm proposed in this
paper. Subsequently, Section 4 presents and discusses
the experimentation and results obtained. Finally, the
paper is concluded in Section 5.
2 BACKGROUND AND RELATED WORK
2.1 Searching methods in unstructured P2P networks
In this section we summarise a number of searching
methods for decentralised unstructured P2P networks.
We first examine the Breadth-First Search (BFS)
algorithm. In BFS, a query peer Q propagates the query
q to all its neighbor peers. Each peer P receiving q
first searches its local repository for any documents
matching q and then passes q on to all its neighbors.
If P has a match in its local repository, a QueryMatch
message is created containing information about the
match. The QueryMatch messages are then transmitted
back to Q along the reverse of the path that q
travelled. Finally, if Q has received more than one
QueryMatch message, it can select the peer with the
best connectivity attributes for direct downloading of
the match. Clearly, BFS sacrifices performance and
network traffic for simplicity and high hit rates. In
order to reduce network traffic, the TTL parameter is
used (see Section 2). In a modified version of this
algorithm, the Random BFS (RBFS) (Kalogeraki et al.,
2002), the query peer Q propagates the query q not to
all but only to a fraction of its neighbor peers.
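The propagation just described can be sketched as a small simulation. This is only an illustration under stated assumptions: the function `flood_query`, the dictionary-based overlay and repositories, and the suppression of duplicate queries through a `visited` set are hypothetical and do not come from the cited work. Setting `fraction` below 1.0 mimics RBFS, here applied at every forwarding peer, which is itself an assumption.

```python
import random
from collections import deque


def flood_query(graph, repositories, origin, matches, ttl, fraction=1.0, rng=None):
    """Simulate BFS (fraction=1.0) or RBFS-style (fraction<1.0) propagation.

    graph:        dict mapping each peer to a list of neighbor peers
    repositories: dict mapping each peer to a list of local documents
    matches:      predicate deciding whether a document satisfies the query
    Returns the QueryMatch records and the number of query messages sent.
    """
    rng = rng or random.Random()
    visited = {origin}          # assumption: peers drop duplicate queries
    parent = {origin: None}     # remembers the path the query travelled
    frontier = deque([(origin, ttl)])
    query_matches, messages = [], 0
    while frontier:
        peer, hops_left = frontier.popleft()
        if peer != origin:
            hits = [doc for doc in repositories.get(peer, []) if matches(doc)]
            if hits:
                # The QueryMatch travels back along the reverse query path.
                path, node = [], peer
                while node is not None:
                    path.append(node)
                    node = parent[node]
                query_matches.append(
                    {"peer": peer, "hits": hits, "return_path": path})
        if hops_left == 0:
            continue            # TTL exhausted: do not forward further
        neighbors = list(graph.get(peer, []))
        if fraction < 1.0 and neighbors:
            k = max(1, round(fraction * len(neighbors)))
            neighbors = rng.sample(neighbors, k)
        for nxt in neighbors:
            messages += 1
            if nxt not in visited:
                visited.add(nxt)
                parent[nxt] = peer
                frontier.append((nxt, hops_left - 1))
    return query_matches, messages


overlay = {"Q": ["A", "B"], "A": ["Q", "C"], "B": ["Q", "C"],
           "C": ["A", "B", "D"], "D": ["C"]}
repos = {"C": ["song.mp3"], "D": ["song.mp3"]}
found, sent = flood_query(overlay, repos, "Q", lambda d: d == "song.mp3", ttl=2)
print([m["peer"] for m in found], sent)
```

With `ttl=2` the query reaches C but not D, which shows how the TTL trades hit rate for traffic; raising the TTL or lowering `fraction` moves along the same trade-off in opposite directions.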
In an attempt to rectify the inability of the RBFS to
select a path of the network leading to large network
segments, the >RES algorithm was developed (Yang
and Garcia-Molina, 2002). In this approach, a node Q
propagates q to the k neighboring peers that returned
the most results during the last m queries, with k and
m being configurable parameters. >RES can be
characterised as quantitative rather than qualitative,
since it does not consider the content of the query.
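The selection rule of >RES can be illustrated with a short sketch. The class `ResNeighborSelector` and its methods are hypothetical names, and keeping a per-neighbor window of the last m recorded result counts is an assumed simplification of "the last m queries"; the cited work may keep its statistics differently.

```python
from collections import defaultdict, deque


class ResNeighborSelector:
    """Pick the k neighbors that returned the most results recently.

    Assumption: each neighbor keeps its own sliding window holding the
    result counts of the last m queries recorded for it.
    """

    def __init__(self, k, m):
        self.k = k
        self.history = defaultdict(lambda: deque(maxlen=m))

    def record(self, neighbor, num_results):
        # Older counts fall out of the window automatically (maxlen=m).
        self.history[neighbor].append(num_results)

    def select(self, neighbors):
        # Rank purely by result volume; the query content is never inspected.
        ranked = sorted(neighbors,
                        key=lambda n: sum(self.history[n]),
                        reverse=True)
        return ranked[:self.k]


selector = ResNeighborSelector(k=2, m=3)
for peer, count in [("A", 1), ("A", 1), ("B", 4), ("C", 0)]:
    selector.record(peer, count)
print(selector.select(["A", "B", "C"]))
```

The ranking depends only on how many results a neighbor returned, not on what was asked, which is the sense in which the heuristic is quantitative rather than qualitative.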
With ISM (Kalogeraki et al., 2002), on the other
hand, a peer propagates each query q to the peers that
are most likely to reply to it, based on the following
two parameters: a profile mechanism and a relevance
rank. The profile is built and maintained by each peer
for each of its neighboring peers.
The information included in this profile consists of the
MUSICAL RETRIEVAL IN P2P NETWORKS UNDER THE WARPING DISTANCE