tral server² but are stored on users’ devices.
Distributing computations and content in this way
satisfies the constraints of scalability and reactivity.
In this paper, we first present related work on
collaborative filtering approaches. We then introduce
our Peer-to-Peer user-centered model, which offers the
advantage of being fully distributed. We call this
model the “Adaptive User-centered Recommender
Algorithm” (AURA). It provides a service which builds
a virtual community of interests centered on the
active user by selecting his/her nearest neighbors.
As the model is ego-centered, the active user can
define the expected prediction quality by specifying
the minimum-correlation threshold. Furthermore, AURA
is an anytime algorithm which requires very little
computation time and memory space. As we want to
constantly improve our model and the document sharing
platform, we are developing them incrementally and
modularly on a JXTA platform³.
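As an illustration only, the threshold-based neighbor selection described above can be sketched as follows. The sketch assumes Pearson correlation over co-rated documents as the similarity measure, which this section does not specify; the names pearson, select_neighbors and min_correlation are hypothetical and are not taken from AURA itself.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation over documents rated by both users.

    Assumption: means are taken over the co-rated documents only.
    Returns 0.0 when the correlation is undefined.
    """
    common = set(a) & set(b)
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[d] for d in common) / len(common)
    mean_b = sum(b[d] for d in common) / len(common)
    cov = sum((a[d] - mean_a) * (b[d] - mean_b) for d in common)
    var_a = sum((a[d] - mean_a) ** 2 for d in common)
    var_b = sum((b[d] - mean_b) ** 2 for d in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def select_neighbors(active, peers, min_correlation):
    """Return {peer_id: correlation} for peers at or above the threshold."""
    community = {}
    for peer_id, profile in peers.items():
        weight = pearson(active, profile)
        if weight >= min_correlation:
            community[peer_id] = weight
    return community


# Hypothetical profiles: document ID -> rating.
active = {"d1": 5, "d2": 3, "d3": 1}
peers = {
    "p1": {"d1": 4, "d2": 3, "d3": 2},  # same ordering of preferences
    "p2": {"d1": 1, "d2": 3, "d3": 5},  # opposite preferences
    "p3": {"d1": 2},                    # too little overlap to correlate
}
community = select_neighbors(active, peers, min_correlation=0.5)
```

With these hypothetical profiles, only p1 passes the threshold. Raising min_correlation shrinks the community around the active user, and lowering it admits less similar peers.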
2 STATE-OF-THE-ART
In centralized collaborative filtering approaches,
finding the closest neighbors among several thousands
of candidates in real time may be unrealistic (Sarwar
et al., 2001). By contrast, decentralizing data makes
it practical to comply with privacy rules, as long as
anonymity is preserved (Canny, 2002). This is why more
and more researchers are investigating various means
of distributing collaborative filtering algorithms.
Distribution also has the advantage of giving users
ownership of their profiles, so that these can be
re-used in several applications.⁴ Examples include
research on P2P architectures, multi-agent systems
and decentralized models (client/server, shared
databases).
² This makes it possible to have document IDs and to
identify documents easily.
³ http://www.jxta.org/
⁴ As the owner of the profile, the user can apply it
to different pieces of software. In centralized
approaches, there must be as many profiles as services
for one user.

There are several ways to classify collaborative
filtering algorithms. Breese et al. (Breese et al.,
1998) identify two major classes among existing
techniques: memory-based and model-based algorithms.
Memory-based techniques offer the advantage of being
very reactive, since modifications of users’ profiles
are immediately integrated into the system. They also
guarantee the quality of recommendations. However,
Breese et al. (Breese et al., 1998) consider their
scalability problematic: even if these methods work
well with small-sized examples, it is difficult to
scale to
situations characterized by a large number of
documents or users. Indeed, the time and space
complexities of these algorithms are serious concerns
for large databases. According to Pennock et al.
(Pennock et al., 2000), model-based algorithms
constitute an alternative that addresses the problem
of combinatorial complexity. Furthermore, these models
highlight some correlations in the data, thus offering
an intuitive rationale for recommendations or simply
making the hypotheses more explicit. However, these
methods are not dynamic enough and react badly to the
insertion of new content into the database. Moreover,
they require a penalizing learning phase for the user.
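To make the reactivity and cost of memory-based techniques concrete, a minimal sketch of a memory-based prediction is given below. It uses the weighted-deviation form common in the memory-based literature (the active user's mean plus a normalized, weighted sum of the neighbors' deviations from their own means). The function name predict and the data layout are assumptions made for illustration, and the similarity weights are taken as given.

```python
def predict(active_mean, neighbors, item):
    """Memory-based prediction of the active user's rating for `item`.

    neighbors: list of (weight, neighbor_mean, neighbor_ratings) tuples,
    where neighbor_ratings maps item IDs to ratings (assumed layout).
    The result is not clamped to any rating scale.
    """
    weighted_sum = 0.0
    normalizer = 0.0
    for weight, mean, ratings in neighbors:
        if item in ratings:  # only neighbors who rated the item contribute
            weighted_sum += weight * (ratings[item] - mean)
            normalizer += abs(weight)
    if normalizer == 0:
        return active_mean  # no usable neighbor: fall back to the mean
    return active_mean + weighted_sum / normalizer


# Hypothetical neighborhood of the active user.
neighbors = [
    (1.0, 3.0, {"d4": 5}),
    (0.5, 2.0, {"d4": 3}),
    (0.8, 4.0, {"d1": 4}),  # has not rated d4, so it is skipped
]
before = predict(2.0, neighbors, "d4")

# A neighbor changes a rating: the next prediction reflects it at once,
# with no model to retrain.
neighbors[1][2]["d4"] = 5
after = predict(2.0, neighbors, "d4")
```

The loop visits every neighbor for every prediction, which is the source of the scalability concern when the number of users or documents grows.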
Another way to classify collaborative filtering
techniques is to contrast user-based methods with
item-based algorithms. For example, we explored a
distributed user-based approach within a client/server
context in (Castagnos and Boyer, 2006). In this model,
implicit criteria are used to generate explicit
ratings. These votes are sent anonymously to the
server. An offline clustering algorithm is then
applied and group profiles are sent to clients. The
identification phase is done on the client side in
order to preserve privacy. This model also deals with
sparsity and scalability. The authors highlight the
added value of a user-based approach in situations
where users are relatively stable, whereas the set of
items may often vary considerably. By contrast, Miller
et al. (Miller et al., 2004) show the great potential
of distributed item-based algorithms. They propose a
P2P version of the item-item algorithm. In this way,
they address the problems of portability (even on
mobile devices), privacy and security while keeping a
high quality of recommendations. Their model can
adapt to different P2P configurations.
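The item-based view mentioned above compares items through the users who rated them, rather than users through the items they rated. The sketch below shows only that local similarity computation, assuming cosine similarity between item rating vectors. It is not the P2P protocol of Miller et al., and the names item_vectors and cosine are hypothetical.

```python
from math import sqrt


def item_vectors(profiles):
    """Invert {user: {item: rating}} into {item: {user: rating}}."""
    items = {}
    for user, ratings in profiles.items():
        for item, rating in ratings.items():
            items.setdefault(item, {})[user] = rating
    return items


def cosine(u, v):
    """Cosine similarity between two sparse rating vectors."""
    common = set(u) & set(v)
    dot = sum(u[k] * v[k] for k in common)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical user profiles: item ID -> rating.
profiles = {
    "u1": {"a": 5, "b": 5},
    "u2": {"a": 1, "b": 1, "c": 4},
    "u3": {"c": 5},
}
items = item_vectors(profiles)
sim_ab = cosine(items["a"], items["b"])  # rated identically by u1 and u2
sim_ac = cosine(items["a"], items["c"])  # only u2 rated both
```

In this toy data, items a and b receive identical ratings and are therefore maximally similar, while a and c share a single rater and score much lower.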
Beyond the different possible implementations, the
industrial use of collaborative filtering raises many
open questions. Canny (Canny, 2002) concentrates on
ways to provide powerful privacy protection by
computing a “public” aggregate for each community
without disclosing individual users’ data.
Furthermore, his approach is based on homomorphic
encryption to protect personal data and on a
probabilistic factor analysis model which handles
missing data without requiring default values for
them. Privacy protection is provided by a P2P
protocol. Berkovsky et al. (Berkovsky et al., 2006)
also deal with privacy concerns in P2P recommender
systems. They address the problem by electing
super-peers whose role is to compute an average
profile of a sub-population. Standard peers have to
contact all these super-peers and exploit these
average profiles to compute predictions. In this way, they
WEBIST 2007 - International Conference on Web Information Systems and Technologies