GRSK:
A GENERALIST RECOMMENDER SYSTEM
I. Garcia, L. Sebastia, S. Pajares and E. Onaindia
Dpt. Computer Science, Technical University of Valencia, Camino de Vera s/n, Valencia, Spain
Keywords:
Generalist recommender systems, Hybrid recommender systems, Tourism, Movies.
Abstract:
This paper describes the main characteristics of GRSK, a Generalist Recommender System Kernel. It is a
RS based on the semantic description of the domain, which allows the system to work with any domain as
long as the data of this domain can be defined through an ontology representation. GRSK uses several Basic
Recommendation and Hybrid Techniques to obtain the recommended items. Through the GRSK configuration
process, it is possible to select which techniques to use and to parameterize different aspects of the recommen-
dation process, in order to adjust the GRSK behavior to the particular application domain. The experimental
results will show that GRSK can be successfully used with different domains.
1 INTRODUCTION
Every day, new data appears on the Web. Everyone
browsing on the Internet can have the perception of
the huge amount of information available, which can
lead to a situation of information overload, that is the
situation where there is far too much information at
people's disposal so that useful information could be
hidden by other data. In this case, techniques to re-
trieve useful information become more and more im-
portant. The usefulness of information depends on the
users and their objectives, so retrieval systems have to
try to understand the purpose of a user search in order
to propose information he could be interested in. A
special kind of information retrieval techniques that
focuses on this issue is named information filtering.
As the name suggests, starting from a big set of infor-
mation, this technique identifies a small subset which
should include the useful/interesting information.
Recommendation systems are a specific type
of information filtering technique that attempts to
present information items (e.g. movies, songs, activi-
ties, etc.) that are likely of interest to the user. A rec-
ommender system (RS) (Resnick P., Varian H., 1997)
is a personalization tool that attempts to provide people
with lists of information items that best fit their indi-
vidual tastes. A RS infers the user’s preferences by
analyzing the available user data, information about
other users and information about the environment.
RS are used to either predict whether a given user
will like a particular item or identify the top N items
that will be of interest to the user. In RS, how much
a particular user likes an item is represented by a rat-
ing. Basically, a RS estimates ratings for the items
that have not been seen by a user and recommends to
the user the items with the highest estimated ratings.
Being an instance of information filtering, recom-
mendation systems can be based on the demographic
filtering algorithm, the content-based filtering algorithm
or the collaborative filtering algorithm¹. All
these approaches have advantages and disadvantages
(Adomavicius G., Tuzhilin A., 2005); a common solution
adopted by many RS is to combine these techniques
into a hybrid RS (Pazzani M.J., 1999; Burke
R., 2007) thus improving recommendations by allevi-
ating the limitations of one technique with the advan-
tages of others.
Recently, some researchers have been focusing
on enhancing recommendations by exploiting a se-
mantic description of the domain in which recom-
mendations are provided (van Setten M., Reitsma J.,
Ebben P., 2006),(Tao Li, Anand S.S., 2009). In gen-
eral, items handled by the system are semantically de-
scribed through an ontology. Then, the recommenda-
tions are based on the semantic matching between the
user profiles and the item descriptions. The main dis-
advantage of these approaches is that a semantic rep-
resentation of the domain has to be available and, up
to now, user profiles and items are described manu-
ally.
¹ These algorithms will be detailed later on.
Garcia I., Sebastia L., Pajares S. and Onaindia E.
GRSK: A GENERALIST RECOMMENDER SYSTEM.
DOI: 10.5220/0002779302110218
In Proceedings of the 6th International Conference on Web Information Systems and Technology (WEBIST 2010), page
ISBN: 978-989-674-025-2
Copyright © 2010 by SCITEPRESS Science and Technology Publications, Lda. All rights reserved
This paper summarizes the main characteristics of
the Generalist Recommender System Kernel (GRSK).
It is a RS based on a semantic description of the domain
that uses a hybrid recommendation technique,
fed by the recommendations obtained from different
algorithms. The task of GRSK is to generate the
list of the top N items that will be of interest to the
user. GRSK can be parameterized to adjust the sys-
tem working model, i.e. to use the desired recom-
mendation techniques. Besides, it is prepared to in-
clude as many techniques as desired by simply devel-
oping new modules. On the other hand, it is a domain-
independent engine, able to work with different cata-
logs of items to recommend.
This paper is organized as follows. Section 2 gives
an overview of the GRSK architecture, the information
GRSK needs (ontology and user information)
and, finally, the GRSK recommendation process. Sec-
tion 3 explains the process of GRSK configuration to
be integrated into a system. Section 4 presents the re-
sults we have obtained when working with a tourism
domain and with a movies domain. We finish with
some conclusions and future work.
2 GRSK: GENERALIST
RECOMMENDER SYSTEM
KERNEL
2.1 GRSK Ontology
The GRSK behaviour relies on the use of an ontology
to describe the user’s preferences and the items to rec-
ommend. It has been designed to be generalist, so
GRSK is able to work with any application domain
as long as the data of the new domain can be defined
through an ontology representation.
An ontology is a formal representation of a set of
concepts within a domain and the relationships be-
tween those concepts. The GRSK ontology contains
the features that describe the items in the domain. For
example, in the tourism domain, the ontology is com-
posed of terms describing architectonic styles or types
of buildings. Figure 1 shows an example of this ontology.
In the movies domain (figure 2), the features
denote the film genres. It is important to remark that
GRSK is able to work from simple ontologies (such as
the movies ontology, which is basically a list of gen-
res) to more complex ontologies (with several levels
of refinements, for example).
The items in the domain are described by the
features of the ontology. Moreover, each item-feature
pair is associated with a value that indicates the degree
Figure 1: Part of the e-Tourism ontology.
Figure 2: Part of the e-Movies ontology.
of interest of the item under the feature, i.e. as a
member of the category denoted by the feature. An
item can also be categorized by more than one feature
in the ontology. Formally, an item $i$ is described by
means of a list of tuples of the form $(i, f, d_{if})$, where
$f$ is a feature defined in the ontology and $d_{if} \in [0, 100]$
is the degree of interest of the item $i$ under the feature
$f$. Additionally, each item is associated with a numeric value
$AC_i$ (acceptance counter) that represents how popular
the item $i$ is among users; this value indicates how
many times the item has been accepted when recommended.
2.2 User Information
In order to compute a recommendation, GRSK
records a profile of each user, which models the user
tastes and preferences as well as his historical inter-
action with the system. The profile of a given user u
records, in the first place, personal and demographic details
about the user like the age, the gender, the family
or the country. Second, the user profile also contains
the user general-likes model, denoted by $GL_u$, which
is a list of the features $f$ in the ontology the user is interested
in along with the user ratings $r_{uf}$ for those
features: $GL_u = \{(u, f, r_{uf})\}$, where $r_{uf} \in [0, 100]$.
A user profile in GRSK also contains information
about the historical interaction of the user with the
RS, namely the set of items $i$ the user has been recommended
and his degree of satisfaction $r_{ui}$ with the
recommended items: $RT_u = \{(u, i, r_{ui})\}$, where $r_{ui} \in [0, 100]$.
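The item description and the user profile defined above can be pictured as two small records. The following is only a minimal sketch under stated assumptions: the class names, field names and sample values are hypothetical and are not taken from the GRSK implementation; only the tuples $(i, f, d_{if})$, $AC_i$, $GL_u$, $RT_u$ and the range [0, 100] come from the text.

```python
from dataclasses import dataclass, field


def check_degree(value: float) -> float:
    """Validate that a degree or rating lies in the closed interval [0, 100]."""
    if not 0 <= value <= 100:
        raise ValueError(f"value {value} is outside [0, 100]")
    return value


@dataclass
class Item:
    """An item described by ontology features (hypothetical structure)."""
    item_id: str
    features: dict[str, float] = field(default_factory=dict)  # f -> d_if
    acceptance_counter: int = 0  # AC_i

    def __post_init__(self) -> None:
        for degree in self.features.values():
            check_degree(degree)


@dataclass
class UserProfile:
    """A user profile with demographics, GL_u and RT_u (hypothetical structure)."""
    user_id: str
    demographics: dict[str, str] = field(default_factory=dict)
    general_likes: dict[str, float] = field(default_factory=dict)  # GL_u: f -> r_uf
    rated_items: dict[str, float] = field(default_factory=dict)  # RT_u: i -> r_ui

    def __post_init__(self) -> None:
        for rating in (*self.general_likes.values(), *self.rated_items.values()):
            check_degree(rating)


# Hypothetical sample data, for illustration only.
garden = Item("Visit to Turia Garden", {"Open spaces": 70, "Children": 50},
              acceptance_counter=12)
alice = UserProfile("alice", {"country": "Spain"},
                    {"Open spaces": 80}, {"Visit to Turia Garden": 90})
```

Storing $GL_u$ and $RT_u$ as dictionaries keyed by feature and by item keeps the user index $u$ implicit in the profile object, which is one possible way to represent the tuples.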
WEBIST 2010 - 6th International Conference on Web Information Systems and Technologies
Figure 3: GRSK Architecture.
2.3 The GRSK Architecture
Figure 3 shows a sketch of the GRSK architecture.
The Engine module is the core of GRSK. The first
task of the Engine is to capture and store the user profile
when the user logs into the system for the first time.
Then, the information obtained during the interaction
of the user with the system after the recommendation
(rated recommendations) will be used to update his
profile to better capture his preferences.
The Engine is also in charge of controlling the rec-
ommendation process, which consists of two steps:
first, each basic recommendation technique calculates
a set of preferences for the user profile; and then, the
items selector obtains the items that match the user
preferences which are combined by the hybrid tech-
nique to obtain the final list of recommended items.
The modules used by the engine to obtain the recom-
mendation are:
Basic Recommendation Techniques (BRT)
(Burke R., 2007) (demographic RS, content-based
RS, collaborative RS and general likes-based
filtering) are used to obtain the user preferences
by analyzing his own profile, the profiles of other
users and the items selected by the users that
have utilized the system before. For a given user,
each BRT creates a different list of preferences
according to the parameters and data handled
by the technique. The system configuration
allows selecting the set of BRT to use in the
recommendation process (see section 3).
Items Selector: receives the lists of user prefer-
ences and, for each list, it returns the set of items
that better match the elements in the list.
The Hybrid Techniques Manager (HTM) combines
the lists of items into a single list, which forms
the final user recommendation list. The hybrid
techniques are applied to items, not to preferences.
At this moment, GRSK includes two hybrid
recommendation methods: the mixed hybrid
technique and the weighted hybrid technique. The
system configuration allows selecting only one hybrid
technique to use in the recommendation process.
At this moment, GRSK includes several BRT and
two hybrid techniques, but it is prepared to work with
as many techniques as desired by simply developing
new modules. We opted for these techniques because
we considered them more suitable for the most com-
mon domains.
2.4 GRSK Recommendation Process
The recommendation process in GRSK is divided into
two steps. The first one is to obtain the preferences
that define the items that will be of interest to the user
(section 2.4.1). The user introduces his query, which
is sent together with his profile to the BRT to produce a
list of individual preferences for each technique. The
second step is to obtain the list of items to recommend
(section 2.4.2). This second step consists of obtaining
the list of items that match the preferences and applying
a hybrid recommendation technique to obtain the
final ranked list of recommended items.
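The two steps can be outlined as a small pipeline. This is a sketch only: the function `recommend`, its parameters and the toy catalog, scores and helper functions are hypothetical names introduced for illustration and do not describe the actual GRSK modules.

```python
from typing import Callable

Preferences = dict[str, float]      # feature -> interest-degree d_uf
ItemList = list[str]                # items matching one preference list
Ranked = list[tuple[str, float]]    # (item, estimated interest-degree d_ui)


def recommend(
    profile: dict,
    brts: dict[str, Callable[[dict], Preferences]],
    select_items: Callable[[Preferences], ItemList],
    hybrid: Callable[[dict[str, ItemList]], Ranked],
    n: int,
) -> Ranked:
    # Step 1: every configured BRT elicits its own preference list.
    preference_lists = {name: brt(profile) for name, brt in brts.items()}
    # Step 2a: the items selector maps each preference list to matching items.
    item_lists = {name: select_items(p) for name, p in preference_lists.items()}
    # Step 2b: one hybrid technique merges the lists; keep the top N items.
    ranked = sorted(hybrid(item_lists), key=lambda pair: pair[1], reverse=True)
    return ranked[:n]


# Toy stand-ins (hypothetical data) to exercise the pipeline.
CATALOG = {"Drama": ["Titanic", "Cinema Paradiso"], "Thriller": ["The Firm"]}
SCORES = {"Titanic": 90.0, "Cinema Paradiso": 60.0, "The Firm": 70.0}


def general_likes_brt(profile: dict) -> Preferences:
    return dict(profile["general_likes"])


def toy_selector(preferences: Preferences) -> ItemList:
    return [item for f in preferences for item in CATALOG.get(f, [])]


def toy_hybrid(item_lists: dict[str, ItemList]) -> Ranked:
    return [(item, SCORES[item]) for items in item_lists.values() for item in items]


top = recommend({"general_likes": {"Drama": 80}},
                {"gl": general_likes_brt}, toy_selector, toy_hybrid, n=1)
```

Passing the techniques in as callables mirrors the idea that new BRT modules can be added without touching the engine.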
2.4.1 Modeling of User Preferences
This step consists of analyzing the user profile and
eliciting the corresponding list of preferences. It is
important to note that, unlike most RS, GRSK is a se-
mantic RS that does not initially work with the items
that will be later recommended to the user. In con-
trast, GRSK makes use of the concept of feature to
elicit the user preference model, which is a more gen-
eral and flexible entity. This makes GRSK able to
work with any application domain as long as the data
can be represented through an ontology.
A preference, which is a tuple of the form
$(u, f, d_{uf})$, is a feature $f$ in the ontology with an
interest-degree of $d_{uf}$ for a user $u$, selected by one
of the four basic recommendation techniques: demographic
recommendation, content-based recommendation,
collaborative recommendation and general
likes-based filtering. Each BRT generates an independent
list of preferences, and hence the lists may contain
different features or the same feature with different
degrees of interest. We will call these lists $P^u_d$ for the
demographic preference list, $P^u_{cb}$ for the content-based
preference list, $P^u_{col}$ for the collaborative preference
list, and $P^u_{gl}$ for the general-likes-based preference list.
The demographic BRT classifies the user into
a demographic category according to his profile details.
Each demographic category is associated with a
list of preferences ($P^u_d$) during the system configuration,
because these preferences depend on the application domain.
The success of the demographic recommendation is
strongly dependent on this user classification. We
opted for a demographic BRT because it is a good
alternative to solve the problem of the new user since
it is always able to give a recommendation.
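A minimal sketch of the demographic technique follows, assuming a fixed table that maps each demographic category to a preference list. The categories, the classification rules and the interest-degrees are hypothetical; the feature names are borrowed from the tourism ontology of figure 1.

```python
# Hypothetical configuration: demographic category -> preference list P_d.
DEMOGRAPHIC_PREFERENCES = {
    "family": {"Zoo - Aquarium": 90, "Thematic park": 80},
    "senior": {"Museum": 85, "Open spaces": 70},
    "default": {"Museum": 50},
}


def classify(profile: dict) -> str:
    """Assign a demographic category using simple, assumed rules."""
    if profile.get("family") == "with children":
        return "family"
    if profile.get("age", 0) >= 65:
        return "senior"
    return "default"


def demographic_preferences(profile: dict) -> dict[str, float]:
    """Return a copy of the preference list of the user's category."""
    return dict(DEMOGRAPHIC_PREFERENCES[classify(profile)])
```

Because every profile falls into some category (here through the `default` entry), the technique can answer even for a user with no history, which is the property the text relies on for the new-user problem.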
The content-based RS technique computes a set
of preferences by taking into account the items previ-
ously rated by the user (historical interaction). This
technique will allow us to increase the user satisfac-
tion by recommending items similar to those already
accepted by the user. Let $f$ be a feature and $I$ a list of
items described under the feature $f$ in the ontology; $I$
will be a list of tuples of the form $(i, f, d_{if})$ for a particular
feature $f$. Let $RT_u = \{(u, i, r_{ui})\}$ be the set of
items valued by user $u$ with respective ratings $r_{ui}$;
a preference $(u, f, d_{uf})$ is added to the list $P^u_{cb}$ where:
$$d_{uf} = \frac{\sum_{i \in I \cap RT_u} d_{if} \, r_{ui}}{|RT_u|}$$
The value $d_{uf}$ denotes the interest-degree of a user
for the items described under the feature $f$ amongst
the whole set of items rated by the user.
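The content-based computation can be sketched as follows. The function name and the sample data are hypothetical. The sketch applies the formula literally, so the products $d_{if} \, r_{ui}$ are not rescaled to [0, 100]; the text does not state whether GRSK normalizes them, and any rescaling is left as an assumption for the caller.

```python
def content_based_preferences(
    item_features: dict[str, dict[str, float]],
    rated_items: dict[str, float],
) -> dict[str, float]:
    """Compute d_uf for every feature of the items rated by the user.

    item_features: item -> {feature: d_if}
    rated_items:   RT_u as item -> r_ui
    """
    if not rated_items:
        return {}  # a new user has no history, so no preference is produced
    totals: dict[str, float] = {}
    for item, r_ui in rated_items.items():
        for feature, d_if in item_features.get(item, {}).items():
            totals[feature] = totals.get(feature, 0.0) + d_if * r_ui
    # Divide by |RT_u|, the number of items rated by the user.
    return {f: total / len(rated_items) for f, total in totals.items()}


# Hypothetical data: two rated items sharing the feature "Museum".
features = {"a": {"Museum": 80}, "b": {"Museum": 60, "Park": 50}}
ratings = {"a": 90, "b": 40}
prefs = content_based_preferences(features, ratings)
```

With this data, "Museum" obtains (80·90 + 60·40) / 2 = 4800 and "Park" obtains (50·40) / 2 = 1000, so features shared by highly rated items dominate the list.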
The collaborative RS technique suggests those
items preferred by people with a profile most simi-
lar to the given user profile (i.e. the user will be rec-
ommended items that people with similar tastes and
preferences liked in the past). This technique is only
useful when there is a great amount of data concerning
items rated by other users. In order to obtain the
corresponding list of preferences $P^u_{col}$, this technique
decides whether a user $v$ is similar to the given user
$u$ ($s_{u,v}$) by applying the Pearson correlation with respect
to the items that have been rated by both users.
Then, by taking into account all the users $v$ similar to
$u$, a preference $(u, f, d_{uf})$ is added to $P^u_{col}$ for each $f$
that describes an item $i$ rated by $v$, where:
$$d_{uf} = \operatorname{avg}(d_{if} \, r_{vi}), \quad \forall v : s_{u,v}$$
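A sketch of the collaborative technique, under assumptions: the text does not say how the Pearson coefficient is turned into the similarity decision $s_{u,v}$, so the `threshold` parameter below is hypothetical, as are the function names and the sample data.

```python
from math import sqrt


def pearson(ratings_u: dict[str, float], ratings_v: dict[str, float]) -> float:
    """Pearson correlation over the items rated by both users."""
    common = sorted(set(ratings_u) & set(ratings_v))
    n = len(common)
    if n < 2:
        return 0.0
    mean_u = sum(ratings_u[i] for i in common) / n
    mean_v = sum(ratings_v[i] for i in common) / n
    cov = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    var_u = sum((ratings_u[i] - mean_u) ** 2 for i in common)
    var_v = sum((ratings_v[i] - mean_v) ** 2 for i in common)
    if var_u == 0 or var_v == 0:
        return 0.0  # correlation is undefined for constant ratings
    return cov / sqrt(var_u * var_v)


def collaborative_preferences(user, all_ratings, item_features, threshold=0.5):
    """Average d_if * r_vi over the items rated by users v similar to `user`.

    `threshold` is an assumed cut-off for deciding that v is similar to u.
    """
    neighbours = [
        v for v in all_ratings
        if v != user and pearson(all_ratings[user], all_ratings[v]) >= threshold
    ]
    products: dict[str, list[float]] = {}
    for v in neighbours:
        for item, r_vi in all_ratings[v].items():
            for feature, d_if in item_features.get(item, {}).items():
                products.setdefault(feature, []).append(d_if * r_vi)
    return {f: sum(vals) / len(vals) for f, vals in products.items()}


# Hypothetical data: v agrees with u on the co-rated items, w disagrees.
all_ratings = {
    "u": {"a": 80, "b": 20, "c": 60},
    "v": {"a": 90, "b": 30, "c": 70, "d": 100},
    "w": {"a": 10, "b": 90, "c": 20},
}
item_features = {"a": {"Drama": 40}, "b": {"Comedy": 80},
                 "c": {"Drama": 60}, "d": {"Action": 50}}
col_prefs = collaborative_preferences("u", all_ratings, item_features)
```

In this example only v passes the similarity test, so the preference for "Drama" is the average of 40·90 and 60·70.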
The general-likes-based filtering is an information
filtering technique that obtains the preferences
that match the main user interests specified by
the user in his profile ($GL_u$). The accuracy of this
technique depends on the information provided by the
user. However, GRSK is able to work with little information.
In this case, the set of preferences $P^u_{gl}$ is simply
built as $P^u_{gl} = GL_u$; that is, the interest-degree of
the preferences in $P^u_{gl}$ will be the ratings given by the
user to that particular feature in his profile ($d_{uf} = r_{uf}$).
2.4.2 Obtaining the List of Recommended
Items
In the second step of the recommendation process,
the Items Selector selects, among all of the items
in the domain, those that best match the preferences
in the lists $P^u_d$, $P^u_{cb}$, $P^u_{col}$ and $P^u_{gl}$. Afterwards, the
selected Hybrid Technique obtains a single list of
ranked recommendations that we will denote as
$RI_u = \{(u, i, d_{ui})\}$, where $i$ is the item, and $d_{ui}$ is the
estimated interest-degree of the item $i$ for the user $u$.
The method for selecting an item is quite simple:
an item $i$ represented by the tuple $(i, f, d_{if})$ matches
a preference in $P^u_{brt}$ if there is a tuple $(u, f, d_{uf})$ in
$P^u_{brt}$ and the item has not previously been rated by the
user. The outcome of the Items Selector is a set of
lists of ranked items, one list per BRT. The lists of
recommended items computed by the Items Selector
are then processed by the selected Hybrid Technique,
which returns a single list of ranked items ($RI_u$). The
value $d_{ui}$ of a tuple in $RI_u$ depends on the selected
Hybrid Technique. At this moment, GRSK includes two
hybrid techniques: mixed and weighted techniques.
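The matching rule of the Items Selector can be sketched as a filter over the catalog. The function name, the return structure and the sample values are hypothetical; only the rule itself (shared feature, item not yet rated) comes from the text.

```python
def select_items(
    preferences: dict[str, float],
    item_features: dict[str, dict[str, float]],
    rated_items: set[str],
) -> dict[str, list[tuple[str, float, float]]]:
    """Return, per matching item, the triples (f, d_if, d_uf) that matched."""
    selected = {}
    for item, features in item_features.items():
        if item in rated_items:
            continue  # items already rated by the user are never selected
        matches = [
            (f, d_if, preferences[f])
            for f, d_if in features.items()
            if f in preferences
        ]
        if matches:
            selected[item] = matches
    return selected


# Hypothetical movie data and a one-feature preference list.
movies = {
    "Titanic": {"Drama": 80, "Romance": 60},
    "The Firm": {"Thriller": 80},
    "Cinema Paradiso": {"Drama": 40},
}
matched = select_items({"Drama": 70}, movies, rated_items={"Cinema Paradiso"})
```

The selector would be called once per BRT preference list, so that the hybrid technique receives one list of items per technique.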
The Mixed Hybrid Technique mixes the items in
the lists of all the BRT. All items are handled in the
same way, regardless of the BRT they come from.
In this case, the value $d_{ui}$ of a tuple in $RI_u$ is calculated
as follows:
$$d_{ui} = \mathrm{percentile}(AC_i) + \operatorname{avg}_{f}(d_{if} + d_{uf})$$
where $\mathrm{percentile}(AC_i)$ refers to the percentile of
the acceptance counter of $i$ ($AC_i$) with respect to the
whole set of items rated by the users. The second part
of the formula considers the average interest-degree
of all the features that describe the item $i$ in both the
ontology ($d_{if}$) and in the user preferences ($d_{uf}$).
The Weighted Hybrid Technique mixes the
items in the lists, but the value $d_{ui}$ is computed according
to the weight of the BRT that selected the
preference for which the item has been recommended.
The weight of each BRT, defined in the configuration
process, is denoted by $\omega_d$ for the demographic RS,
$\omega_{cb}$ for the content-based RS, $\omega_{col}$ for the collaborative
RS and $\omega_{gl}$ for the general-likes filtering. In this
case the value $d_{ui}$ of a tuple in $RI_u$ is calculated as:
$$d_{ui} = \mathrm{percentile}(AC_i) + \operatorname{avg}_{f}((d_{if} + d_{uf}) \times \omega_{brt})$$
The Hybrid Technique obtains a list of ranked
items and retrieves the best N ranked elements.
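Both hybrid formulas can be sketched with one function, since the mixed technique behaves like the weighted one with every weight equal to 1. This is an illustration under assumptions: the text does not define how the percentile of $AC_i$ is computed, so `percentile_rank` uses an assumed definition (the percentage of counters that are less than or equal to $AC_i$), and all names and values are hypothetical.

```python
def percentile_rank(value: float, population: list[float]) -> float:
    """Percentage of values in `population` that are <= `value` (assumed)."""
    return 100.0 * sum(1 for x in population if x <= value) / len(population)


def hybrid_rank(matches, acceptance, weights=None, n=10):
    """Rank items and keep the best n.

    matches:    item -> non-empty list of (d_if, d_uf, brt_name)
    acceptance: item -> AC_i for every item in the catalog
    weights:    brt_name -> omega_brt; None selects the mixed technique
    """
    counters = list(acceptance.values())
    scored = []
    for item, triples in matches.items():
        terms = [
            (d_if + d_uf) * (1.0 if weights is None else weights[brt])
            for d_if, d_uf, brt in triples
        ]
        d_ui = percentile_rank(acceptance[item], counters) + sum(terms) / len(terms)
        scored.append((item, d_ui))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Hypothetical data: item "a" matched through two BRT, item "b" through one.
acceptance = {"a": 5, "b": 1, "c": 3, "d": 0}
matches = {"a": [(80, 60, "cb"), (40, 20, "gl")], "b": [(90, 70, "gl")]}
mixed = hybrid_rank(matches, acceptance)
weighted = hybrid_rank(matches, acceptance, weights={"cb": 0.5, "gl": 1.0})
```

In this example the lower content-based weight reduces the score of "a" from 200 to 165, while "b", matched only by the general-likes technique with weight 1, keeps its score of 210.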
Figure 4: GRSK Integration into a System.
3 GRSK INTEGRATION
PROCESS
This section describes the GRSK requirements to be
integrated into any system (figure 4).
3.1 Database and External Subsystems
First, GRSK needs a database of the particular
application domain containing: (1) the domain ontology;
(2) the set of items that can be recommended:
these items must be classified according to the ontology
and the quality of the recommendation depends,
in part, on the accuracy of the classification of items;
(3) the user profiles with the demographic user infor-
mation (if the demographic RS is used in GRSK) and
the user general likes $GL_u$ (if the general-likes filtering
is used): the quality of the recommendation also
depends on the information provided by the user - the
more information, the more accurate the recommen-
dation -, but it is possible to obtain a recommendation
with a minimum amount of data; (4) demographic
classification of users according to the ontology.
In order to obtain a complete recommender system,
two external modules must be plugged into GRSK:
the Database Interface and the User Interface. The
Database Interface subsystem, which is the inter-
face between GRSK and the database, processes the
queries coming from GRSK, such as obtaining the
user profile of the current user or the list of items
that match a given preference. On the other hand,
the User Interface Subsystem initiates the execution
of GRSK and centralizes the exchange of information
between the user and GRSK. This includes convert-
ing the user data into a user profile, showing the list
of recommended items and recording the ones that are
selected and discarded by the user, together with the rating
of the user satisfaction with a given recommended
item. The User Interface Subsystem is also in charge
of deciding which information must be initially intro-
duced by the user (which depends on the particular
application domain).
3.2 GRSK Setup
GRSK requires an initial configuration to adjust the
GRSK behaviour to the current application. First, it
is possible to select which BRT among all the avail-
able BRT (demographic RS, content-based RS, col-
laborative RS and general likes-based filtering), will
be used in GRSK to give a recommendation. Second,
it is necessary to select only one hybrid recommenda-
tion technique. Moreover, for all hybrid techniques, it
is possible to select the way to compute the interest-
degree of items in case an item is selected by more
than one preference. The techniques are: maximum
ratio, median ratio and several techniques to compute
the average.
On the other hand, some other computations can
be parameterized. For example, a threshold on the
interest-degree can be defined to decide whether or not
a given preference is considered. Likewise, the acceptance
counter can be computed in several ways.
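The setup options described in this section could be collected in a single configuration object. The sketch below is hypothetical throughout: the class `GrskSetup`, its field names and the option identifiers are illustrative, and only the kinds of choices (set of BRT, exactly one hybrid technique, aggregation by maximum, median or average, and an interest-degree threshold) come from the text.

```python
from dataclasses import dataclass
from statistics import fmean, median

# Ways to combine the interest-degrees when several preferences select an item.
AGGREGATORS = {"maximum": max, "median": median, "average": fmean}
KNOWN_BRTS = {"demographic", "content_based", "collaborative", "general_likes"}
KNOWN_HYBRIDS = {"mixed", "weighted"}


@dataclass(frozen=True)
class GrskSetup:
    brts: frozenset          # one or more BRT identifiers
    hybrid: str              # exactly one hybrid technique
    aggregation: str = "average"
    min_interest: float = 0.0  # threshold below which a preference is ignored

    def __post_init__(self) -> None:
        if not self.brts or not self.brts <= KNOWN_BRTS:
            raise ValueError("select at least one known BRT")
        if self.hybrid not in KNOWN_HYBRIDS:
            raise ValueError("select exactly one known hybrid technique")
        if self.aggregation not in AGGREGATORS:
            raise ValueError("unknown aggregation technique")

    def filter_preferences(self, preferences: dict) -> dict:
        """Drop the preferences whose interest-degree is under the threshold."""
        return {f: d for f, d in preferences.items() if d >= self.min_interest}

    def aggregate(self, degrees: list) -> float:
        """Combine the degrees of an item selected by several preferences."""
        return AGGREGATORS[self.aggregation](degrees)


setup = GrskSetup(frozenset({"content_based", "general_likes"}), "mixed",
                  aggregation="median", min_interest=50)
```

Validating the options when the object is created keeps an inconsistent setup, such as an unknown hybrid technique, from reaching the recommendation process.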
4 CASE STUDIES
This section discusses the experiments conducted
to evaluate the behavior of GRSK.
Two domains have been used to evaluate GRSK: a
tourism domain and a movies domain. Through these
case studies, we will show that GRSK has been suc-
cessfully used in both cases.
In order to test GRSK, we selected two classical
Information Retrieval metrics: precision and recall.
In an Information Retrieval scenario, precision is de-
fined as the number of retrieved relevant items divided
by the total number of items retrieved by that search;
and recall is defined as the number of retrieved rel-
evant items divided by the total number of existing
relevant items. That is, precision represents the prob-
ability that a retrieved item is relevant to the user and
recall is the probability that a relevant item is retrieved
by the search.
Specifically, we call Ns the number of retrieved
items by GRSK, that is, the number of recommenda-
tions solicited by the user. The number of relevant
items is denoted by Nr and Nrs is the number of rel-
evant items retrieved in the recommendation, that is,
Nrs = Nr ∩ Ns. Then, precision and recall are calculated
as follows:
$$P = \frac{\mathit{Nrs}}{\mathit{Ns}} \qquad R = \frac{\mathit{Nrs}}{\mathit{Nr}}$$
Often, there is an inverse relationship between P
and R, where it is possible to increase one at the cost
of reducing the other. For example, R can be in-
creased by increasing Ns, at the cost of increasing the
number of irrelevant items retrieved (decreasing P).
For this reason, P and R ratios are not discussed in
isolation.
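The two metrics and their trade-off can be illustrated with a short sketch. The function name and the sample ranking are hypothetical; the retrieved list is assumed to contain no duplicates.

```python
def precision_recall(retrieved: list[str], relevant: set[str]) -> tuple[float, float]:
    """Return (P, R), where Nrs is the number of retrieved relevant items."""
    nrs = len(set(retrieved) & relevant)
    precision = nrs / len(retrieved) if retrieved else 0.0
    recall = nrs / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ranking and relevance judgements.
ranking = ["a", "b", "c", "d"]
relevant = {"a", "c", "e"}
p2, r2 = precision_recall(ranking[:2], relevant)  # Ns = 2
p4, r4 = precision_recall(ranking[:4], relevant)  # Ns = 4
```

Here the larger cut-off retrieves one more relevant item, which raises R from 1/3 to 2/3, while P stays at 0.5 because one more irrelevant item is retrieved as well.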
We ran our experiments in terms of two parameters:
Ns, the number of retrieved items, and the information
about past visits in the user profile. As for
Ns, we ran tests with Ns = 10 and Ns = 25. In both
experiments, we obtained the same list of retrieved
items, but in the first case, the system considered the
first 10 items and, in the second case, the first 25 items
were considered. Regarding the second parameter, we
took into account four levels of historical information
in the user profile; a new user and user profiles that
store 25%, 50% and 75% of (randomly selected) rated
items, respectively.
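One way to build the four history levels is to keep a random fraction of each test user's ratings in the profile and hold out the rest. This is a sketch under assumptions: the function name, the fixed seed and the use of the held-out ratings as ground truth are an interpretation, not a description of the authors' procedure.

```python
import random


def split_history(rated_items: dict[str, float], fraction: float, seed: int = 0):
    """Keep `fraction` of the ratings as history and hold out the rest."""
    rng = random.Random(seed)  # fixed seed so that the split is reproducible
    items = sorted(rated_items)
    kept = set(rng.sample(items, round(len(items) * fraction)))
    history = {i: r for i, r in rated_items.items() if i in kept}
    held_out = {i: r for i, r in rated_items.items() if i not in kept}
    return history, held_out


# Hypothetical ratings of one test user over eight items.
ratings = {f"item{k}": 10.0 * k for k in range(1, 9)}
levels = {h: split_history(ratings, h / 100) for h in (0, 25, 50, 75)}
```

A fraction of 0 corresponds to the new-user case, in which the profile contains no rated items at all.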
4.1 e-Tourism, a Touristic
Recommender System
e-Tourism is a web-based recommender system that
computes a user-adapted leisure and tourist plan for a
given user. The system does not solve the problem of
traveling to a specific place but it works on recommending
a list of the activities that a tourist can perform
in the city of Valencia (Spain). It also computes
a time schedule for the list of recommended activi-
ties taking into account the distance between places,
the opening hours, etc. - that is, an agenda of ac-
tivities (Sebastia L., Garcia I., Onaindia E., Guzman
C., 2009). It is intended to be a service for foreigners
and locals to become deeply familiar with the city and
plan leisure activities.
4.1.1 Data Warehouse e-Tourism
As this is a new domain, a survey was filled in by 58
people in order to obtain data for testing the system.
Personal data like name, age, marital status and tourist
profile (cultural, business, family, etc.) were col-
lected. They also identified sites already visited along
with a degree of interest (rating) for each site. There
are 115 preferences structured in the ontology (see
figure 1), 141 sites stored and 58 user profiles. Each
user rated (positively or negatively) all sites ($RT_u$) and
an average of 110 preferences ($GL_u$).
Figure 5: Comparison of the P and R values obtained when
Ns=10 and Ns=25 and for the four degrees of historical in-
formation for the tourism domain.
4.1.2 e-Tourism Experimental Results
When running the experiments in this domain,
we divided the user profiles database into two
sets: 48 users were the training users and 10 users
were the test users. Then, as all users rated all sites,
we considered as relevant items (Nr) those visits that
the test users marked as visited with a positive degree
of satisfaction in the survey.
Figure 5 shows a comparison between the aver-
age of precision (P) and recall (R) for all the different
cases of user feedback. When Ns = 10, the difference
between the precision and the recall is remarkable,
and the precision decreases as the recall increases, as
expected. However, when Ns = 25, this difference is
not so noticeable. When Ns = 10 and the information
provided to the system increases (H = 25, H = 50),
GRSK improves the quality of the recommendations
if we consider P and R together. However, in some
of the cases in which the user feedback is rather high
(H = 75), the quality of the recommendation wors-
ens. This is because the database does not contain a
large number of items and, therefore, GRSK is not
able to recommend places other than those ones al-
ready visited by the user. When Ns = 25, the general
impression is similar. However, in this case, the relation
between P and R is better because, although the precision is
a bit lower, the recall increases to a greater extent. Here
again, the more feedback, the better the quality of the
recommendation, and, unlike the previous case, the
worsening in the case of H = 75 is not as noticeable.
4.2 e-Movies: a Movies Recommender
System
e-Movies is an application-based recommender system
that computes the tastes of a given user regarding
movie preferences, in order to obtain the best list
of movies for the user. It is intended to be a service for
any cinephile, working with a multitude of movies.
4.2.1 Data Warehouse e-Movies
In this case, we selected a well-known movies
database, MovieLens², which has been created by the
GroupLens research group at the University of Minnesota.
It contains 900 user profiles with their respective
histories of interaction with the system and a set
of 1682 films. A user has scored between 20 and 700
movies. Each film is described by a title, number of
people who were recommended the film and watched
it, the year it was recorded, etc. All the films have been
cataloged through an ontology of twenty preferences
(see figure 2). Each user has an average of 15 preferences
associated with several ratings ($GL_u$) and has
rated an average of 45 movies ($RT_u$). Moreover, each
movie has been rated by 57 different users on average
and has been described by 2 preferences on average.
4.2.2 e-Movies Experimental Results
When running the experiments in this domain,
we divided the user profiles database into two
sets: 890 users were the training users and 10 users
were the test users. We considered as relevant items
(Nr) those movies that the test users have marked with
a value between 2 and 5.
Figure 6 shows a comparison between the aver-
age of precision (P) and recall (R) for all the different
cases of user feedback. In all cases, the difference
between the precision and the recall is quite remarkable.
The reason is that the number of relevant
items (Nr) is quite high compared to the number of
retrieved items as each test user has rated up to 685
movies and has an average of 551 movies. On the
other hand, we expected (as in the tourism domain)
that both measures (considered together) increased as
the user history also increased (except when H=75, as
explained above). However, figure 6 shows that when
H = 50 the precision decreases slightly. The reason is
the following. The precision P is calculated by tak-
ing into account a user history. Remember that the
possibility that a retrieved relevant item in Nrs was
not included in this Nr is not considered in P; therefore,
it must hold that Nrs ≤ Nr. Thus, the
higher the feedback level, the lower the observed Nr, with
Ns held constant. This is the reason why P with 25% is a
little bit better than P with 50%, because it is easier to
find a retrieved relevant item within the 75% user his-
tory (25% feedback) than with 50%. If we could ask
the user about his satisfaction with respect to a given
² http://www.grouplens.org/
Figure 6: Comparison of the P and R values obtained when
Ns=10 and Ns=25 and for the four degrees of historical in-
formation for the movies domain.
recommendation, we would have a better picture of
the GRSK performance in this domain. This does not
happen in the tourism domain because we have a com-
plete feedback for all users. We also have the intuition
that a more complex ontology and a more complete
description of items (such as in the tourism domain)
improve the quality of the recommendations. However,
we need to perform further experiments to confirm
this intuition.
5 RELATED WORK
Some general-purpose domain independent open
source libraries and engines have been developed in
order to reuse the effort to design recommender sys-
tems. Some of these systems are: RACOFI (Ander-
son M., Ball M., Boley H., Greene S., Howse N.,
Lemire D., McGrath S., 2003), SUGGEST (Deshpande
M., Karypis G., 2004), Vogoo (Lemire D., McGrath
S., 2005), Taste³, CoFE (Ogston E., Bakker A.,
van Steen M., 2006), ColFi (Brozovsky L., 2006),
Duine Toolkit (van Setten M., Reitsma J., Ebben P.,
2006) and Aura (Lamere P., Green S., 2008).
Most of these engines are Java-based, with the ex-
ception of SUGGEST (C) and Vogoo (PHP). At this
moment, there are two versions of GRSK, written in
Java and in C#. The Java version of GRSK is agent-based, like
RACOFI and Aura.
Duine Toolkit, RACOFI and ColFi are developed
with a modular architecture that allows developers to
change and add algorithms easily, in the same manner
as GRSK. The GRSK configuration process allows
selecting which techniques to use and parameterizing
different aspects of the recommendation process, in
order to adjust the GRSK behavior to the particular
³ http://www.opentaste.net/
application domain.
Most of these systems are collaborative recommendation
engines (ColFi, CoFE, Taste, SUGGEST
and Vogoo). RACOFI, Aura and Duine Toolkit are hy-
brid recommendation engines. RACOFI adjusts a col-
laborative filter prediction with mechanisms coming
from content-based approaches. Aura uses collabora-
tive recommendation but uses a mechanism that as-
signs and processes a set of tags to items to improve
the recommendation. Duine Toolkit uses collabora-
tive and content-based techniques.
GRSK is a hybrid recommendation engine that
employs different basic and hybrid recommendation
techniques. The purpose of including these different
recommendation techniques is to make GRSK able
to work with any application domain, independently
from the number of users, the available user informa-
tion, etc. On the other hand, it is based on the seman-
tic description of the items in the domain.
6 CONCLUSIONS AND FURTHER WORK
This paper describes the main characteristics of
GRSK, a Generalist Recommender System Kernel.
It is a RS based on the semantic description of the
domain, which allows the system to work with any
domain as long as the data of this domain can be
defined through an ontology representation. GRSK
uses four Basic Recommendation Techniques
(demographic, content-based, collaborative and general
likes filtering) and two disjunctive Hybrid Techniques
(mixed and weighted) that join the recommendations
obtained from each BRT. Through the GRSK configuration
process, it is possible to select which techniques
to use and to parameterize different aspects
of the recommendation process, in order to adjust the
GRSK behavior to the particular application domain.
The experimental results show that GRSK can be suc-
cessfully used with different domains.
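As an illustration of how the two Hybrid Techniques could join the
lists produced by the BRTs, the following Python sketch may help. It is
not the GRSK implementation (which is written in Java and C#): the item
names, the scores, the function names mixed and weighted, and the
combination rules (best score per item for the mixed technique,
normalized weighted sum for the weighted one) are assumptions made for
this example, in the spirit of the hybrid schemes surveyed in
(Burke R., 2007).

```python
from collections import defaultdict

# Hypothetical output of each Basic Recommendation Technique (BRT):
# item -> estimated degree of interest in [0, 1]. Values are illustrative only.
brt_scores = {
    "demographic": {"museum": 0.6, "park": 0.4},
    "content_based": {"museum": 0.8, "cathedral": 0.5},
    "collaborative": {"park": 0.9},
    "general_likes": {"cathedral": 0.7, "museum": 0.2},
}


def mixed(scores_by_brt, enabled):
    """Mixed hybrid (assumed rule): union of the items proposed by the
    enabled BRTs, keeping the best score seen for each item."""
    merged = {}
    for name in enabled:
        for item, score in scores_by_brt.get(name, {}).items():
            merged[item] = max(score, merged.get(item, 0.0))
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))


def weighted(scores_by_brt, weights):
    """Weighted hybrid (assumed rule): normalized weighted sum of the
    scores; a BRT that does not propose an item contributes 0 to it."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one BRT needs a positive weight")
    combined = defaultdict(float)
    for name, w in weights.items():
        for item, score in scores_by_brt.get(name, {}).items():
            combined[item] += (w / total) * score
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    print(mixed(brt_scores, ["demographic", "content_based"]))
    print(weighted(brt_scores, {"content_based": 1.0, "collaborative": 1.0}))
```

In this sketch, selecting which techniques to use corresponds to the
enabled list (or to the keys of weights), and parameterizing the
recommendation process corresponds to the weight values; a domain with
few users could, for instance, give the collaborative BRT a low weight.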
We are currently working on the extension of GRSK to
group recommendation (Garcia I., Sebastia L., Onaindia
E., Guzman C., 2009). We are developing different
innovative techniques to compute the group profile
(such as the Incremental Intersection Technique).
To bring the process of creating the group profile
closer to human behaviour, we are using agreement
techniques. More specifically, we are working on a
protocol of alternative offers between the group
members to obtain the preferences that will compose the
group profile.
ACKNOWLEDGEMENTS
Partial support provided by Consolider Ingenio
2010 CSD2007-00022, Spanish Government Project
MICINN TIN2008-6701-C03-01 and Valencian Gov-
ernment Project Prometeo 2008/051.
REFERENCES
Adomavicius G., Tuzhilin A. (2005). Toward the next
generation of recommender systems: A survey of
the state-of-the-art and possible extensions. IEEE
Transactions on Knowledge and Data Engineering,
17(6):734–749.
Anderson M., Ball M., Boley H., Greene S., Howse N.,
Lemire D., McGrath S. (2003). Racofi: Rule-applying
collaborative filtering systems. In IEEE WIC COLA.
Brozovsky L. (2006). Recommender system for a dat-
ing service. Master’s thesis, KSI, MFF UK, Prague,
Czech Republic.
Burke R. (2007). The Adaptive Web, chapter Hybrid
web recommender systems, pages 377–408. Springer
Berlin / Heidelberg.
Deshpande M., Karypis G. (2004). Item-based top-n rec-
ommendation algorithms. ACM Transactions on In-
formation Systems, 22(1):143–177.
Garcia I., Sebastia L., Onaindia E., Guzman C. (2009).
A group recommender system for tourist activities.
In International Conference on Electronic Commerce
and Web Technologies (EC-Web).
Lamere P., Green S. (2008). Project aura - recommendation
for the rest of us. JavaOne.
Lemire D., McGrath S. (2005). Implementing a rating-based
item-to-item recommender system in PHP/SQL.
D-01, Ondelette.com.
Ogston E., Bakker A., van Steen M. (2006). On the value
of random opinions in decentralized recommendation.
In Distributed applications and interoperable systems
(DAIS).
Pazzani M.J. (1999). A framework for collaborative,
content-based and demographic filtering. Artificial In-
telligence Review, 13:393–408.
Resnick P., Varian H. (1997). Recommender systems.
Communications of the ACM, 40(3).
Sebastia L., Garcia I., Onaindia E., Guzman C. (2009).
e-Tourism: a tourist recommendation and planning
application. International Journal on Artificial
Intelligence Tools (WSPC-IJAIT), 18(5):717–738.
Tao Li, Anand S.S. (2009). Exploiting domain knowledge
by automated taxonomy generation in recommender
systems. In EC-Web 2009, T. Di Noia and F. Buc-
cafurri eds, LNCS 5692, pages 120–131. Springer-
Verlag.
van Setten M., Reitsma J., Ebben P. (2006). Duine toolkit -
user manual. Technical report, Telematica Instituut.
WEBIST 2010 - 6th International Conference on Web Information Systems and Technologies