An Ontology-based Method for Sparsity Problem in Tag

Recommendation

Endang Djuana

, Yue Xu

, Yuefeng Li

, Audun Josang

and Clive Cox

School of Electrical Engineering and Computer Science, Queensland University of Technology, Brisbane, Austalia

Department of Informatics, University of Oslo, Oslo, Norway

Rummble Ltd, London, U.K.

Keywords:

Collaborative Tagging, Tag Recommendation, Domain Ontology, Folksonomy, Sparsity Problem.

Abstract:

Tags or personal metadata for annotating web resources have been widely adopted in Web 2.0 sites. However,

as tags are freely chosen by users, the vocabularies are diverse, ambiguous and sometimes only meaningful to

individuals. Tag recommenders may assist users during tagging process. Its objective is to suggest relevant

tags to use as well as to help consolidating vocabulary in the systems. In this paper we discuss our approach for

providing personalized tag recommendation by making use of existing domain ontology generated from folk-

sonomy. Speciﬁcally we evaluated the approach in sparse situation. The evaluation shows that the proposed

ontology-based method has improved the accuracy of tag recommendation in this situation.

1 INTRODUCTION

Tags or personally supplied keywords for describing

web resources have been widely adopted in Web 2.0

sites. This facility can be found in social bookmark-

ing such as Bibsonomy

, multimedia sharing such as

YouTube

, e-commerce such as Amazon

and more

recently micro-blogs such as Twitter

as well.

Tags are freely chosen words which act as annota-

tion or metadata for describing web resources which

can be used for personal organization, easy retrieval

or ﬁnding related resources (Marlow et al., 2006). As

users build up their tags collection, the aggregates of

all users vocabulary may exhibit an informal taxon-

omy system which is known as folksonomy or folks

taxonomy (Mathes, 2004).

However, since tags are chosen freely by users,

tags vocabularies are diverse, potentially ambiguous

and sometimes only meaningful to individuals. Be-

sides variations in format such as plurality, pre- and

suf-(ﬁxes), and case variations, these tags may also

have polysemy (multiple meaning), synonymy and

generality problems (Golder and Huberman, 2006)

(Liang et al., 2010).

http://www.bibsonomy.org

http://www.youtube.com

http://www.amazon.com

https://twitter.com/

Tag recommendersytems are a specialized recom-

mender system for suggesting tags for annotating web

resources. Speciﬁcally, a tag recommender system

will recommend for a given user and a given item,

a set of tags for annotating the item. Its objectives

were to provide relevant tags and help consolidate the

annotation vocabulary in the systems (J¨aschke et al.,

2008).

In this paper we present a tag recommender ap-

proach which aims to solve problems with ambigu-

ities and generality during recommendation process.

We utilize an existing domain ontology generated

from folksonomy to represent tags as concepts with

their relationships (Djuana et al., 2012). For a given

user and a given item, by using the user-based collab-

orative ﬁltering technique, a set of candidate tags can

be produced. In this paper, we propose a method to

expand the candidate tags by including more general

or speciﬁc concepts based on the domain ontology.

Moreover, we conduct experimentsto evaluate perfor-

mance of the proposed approach in sparse situation in

which tag recommendation needs to be produced for

users with less items, items with less tags and tags

with less users.

This paper is structured as follows. We introduce

some key concepts in Section 2. Then we present

problem formulation and related works in Section 3.

Section 4 discusses the adopted domain ontology as

467

Djuana E., Xu Y., Li Y., Josang A. and Cox C..

An Ontology-based Method for Sparsity Problem in Tag Recommendation.

DOI: 10.5220/0004439904670474

In Proceedings of the 15th International Conference on Enterprise Information Systems (ICEIS-2013), pages 467-474

ISBN: 978-989-8565-60-0

 2013 SCITEPRESS (Science and Technology Publications, Lda.)

basis for recommendation framework. Section 5 dis-

cusses the proposed tag recommendation approach.

Section 6 discusses evaluation and experiment results.

And in Section 7 we conclude this paper and discuss

some ideas for future work.

2 KEY CONCEPTS

In this section we introduce collaborativetagging sys-

tem, tag recommendation, and ontology from folk-

sonomy.

2.1 Collaborative Tagging Systems

A collaborativetagging system contains three entities:

users, tags, and items, which are described below:

• Users U={u

, ...u

|U|

} contains all users in an

online community who have used tags to organize

their items.

• Tags T={t

, . ..t

|T|

} contains all tags used by

the users in U. Tags are typically arbitrary strings

which could be a single word or short phrase. In

this paper, a tag is deﬁned as a sequence of terms.

Fort ∈ T,t = hterm

,term

,.. .term

i A function

is deﬁned to return the terms in a tag: tagset(t) =

{term

,term

,.. .term

}

• Items I={i

, ...i

|I|

} contains all domain-

relevant items or resources. What is considered

by an item depends on the type of user tagging

collection, for instance, in Bibsonomy the items

are mainly bookmarks and publications.

Based on these three entities, a collaborative tag-

ging system is formulated as Folksonomy which con-

sists of 4-tuple: F = (U, T, I,Y) where U,T, I are ﬁ-

nite sets, whose elements are the users, tags and items,

respectively. Y is a ternary relation between them, i.e.,

Y ⊆ U × T × I, whose elements are called tag assign-

ments or taggings. An element (u,t,i) ∈ Y represents

that user u collected item i using tag t. A function

Ft(u,i) is deﬁned to return a set of tags that a user u

has assigned to an item i whereby Ft(u,i) = {t ∈ T |

(u,t,i) ∈ Y} for all u ∈ U and i ∈ I.

2.2 Tag Recommendation

A tag recommender system is a speciﬁc kind of rec-

ommender systems in which the goal is to recommend

a set of tags to use for a particular item. Based on

previous formulation of Folksonomy, the task of a tag

recommender system is to recommend, for a given

user u ∈ U and a given item i ∈ I which has not been

tagged by the user or Ft(u,i) =

0, a set

T(u,i) ⊆ T

of tags. In many cases

T(u, i) ⊆ T is computed by

ﬁrst generating a ranking on the set of tags accord-

ing to some criterion, for instance by a collaborative

ﬁltering, content based, or other recommendation al-

gorithms, from which then the top n tags are selected.

2.3 Ontology from Folksonomy

Ontology is formal description and explicit speciﬁ-

cation of a shared conceptualization (Gruber, 1993).

Depending on the types of stored knowledge, ontol-

ogy can be categorized in two types: domain ontol-

ogy and general ontology (Navigli et al., 2003). Gen-

eral ontology deﬁnes concepts that are general for all

domains. Domain ontology forms the core of any

knowledge speciﬁcally for the domain.

Folksonomy which is emerging from collabo-

rative tagging has been acknowledged as potential

source for constructing ontology. As it captures vo-

cabulary of users which may be aggregated to produce

emergent semantics, people may develop lightweight

ontologies (Mika, 2007).

3 MOTIVATION

In this section we discuss main motivation for this

work by formulating main problem to solve and re-

view related works for the proposed method and prob-

lem solution.

3.1 Problem Formulation

Sparsity problem (Adomavicius and Tuzhilin, 2005)

refers to users who have rated very few items or items

which have received very few ratings. In tag recom-

mendation context, sparsity refers to users who tag a

few or very few resources, and in some situation only

one resource. It could also mean there are resources

which received very few annotations and there are

some tags which are only used by very few users.

Cold start (Schein et al., 2002) (Adomavicius and

Tuzhilin, 2005) is a speciﬁc situation in recommender

systems context when a new user arrives or new re-

source exists. In tag recommendation, each post con-

sists of a user, a resource and all tags that this user

has assigned to that resource. In this regards, cold-

start problem may be formulated as (1) new user who

posted for the ﬁrst time in the system or in other

words, all posts of a new user are in the test set or

(2) new resource, on the other hand, refers to a re-

source that has never been tagged before by any other

user (Preisach et al., 2010).

ICEIS2013-15thInternationalConferenceonEnterpriseInformationSystems

468

For these situations, most of the state of the art tag

recommendation methods perform poorly (Preisach

et al., 2010). In this paper we aim to present a tag

recommendation method which may alleviate these

problems by utilizing domain ontology generated

from folksonomy.

3.2 Related Works

Tag recommender systems are broadly divided into

three classes: content-based, collaborative ﬁltering,

and graph-based approaches (Musto et al., 2010).

One early content-based tag recommender is the

work by (Brooks and Montanez, 2006). The state of

the art works in this class include the approach by

(Tatu et al., 2008) which mapped textual contents in

Bibsonomy bookmarks, not just the tags, to concepts

in WordNet and a similar approach by (Lipczak et al.,

2009) which explored resource content as well as re-

source and user proﬁles. However, there is a draw-

back that these works relied on extended textual con-

tents provided by Bibsonomy which are not always

available in other collaborative tagging systems.

The baseline tag recommender system in col-

laborative class is the user-based CF (Marinho and

Schmidt-Thieme, 2008). There is also a notable work

by (Sigurbj¨ornsson and van Zwol, 2008) which is

based on tag co-occurrences. Although this work has

achieved good result, it didn’t rely on actual mean-

ing of tags which may miss the semantic relationships

among tags.

The most notable works in graph-based ap-

proaches are the work by (J¨aschke et al., 2008) which

utilized a graph-based tag ranking method named

FolkRank (Hotho et al., 2006) and the work by

(Symeonidis et al., 2008) which proposed the Tensor

Dimensionality Reduction method.

There wasn’t much work done in using domain

ontology for tag recommendation. Beside the work

proposed in this paper there is a work by (Baruzzo

et al., 2009) which used existing domain ontology to

recommend new tags by analyzing textual content of

a resource needed to be tagged. However, they didn’t

provide quantitative evaluation.

Most of the state of the art works in tag recom-

mendation are evaluated on dense datasets and rarely

on sparse datasets. The work we described in this

paper is a tag recommender approach which com-

bines collaborative ﬁltering and graph-based method

but not utilizing content-based methods. Although

content-based methods may achieve good results for

cold-start situation, they may not be applicable to all

collaborative tagging systems because they rely on

extra information on resources which are not always

available. It also may not be practical since for dif-

ferent content type they will need different version of

the algorithm (Preisach et al., 2010).

4 ONTOLOGY SPECIFICATION

In this section we specify ontology speciﬁcation

which we are going to utilize in the tag recommen-

dation approach proposed in this paper. Speciﬁcally

this ontology speciﬁcation discusses one particular

approach that we have chosen for semantic and per-

sonalization capability of the generated domain on-

tology. Readers are referred to (Djuana et al., 2012)

for more detailed discussion.

In order to use existing domain ontology gen-

erated from folksonomy, we specify the criteria in

which the ontology has to conform to. This ontol-

ogy of tags should come from a general ontology with

good coverage. In particular this general ontology

needs to have synonym terms (synset) or the like in

their concept, and also general category or taxonomic

grouping system such as WordNet (Fellbaum, 1998).

It was expected by conducting ontology learning

process, domain ontology which represents a partic-

ular tag collection can be generated. There are 3

stages in domain ontology generation process which

are: mapping tags to concepts, mapping disambigua-

tion and relationships extraction.

It is possible that a tag can map directly to one of

synonym terms of a concept in the backbone ontol-

ogy. In other cases, only part of a tag that can map to

one of synonym terms. These cases where handled by

three mapping approaches which are (1) whole map-

ping (2) partial mapping and (3) term mapping.

After all possible mappings are found, the next

stage was mapping disambiguation to choose the most

appropriate concept from mapped concepts to rep-

resent the meaning of a tag for this particular tag

collection. Two disambiguation strategies were per-

formed which are (1) disambiguation by frequency

which comes from an expert point of view about gen-

eral meaning of tags. This mapping strength comes

from frequency in a representative corpus of doc-

uments which indicate how frequent one particular

synonym term would be used to represent the mean-

ing of concept that contains these terms; (2) disam-

biguation by tag relevance which comes from users

point of view about a personal meaning in tags collec-

tion. This mapping strength comes from tag relevance

in relation to similar users understanding and usage

of tags. Given a related tags that has been used for an

item, this mapping is chosen according to relevance to

other tags. After mapping disambiguation, each tag t

AnOntology-basedMethodforSparsityProbleminTagRecommendation

469

will map to one and only one concept. In the end,

conﬁrmed mappings according to two disambigua-

tion strategies were: M

frequency

(t) and M

relevance

(t).

Based on tag to concept mapping, available relation-

ships (”is-a” relation) among concepts in general on-

tology were extracted to form domain ontology.

5 PROPOSED APPROACH

The proposed recommendation approach consists of

two parts. The ﬁrst part is the user-based collab-

orative ﬁltering (CF) tag recommendation approach

(user-based CF) (J¨aschke et al., 2008) (Marinho and

Schmidt-Thieme, 2008). This part will also serve as

a baseline tag recommender for evaluation purpose.

However, this approach may not be able to solve am-

biguities problem since it can only recommend pre-

viously used tags and may not be able to recommend

semantically related tags. Therefore, for the second

part we proposed the ontology-based concept expan-

sion tag recommendation approach. In this approach

we utilize concepts and relationships which have been

consolidated in the existing domain ontology. We at-

tempt to improve the user-based CF by utilizing ex-

panding vocabulary of recommended tags by mak-

ing use of synonym terms and semantic relationships

among related concepts in the ontology.

5.1 User-based CF Method

In the traditional user-based CF recommender sys-

tems for recommending items, user proﬁles are rep-

resented in a |U| × |I| user-item matrix X, where |U|

represents number of users and |I| represents number

of items. For each row vector:

−→

= [x

u,1

,.. .x

u,|I|

], for

u = 1, ... ,|U|, x

u,i

indicates that user u rated item i by

a rating value. Each row vector

−→

corresponds thus

to a user proﬁle representing the users preferences to

the items.

Based on the proﬁle matrix X, the neighbourhood

of the most similar k users to the user u can be com-

puted as follows:

= argmax

v∈U

sim(

−→

) (1)

where sim(

−→

) is the similarity between user u and

another user v. It can be calculated using a similarity

calculation method such as cosine similarity.

However, because of the ternary relational na-

ture of user tagging system, the traditional user-item

matrix X cannot be applied directly in tag recom-

menders, unless the ternary relation Y is reduced to

a lower dimensional space (Marinho and Schmidt-

Thieme, 2008).

In order to apply the user-based CF, the ternary

relation Y can be used to generate a |U| × |I| matrix

= [

−→

...

−→

|I|

], called user-item(tag) matrix, with

−→

= [x

u,1

,.. .x

u,|I|

], for u = 1,.. .,|U|, x

u,i

∈ {0,1} in-

dicating that, there exists tags used by user u to tag

item i if x

u,i

= 1, otherwise no tags have been used by

user u to tag this item.

In the experiment, we implemented the user-item

(tag) projection as the user proﬁle matrix for calculat-

ing user neighbourhood. The user-item (tag) matrix is

a binary matrix. Jaccards coefﬁcient is used to mea-

sure the similarity of two binary vectors.

In this user-based CF method in order to recom-

mend tags to a target user for tagging a particular

item, it ﬁrst generates a set of candidate tags which

have been used by other users (usually neighbour

users) to tag the item that target user is concerned.

It then ranks the candidate tags based on the similar-

ity between target user and other users to decide top n

tags as the ﬁnal recommendations.

Let CT(u,i) be a set of tags which have been used

by u’s neighbors to tag item i. CT(u,i) are the candi-

date tags to be selected to generate recommendations

to u for tagging i. For a candidate tag t in CT(u, i), its

ranking can be calculated by the following equation:

w(u,t,i) =

∑

v∈N

sim(

−→

) ∗ δ(v,t, i),

δ(v,t,i) =



1 (v,t,i) ∈ Y

0 otherwise



(2)

where δ(v,t,i) = 1 indicates if the user v has used this

tag t to tag the item i, N

is the neighborhood of user

u. The top n tags can be determined based on the

ranking:

T(u,i) = arg

max

t∈T

w(u,t,i) (3)

5.2 Ontology-based Expansion Method

In this paper,we propose a method to improvethe per-

formance of the user-based CF (J¨aschke et al., 2008)

(Marinho and Schmidt-Thieme, 2008) described in

Section 5.1 (also serves as baseline recommender).

In the proposed method, we generate candidate

tags by utilizing the synonym set (synset) informa-

tion captured in the tag ontology and rank candidate

tags based on both user similarity and tag popular-

ity. The recommendations generated by baseline rec-

ommender and tag ontology based recommender de-

scribed in this Section are compared to evaluate the

improvement achieved by the expansion method. The

experiments and evaluation are provided in Section 6.

It is a well known insight to explore the possibility

of using a more general or more speciﬁc tag in rec-

ommending a new vocabulary to user. It is related to

ICEIS2013-15thInternationalConferenceonEnterpriseInformationSystems

470

a characteristic known as the basic level variations or

generality in collaborative tagging (Golder and Hu-

berman, 2006) in which certain users tend to use a

more general vocabulary while other users tend to use

a more speciﬁc vocabulary.

Therefore in an expansion to current approach we

introduce candidate tag expansion method which ex-

pands candidate tags to include the parent (more gen-

eral) concept and the children (more speciﬁc) con-

cepts as well as the basic level concepts. Each of these

concepts will need to be ranked according to seman-

tic relatedness measure to determine the closeness to

current mapped concepts. These methods will be de-

scribed below.

5.2.1 Candidate Tag Expansion

Let CT(u,i) be the set of candidate tags generated

based on neighbour users preferences. For each can-

didate tag t in CT(u,i), by using the disambiguation

mapping methods as described in Section 4, t can be

mapped to concepts M

frequency

(t) or M

relevance

(t) in

the tag ontology, respectively. For this step we will

have 2 different sources of candidate tags as follows:

1. Basic Level Tag Expansion In this strategy, for

mapped concepts which we identify as basic level

expanded concepts, from the synset terms of these

concepts, two expanded sets of candidate tags can

be generated based on the two methods:

Exp

basic

frequency

(u,i) =

[

synset(M

frequency

(t))

t∈CT(u,i)

(4)

Exp

basic

relevance

(u,i) =

[

synset(M

relevance

(t))

t∈CT(u,i)

(5)

2. Parent-Children Level Tag Expansion

In this strategy, for parent and children concepts

which we identify as more general and more spe-

ciﬁc concepts, we deﬁne two functions for retriev-

ing those concepts. Let c be a concept, parent(c)

be the parent concept of c, and children(c) be the

set of children concepts of c. For a tag t, the set

of its parent and children concepts are deﬁned be-

low:

frequency

(t) = {parent(M

frequency

(t))}

[

children(M

frequency

(t)) (6)

relevance

(t) = {parent(M

relevance

(t))}

[

children(M

relevance

(t)) (7)

From these parent and children concepts, another

two expanded sets of candidate tags can also be

generated:

Exp

frequency

(u,i) =

[ [

synset(M

frequency

(t))

t∈CT(u,i)

c∈PC

frequency

(t)

(8)

Exp

relevance

(u,i) =

[ [

synset(M

relevance

(t))

t∈CT(u,i) c∈PC

relevance

(t)

(9)

5.2.2 Recommendation Ranking

For this step we will also have 2 different cases for

ranking calculation as follows:

1. Basic Level Tag Expansion

For each of the candidate tag t in CT(u,i),

Exp

basic

frequency

(u,i) or Exp CT

basic

relevance

(u,i), its

ranking is calculated by the following equation:

(u,t,i) =











∑

v∈N

sim(

−→

) ∗ δ(v,t, i) t ∈ CT(u,i)

∑

v∈N

sim(

−→

) ∗ δ(v,t, i) ∗ P (t) t /∈ CT(u,i),

t∈Exp

basic

(u,i)

(10)

where γ ∈ { frequency,relevance} and P (t) is the

popularity of tag t, which is calculated as:

P (t) =| UI

| / max

∈T

| UI

P (t) is the ratio between | UI

| and the maximum

number of times that a tag has been used to tag

items in this tagging community. As has been

deﬁned in (Djuana et al., 2012), | UI

| contains

(user, item) pairs representing the tag assignments

using tagt. | UI

| is the number of times that t has

been used to tag items. The higher the | UI

|, the

more popular tag t is.

2. Parent-Children Level Tag Expansion

On the other hand for each t of candidate tags

which are not original candidate tags in CT(u,i)

or the expanded basic tags in Exp

basic

(u,i),

AnOntology-basedMethodforSparsityProbleminTagRecommendation

471

t must be a parent or a child of a original can-

didate tag, i.e., t ∈ Exp CT

frequency

(u,i) or t ∈

Exp

relevance

(u,i). The ranking of tag t is cal-

culated by the following equation:

(u,t,i) =

∑

v∈N

sim(

−→

) ∗ δ(v,t, i) ∗ P (t) ∗ S (t,t

)

t /∈ CT(u, i),t ∈ Exp

(u,i)

(11)

where S (t,t

) is the normalized similarity value

between tag t and its original candidate tag based

on WordNet similarity measures (semantic dis-

tance). In this approach we use Jiang-Conrath

similarity measures which are based on informa-

tion content in the glosses (Jiang and Conrath,

1997). We use the implementation in WordNet

Similarity package (Pedersen et al., 2004). The

more they closer in semantic distance, the higher

the similarity value will be.

6 EVALUATION

In this section, ﬁrst we discuss experiments setup then

we present experiment results and discussion.

6.1 Experiments Setup

We have conducted experiments mainly using the

dataset for ECML PKDD Discovery Challenge 2009

which is summarized in (J¨aschke et al., 2012).

The dataset originated from Bibsonomy contains

two versions of training data: (1) snapshot of almost

all dumps of Bibsonomy and (2) dense part of the

snapshot. The dense part contains training data which

has been ﬁltered to include only users, resources or

tags that appear in at least two posts. This is also

known as post core calculation (Batagelj and Zaver-

snik, 2002) at level 2.

The dataset also contains two separate test data

for (1) Content-based method whereby test data con-

tained posts whose user, resource or tag were not con-

tained in the dense part of training data and (2) Graph-

based method otherwise. Table 1 and 2 summarized

the statistics of the dataset.

We have simulated two situations in tag recom-

mendation context which are tag recommendation

using dense dataset and tag recommendation using

sparse dataset.

For dense dataset we use the dense part of snap-

shot data in Table 1 for training data, and Graph based

data in Table 2 for testing data. In this simulation,

Table 1: Training data statistics.

Statistics (until Dec. Overall Dense part

2008) Snaphot of Snapshot

#posts 421,928 64,120

#resources 378,378 22,389

#users 3,617 1,185

#tags 93,756 13,252

Table 2: Testing data statistics.

Statistics (Jan 1

- Task 1 Task 2

June 30

2009) Content-based Graph-based

#posts 43,002 778

#resources 40,729 667

#users 1,591 136

#tags 34,051 862

all users, items and tags in test dataset are all con-

tained in training data. For sparse dataset we use the

entire snapshot data in Table 1 for training data, and

Content based data in Table 2 for testing data. In this

simulation, users, items or tags in test data were not

contained in the dense part of training data which sim-

ulate sparse users and may contain cold start users or

items.

Top N tags are recommended to each target user

for one random user’s items in testing set. The rec-

ommended tags are compared to target users actual

tags of items in testing dataset. If a recommended tag

matches with an actual tag, we calculate this as a hit.

The standard precision and recall calculation are used

to evaluate the accuracy of tag recommendations.

We have conducted following runs to compare

performance between the baseline recommender and

the proposed methods.

• User-CF: this is the user-based CF tag recom-

mender system which is the baseline (section 5.1)

• Exp

Freq User Syn: this is the basic level synset

expansion based on frequency over user-based CF

(section 5.2)

• Exp

Rel User Syn: this is the basic level synset

expansion based on tag relevance over user-based

CF (section 5.2)

• Freq&Rel

User Syn: this is the combination of

Exp Freq User Syn and Exp Rel User Syn

• Exp

Freq User PC: this is the parent chil-

dren synset expansion based on frequency over

Exp

Freq User Syn (section 5.2)

• Exp

Rel User PC: this is the parent children

synset expansion based on tag relevance over

Exp

Rel User Syn (section 5.2)

• Freq&Rel

User PC: this is the combination of

Exp Freq User PC and Exp Rel User PC

ICEIS2013-15thInternationalConferenceonEnterpriseInformationSystems

472

• Folkrank TR: this is the state of the art graph-

based tag recommender (J¨aschke et al., 2008).

6.2 Results and Discussion

For the recommendation using dense dataset the re-

sults are depicted in Table 3 and Table 4 while for the

recommendation using sparse dataset the results are

depicted in Table 5 and Table 6.

Table 3: Precision for Dense Recommendation.

N 5 10 15 20

User-CF 0.183 0.103 0.070 0.052

Exp Freq User Syn 0.217 0.139 0.101 0.078

Exp Rel User Syn 0.217 0.140 0.103 0.079

Freq&Rel User Syn 0.218 0.142 0.104 0.081

Exp Freq User PC 0.221 0.144 0.103 0.079

Exp Rel User PC 0.221 0.145 0.104 0.080

Freq&Rel User PC 0.222 0.147 0.104 0.081

Folkrank TR 0.241 0.150 0.108 0.084

Table 4: Recall for Dense Recommendation.

N 5 10 15 20

User-CF 0.435 0.474 0.479 0.479

Exp Freq User Syn 0.477 0.503 0.507 0.512

Exp Rel User Syn 0.479 0.505 0.509 0.515

Freq&Rel User Syn 0.482 0.509 0.514 0.518

Exp Freq User PC 0.515 0.562 0.574 0.587

Exp Rel User PC 0.518 0.570 0.582 0.591

Freq&Rel User PC 0.523 0.578 0.588 0.593

Folkrank TR 0.576 0.685 0.726 0.750

Table 5: Precision for Sparse Recommendation.

N 5 10 15 20

User-CF 0.074 0.059 0.053 0.051

Exp Freq User PC 0.207 0.124 0.073 0.055

Exp Rel User PC 0.207 0.125 0.075 0.056

Freq&Rel User PC 0.208 0.126 0.077 0.058

Folkrank TR 0.205 0.121 0.066 0.051

For the recommendation using dense dataset we

are mainly observing how expansion by basic level

and parent-child synset expansion as well as com-

bined method may improve signiﬁcantly over the

baseline recommender.

For the recommendation using sparse dataset we

are mainly observing whether or not the proposed ex-

pansion methods can outperform the state of the art

recommenders which are normally perform well in

dense situation but not in sparse situation.

The results of recommendation in dense dataset

in Table 3 and Table 4 show that the basic level

candidate tag expansion has improved the user-based

CF quite signiﬁcanty in precision and in recall while

the parent-children expansion has improved further

in precision and in recall over basic level expansion

Table 6: Recall for Sparse Recommendation.

N 5 10 15 20

User-CF 0.169 0.238 0.302 0.340

Exp Freq User PC 0.513 0.572 0.582 0.586

Exp Rel User PC 0.514 0.574 0.583 0.588

Freq&Rel User PC 0.516 0.576 0.587 0.589

Folkrank TR 0.491 0.561 0.574 0.585

alone. Although these results are still lower than

FolkRank results, the gap is getting closer for preci-

sion in higher number of N which shows potential of

ontology based concept expansion for covering gaps

in collaborative ﬁltering methods.

As we predicted Folkrank didn’t perform that well

for sparse dataset as shown in Table 5 and Table 6. In

this situation, the combination of all expansion meth-

ods has performed better than FolkRank in precision

and in recall. If we look more closely at the results

then the improvement to precision is more apparent

for higher number of N while improvement to recall

is more apparent for lower number of N. These results

shows that proposed expansion method help avoid

sudden decline in precision curve (maintaining accu-

racy) and help boost recall in ﬁrst few recommended

tag which are mostly more relevant tags.

Based on these results we may conclude that the

proposed methods are quite effective for alleviating

sparsity situation.

7 CONCLUSIONS

In this paper we have presented a tag recommen-

dation approach which utilizes an existing domain

ontology generated from Folksonomy for improving

user-based CF method by expanding original candi-

date tags. We have presented an ontology-based ex-

pansion method which expands basic level tags and

includes more general and more speciﬁc tags. We

found that the expansion method based on basic level

expansion improved the baseline method quite signif-

icantly. Speciﬁcally, in sparse and cold-start situa-

tions, the combination of all expansion methods has

improved the accuracy even better than the state of

the art graph based recommendation method which

shows the potential of effectiveness in sparse situa-

tion.

ACKNOWLEDGEMENTS

This work is part of ARC Linkage Project

(LP0776400) supported by the Australian Research

Council.

AnOntology-basedMethodforSparsityProbleminTagRecommendation

473

REFERENCES

Adomavicius, G. and Tuzhilin, A. (2005). Toward the

next generation of recommender systems: A survey

of the state-of-the-art and possible extensions. IEEE

Transactions on Knowledge and Data Engineering,

17(6):734–749.

Baruzzo, A., Dattolo, A., Pudota, N., and Tasso, C. (2009).

Recommending new tags using domain-ontologies.

In Web Intelligence/IAT Workshops, pages 409–412.

IEEE.

Batagelj, V. and Zaversnik, M. (2002). Generalized cores.

cite arxiv:cs.DS/0202039.

Brooks, C. H. and Montanez, N. (2006). Improved anno-

tation of the blogosphere via autotagging and hierar-

chical clustering. In Proceedings of the 15th interna-

tional conference on World Wide Web, pages 625–632.

ACM.

Djuana, E., Xu, Y., Li, Y., and Cox, C. (2012). Person-

alization in tag ontology learning for recommendation

making. In 14th International Conference on Informa-

tion Integration and Web-based Applications & Ser-

vices (iiWAS 2012), pages 368–377. ACM.

Fellbaum, C. (1998). WordNet: An Electronic Lexical

Database. Language, Speech, and Communication

Series. MIT Press.

Golder, S. A. and Huberman, B. A. (2006). Usage patterns

of collaborative tagging systems. Journal of Informa-

tion Science, 32(2):198–208.

Gruber, T. R. (1993). A translation approach to portable

ontology speciﬁcations. Knowledge Acquisition,

5(2):199–220.

Hotho, A., Jschke, R., Schmitz, C., and Stumme, G. (2006).

Information retrieval in folksonomies: Search and

ranking. In The Semantic Web: Research and Appli-

cations, pages 411–426. Springer Berlin Heidelberg.

J¨aschke, R., Hotho, A., Mitzlaff, F., and Stumme, G. (2012).

Challenges in tag recommendations for collaborative

tagging systems. In Recommender Systems for the So-

cial Web, pages 65–87. Springer Berlin Heidelberg.

J¨aschke, R., Marinho, L., Hotho, A., Schmidt-Thieme, L.,

and Stumme, G. (2008). Tag recommendations in so-

cial bookmarking systems. AI Commun., 21(4):231–

247.

Jiang, J. J. and Conrath, D. W. (1997). Semantic similarity

based on corpus statistics and lexical taxonomy. In

Proceedings of International Conference Research on

Computational Linguistics (ROCLING X’1997).

Liang, H., Xu, Y., Li, Y., Nayak, R., and Tao, X. (2010).

Connecting users and items with weighted tags for

personalized item recommendations. In Proceedings

of the 21st ACM conference on Hypertext and hyper-

media, HT ’10, pages 51–60. ACM.

Lipczak, M., Hu, Y., Kollet, Y., and Milios, E. (2009). Tag

sources for recommendation in collaborative tagging

systems. In ECML PKDD Discovery Challenge 2009

(DC09), volume 497 of CEUR-WS.org, pages 157–

172.

Marinho, L. and Schmidt-Thieme, L. (2008). Collabora-

tive tag recommendations. In Data Analysis, Machine

Learning and Applications, pages 533–540. Springer

Berlin Heidelberg.

Marlow, C., Naaman, M., Boyd, D., and Davis, M. (2006).

Ht06, tagging paper, taxonomy, ﬂickr, academic arti-

cle, to read. In Proceedings of the seventeenth con-

ference on Hypertext and hypermedia, pages 31–40.

ACM.

Mathes, A. (2004). Folksonomies - cooperative classiﬁ-

cation and communication through shared metadata.

http://www.adammathes.com/academic/computer-me

diated-communication/folksonomies.html. Accessed:

2013-01-25.

Mika, P. (2007). Ontologies are us: A uniﬁed model of so-

cial networks and semantics. Web Semantics: Science,

Services and Agents on the World Wide Web, 5(1):5–

15.

Musto, C., Narducci, F., Lops, P., and Gemmis, M.

(2010). Combining collaborative and content-based

techniques for tag recommendation. In E-Commerce

and Web Technologies, pages 13–23. Springer Berlin

Heidelberg.

Navigli, R., Velardi, P., and Gangemi, A. (2003). Ontology

learning and its application to automated terminology

translation. IEEE Intelligent Systems, 18(1):22–31.

Pedersen, T., Patwardhan, S., and Michelizzi, J. (2004).

Wordnet: : Similarity - measuring the relatedness of

concepts. In AAAI, pages 1024–1025. AAAI Press.

Preisach, C., Balby Marinho, L., and Schmidt-Thieme, L.

(2010). Semi-supervised tag recommendation - using

untagged resources to mitigate cold-start problems. In

Advances in Knowledge Discovery and Data Mining,

pages 348–357. Springer Berlin Heidelberg.

Schein, A. I., Popescul, A., Ungar, L. H., and Pennock,

D. M. (2002). Methods and metrics for cold-start rec-

ommendations. In Proceedings of the 25th annual in-

ternational ACM SIGIR conference on Research and

development in information retrieval, pages 253–260.

ACM.

Sigurbj¨ornsson, B. and van Zwol, R. (2008). Flickr tag

recommendation based on collective knowledge. In

Proceedings of the 17th international conference on

World Wide Web, pages 327–336. ACM.

Symeonidis, P., Nanopoulos, A., and Manolopoulos, Y.

(2008). Tag recommendations based on tensor dimen-

sionality reduction. In Proceedings of the 2008 ACM

conference on Recommender systems, pages 43–50.

ACM.

Tatu, M., Srikanth, M., and D’Silva, T. (2008). Rsdc’08:

Tag recommendations using bookmark content. In

Proceedings of the ECML/PKDD 2008 Discovery

Challenge Workshop, pages 96–107.

ICEIS2013-15thInternationalConferenceonEnterpriseInformationSystems

474