ploiting the keyphrase lists extracted from the papers
that are considered and explicitly stated as relevant
by the active user. Then, in order to compute the rel-
evance of a new article, the user profile is matched
against the keyphrase list extracted from that article.
The domain-independent keyphrase extraction avoids
a manual classification of papers and it still identifies
a significant set of concepts as we showed in (Ferrara
and Tasso, 2013). The use of more semantic
features is motivated by two main goals. First, our concept-based
recommender system can explain why it
recommended a document by showing: (i) the
keyphrases which are both in the user model and in
the paper and (ii) other keyphrases found in the doc-
ument which are not yet stored in the user model but
can support the user in understanding/evaluating the
new paper. Explaining recommendations by
means of keyphrases has several benefits. In particular,
user satisfaction can increase because explanations
save time: the user is not forced to
read the entire document in order to grasp the main
contents of the paper. Second, the system allows
users to inspect the main concepts stored in the
user model. In this way, a user can explicitly evaluate
their interest in the various concepts and can increase
or decrease the interest level for specific concepts
or even remove them from the profile. By allowing
users to provide this new feedback, the system
can generate a more accurate user profile, thereby improving
the accuracy of the recommendation process
and, consequently, user satisfaction. In this
paper we show that these two goals can be reached
while still providing accurate recommendations.
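As a rough illustration of the idea described above, the following Python sketch builds a user profile from the keyphrases of relevant papers, scores a new paper against it, and returns the two keyphrase lists used for the explanation. All names (`UserProfile`, `score_and_explain`), the unit weight increment, and the normalized-overlap score are assumptions made for this example; they are not the weighting or matching scheme of the system described in this paper.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    """Hypothetical user model mapping each keyphrase to an interest weight."""
    weights: dict = field(default_factory=dict)

    def add_relevant_paper(self, keyphrases):
        # Assumption: every keyphrase of a relevant paper gains one unit of interest.
        for kp in keyphrases:
            self.weights[kp] = self.weights.get(kp, 0.0) + 1.0

    def adjust(self, keyphrase, delta):
        # Explicit feedback: raise or lower the interest level of a concept;
        # the concept is removed once its weight is no longer positive.
        new_weight = self.weights.get(keyphrase, 0.0) + delta
        if new_weight > 0:
            self.weights[keyphrase] = new_weight
        else:
            self.weights.pop(keyphrase, None)


def score_and_explain(profile, paper_keyphrases):
    """Return (score, shared keyphrases, keyphrases new to the profile)."""
    paper = set(paper_keyphrases)
    known = set(profile.weights)
    shared = sorted(paper & known)   # explanation part (i)
    new = sorted(paper - known)      # explanation part (ii)
    total = sum(profile.weights.values())
    # Assumed score: fraction of the total profile weight covered by the paper.
    score = sum(profile.weights[kp] for kp in shared) / total if total else 0.0
    return score, shared, new


profile = UserProfile()
profile.add_relevant_paper(["recommender systems", "keyphrase extraction"])
profile.add_relevant_paper(["keyphrase extraction", "user modeling"])
score, shared, new = score_and_explain(
    profile, ["keyphrase extraction", "citation graph"]
)
```

In this sketch, `shared` and `new` correspond to the two parts of the explanation, while `adjust` mimics the explicit feedback a user can give on single concepts.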
The paper is organized as follows: Section 2 reviews
related work, Section 3 presents a brief architectural
overview of the system, Section 4 describes the proposed
recommendation method, Section 5 describes the
evaluation performed so far,
and Section 6 concludes the paper.
2 RELATED WORK
Several works in the literature deal with the problem
of finding relevant scientific literature, mostly from
an Information Retrieval perspective, such as in (Bol-
lacker et al., 2000), where CiteSeer is introduced.
However, several authors have taken
more personalization-based approaches
to the problem, leading to the creation of recommender
systems rather than search engines. Several
examples analyze the textual contents of scientific
papers in order to provide recommendations to researchers.
Some of them focus on specific
sections of the papers, such as the bibliography, which
can be used to build, navigate, and mine
the citation graph (i.e., the directed graph in which
each vertex represents an academic publication and
each edge represents a citation from one publication
to another) in order to generate the recommendations.
For instance, the citation graph is browsed by the rec-
ommender system described in (Huynh et al., 2012),
where a set of liked papers is used as seed for navi-
gating the citation graph.
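To make the notion of seed-based navigation concrete, the sketch below performs a breadth-first walk over a small citation graph starting from a set of liked papers. It is only an assumed, simplified illustration: the function name `papers_near_seeds`, the `max_hops` limit, and the use of hop distance as a relevance signal are choices made for this example, not the method of (Huynh et al., 2012).

```python
from collections import deque


def papers_near_seeds(citations, seeds, max_hops=2):
    """Breadth-first walk over a citation graph starting from liked papers.

    `citations` maps a paper id to the ids it cites (one directed edge per
    citation). Returns {paper id: hop distance} for every non-seed paper
    reachable within `max_hops` citations.
    """
    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        paper = queue.popleft()
        if distance[paper] == max_hops:
            continue  # do not expand beyond the hop limit
        for cited in citations.get(paper, ()):
            if cited not in distance:
                distance[cited] = distance[paper] + 1
                queue.append(cited)
    # Seeds themselves are already known to the user, so they are left out.
    return {p: d for p, d in distance.items() if d > 0}


citations = {"A": ["B", "C"], "B": ["D"], "D": ["E"]}
candidates = papers_near_seeds(citations, ["A"])
```

Papers closer to the seeds could then be ranked higher, although any real system would likely combine this with further signals.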
On the other hand, our work aims at extracting
from the papers the main ideas and concepts in order
to describe the user interests from a more semantic
perspective. Similarly, the feedback of the users
of social systems, such as CiteULike and BibSonomy,
has also been used for identifying the concepts of interest
to researchers. The authors of (Jiang et al.,
2012), for example, extract the tags provided by the
users of CiteULike to generate a dictionary which
can be used for identifying relevant concepts in the
abstracts of scientific publications. In (Ferrara and
Tasso, 2011), the tags of the users of BibSonomy are
instead exploited to discover whether the user may be
interested in several distinct Topics of Interest (ToI). In
this case a clustering mechanism is used to group
together tags with similar meanings, where the similarity
depends on the number of times two tags have
been applied to the same resource. Such tag clusters
make it possible to organize papers into different collections,
each one associated with a specific ToI for the single
user. Only the opinions of users interested in a specific
ToI are then considered for computing recommendations.
More specifically, resources labelled with tags
that are evaluated as more similar to the tags
associated with a ToI are considered more relevant than
other resources, and resources bookmarked by users
more similar to the active user are likewise considered
more relevant than others. The precision of these approaches
depends on the active participation of the users, whereas
the content-based recommender system described in
this paper is based solely on the automatic extraction
of the main concepts from a scientific resource.
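The co-occurrence-based tag similarity mentioned above can be sketched as follows. The example counts how many resources share each pair of tags and then groups tags that co-occur often enough. The threshold `min_count` and the connected-components grouping are assumptions made for illustration; the actual clustering mechanism of (Ferrara and Tasso, 2011) is not specified here.

```python
from collections import Counter
from itertools import combinations


def tag_cooccurrence(resource_tags):
    """Count, for every pair of tags, the resources to which both were applied."""
    counts = Counter()
    for tags in resource_tags.values():
        for a, b in combinations(sorted(set(tags)), 2):
            counts[(a, b)] += 1
    return counts


def cluster_tags(counts, min_count=2):
    """Group tags linked by at least `min_count` co-occurrences.

    Clusters are the connected components of the thresholded co-occurrence
    graph; tags without any sufficiently strong link are left out.
    """
    parent = {}

    def find(tag):
        parent.setdefault(tag, tag)
        while parent[tag] != tag:
            parent[tag] = parent[parent[tag]]  # path halving
            tag = parent[tag]
        return tag

    for (a, b), n in counts.items():
        if n >= min_count:
            parent[find(a)] = find(b)

    clusters = {}
    for tag in list(parent):
        clusters.setdefault(find(tag), set()).add(tag)
    return sorted(sorted(c) for c in clusters.values())


resource_tags = {
    "p1": ["ml", "learning", "svm"],
    "p2": ["ml", "learning"],
    "p3": ["semantic web", "rdf"],
    "p4": ["rdf", "semantic web", "ml"],
}
counts = tag_cooccurrence(resource_tags)
clusters = cluster_tags(counts)
```

Each resulting cluster could then play the role of one Topic of Interest, with papers assigned to the cluster whose tags they carry.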
The textual content of scientific papers is also
analyzed in the concept-based recommender system
proposed in (Chandrasekaran et al., 2008), where authors
and papers are modeled as trees of concepts: using
the ACM Computing Classification System (CCS),
the authors trained a vector space classifier in order to
associate concepts of the CCS classifications to doc-
uments. The hierarchical organization of the CCS al-
lows the system to represent user interests and docu-
ments by trees of concepts. A user profile and a pa-
per representation are then compared by a tree edit-