users. In short, the recommended page is suitable for
the user and not commonly known.
3 FEATURES OF OUR PROPOSAL
We have already launched a system that gathers the
web browsing activities of users on a single
centralized server and provides them back to the
users. Users send their activities in real time
through our pilot web browser, which is based on the
IE component, or through our Firefox extension. The
gathered activities are presented as rankings of web
pages that are currently and heavily viewed, and thus
they can be regarded as "recommendations". We also
showed the usefulness of the system through a pilot
experiment with student users in our earlier study
(K. Maruyama, K. Takasuka, Y. Yagihara, M. Satoshi,
Y. Shirai, M. Terada, 2006).
The existing system has the following features:
1. it uses only browsing activities,
2. it recommends web pages without analyzing their
contents, and
3. it retrieves users' activities directly from their
web browsers.
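As a rough illustration of how such a ranking of currently and heavily viewed pages could be computed, the following minimal sketch counts views inside a recent time window. The event layout (timestamp, URL), the function name current_ranking, and the one-hour default window are assumptions made for this example; they are not taken from the existing system.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def current_ranking(
    events: Iterable[Tuple[float, str]],
    now: float,
    window_seconds: float = 3600.0,  # assumed window length, not from the paper
    top_n: int = 10,
) -> List[Tuple[str, int]]:
    """Rank URLs by the number of views inside the recent time window.

    `events` is a hypothetical stream of (timestamp, url) pairs as they
    might arrive from the users' browsers.
    """
    recent_views = Counter(
        url
        for timestamp, url in events
        if now - window_seconds <= timestamp <= now
    )
    # most_common returns (url, count) pairs, most viewed first.
    return recent_views.most_common(top_n)
```

Only browsing activity enters the computation; the page contents are never inspected, which matches features 1 and 2 above.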
The recommendation system in this paper has four
features:
1. collaborative filtering,
2. implicit building of user profiles,
3. exclusion of popular web pages, and
4. use of real browsing activity at our university.
Collaborative filtering, unlike content-based
filtering, can recommend URLs that contain no text,
such as images. Building user profiles implicitly
spares users the effort that explicit profiles
require: users' interests may change in the short
term, for example when they see a web page with
interesting hyperlinks near its top, and an explicit
profile would have to be updated each time.
4 RELATED WORKS
Web page recommendation has two problems inherent in
the web itself: the number of web pages and the
location of web servers. Most e-commerce services
provide recommendations to their customers in order
to increase their sales. E-commerce services such as
Amazon.com (Linden et al., 2003) provide millions of
items, but the web servers of the world hold tens of
billions of web pages. In addition, each e-commerce
service holds all of its items on its own servers,
whereas web pages are spread across many distributed
web servers around the world.
Li et al. (Jia Li and Osmar R. Zaïane, 2004) proposed
a web page recommendation system with collaborative
filtering. It accepts the access logs of a web server
as its input, analyzes the contents of the accessed
web pages and the behavior of users, and then
produces recommended web pages. However, it can be
applied only to a particular web site because it
relies on that site's access logs.
In contrast, the web page recommendation system
proposed by Zhu et al. (R. Greiner, T. Zhu, G. Haubl,
K. Jewell, 2005) can recommend web pages on any web
server in the world. A special web browser for the
system makes this possible, but it requires users to
evaluate web pages explicitly. As mentioned above,
implicit evaluation of web pages is preferable.
Another approach to web page recommendation is to use
the bookmarks of users, because a user's bookmarks
show his or her interests (Rucker and Polanco, 1997)
(Jung et al., 2001). If a user kept his or her
bookmarks up to date, they would reflect his or her
short-term interests. Updating bookmarks, however,
corresponds to explicit evaluation.
5 OUTLINE OF ALGORITHM
The algorithm for generating recommendations consists
of four steps.
step 1: Five URLs are extracted from the history of
the target user.
step 2: The group of users who have viewed one or
more of the five URLs from step 1 is extracted from
all users.
step 3: The similarity between the target user and
each member of the group extracted in step 2 is
computed, and the group is ordered by the result.
From the history of the most similar user we obtain
the candidates for the recommendation.
step 4: We compute the page score of each candidate
and determine the recommendation.
The outline of the algorithm is illustrated in Figure 1.
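The four steps can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used in our system: the similarity measure and the page score are passed in as functions because they are not defined at this point in the text, and the two stand-ins shown (Jaccard similarity and a rarity score) are placeholders chosen only to make the example runnable. Skipping pages the target user has already viewed is also an assumption.

```python
from typing import Callable, Dict, List

History = List[str]  # hypothetical layout: URLs in viewing order, oldest first


def recommend(
    target: str,
    histories: Dict[str, History],
    similarity: Callable[[History, History], float],
    page_score: Callable[[str], float],
    n_latest: int = 5,
    n_results: int = 3,
) -> List[str]:
    target_history = histories[target]
    # step 1: take the latest URLs of the target user
    seeds = set(target_history[-n_latest:])
    # step 2: users who have viewed at least one of those URLs
    group = [u for u, h in histories.items() if u != target and seeds & set(h)]
    if not group:
        return []
    # step 3: order the group by similarity; candidates come from the top user
    group.sort(key=lambda u: similarity(target_history, histories[u]), reverse=True)
    seen = set(target_history)
    candidates = [
        url for url in dict.fromkeys(histories[group[0]]) if url not in seen
    ]  # assumption: pages already viewed by the target are skipped
    # step 4: rank the candidates by page score
    candidates.sort(key=page_score, reverse=True)
    return candidates[:n_results]


def jaccard(a: History, b: History) -> float:
    """Placeholder similarity: overlap of the two sets of viewed URLs."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def make_rarity_score(histories: Dict[str, History]) -> Callable[[str], float]:
    """Placeholder page score: pages viewed by fewer users score higher."""
    def score(url: str) -> float:
        viewers = sum(1 for h in histories.values() if url in h)
        return 1.0 / max(viewers, 1)
    return score
```

A call such as recommend("t", histories, jaccard, make_rarity_score(histories)) returns at most n_results URLs taken from the history of the most similar user.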
5.1 Recommender Group
In step 1, our system gets the latest five URLs from
the history of the target user. We use the latest
URLs in order to reflect the short-term interest of
the target user.
In step 2, the set of users who share one or more of
the five URLs is extracted; we refer to this set as
the recommender group in this paper.
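Steps 1 and 2 can be illustrated with the short sketch below. The data layout (a dictionary from a user name to a list of URLs, oldest first) and the function names are hypothetical. Treating repeated views of the same URL as one entry and leaving the target user out of the group are assumptions of this example.

```python
from typing import Dict, List, Set

History = List[str]  # hypothetical layout: URLs in viewing order, oldest first


def latest_urls(history: History, n: int = 5) -> List[str]:
    """Step 1: the n most recently viewed URLs, newest first.

    Assumption: repeated views of one URL count once.
    """
    seen: Set[str] = set()
    latest: List[str] = []
    for url in reversed(history):
        if url in seen:
            continue
        seen.add(url)
        latest.append(url)
        if len(latest) == n:
            break
    return latest


def recommender_group(
    target: str, histories: Dict[str, History], n: int = 5
) -> Set[str]:
    """Step 2: users who viewed at least one of the target's latest URLs."""
    seeds = set(latest_urls(histories[target], n))
    return {
        user
        for user, history in histories.items()
        if user != target and seeds.intersection(history)
    }
```

A user who shares only an older URL with the target user is not included, which keeps the group focused on the target user's short-term interest.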
WEBIST 2007 - International Conference on Web Information Systems and Technologies