2 RELATED WORK
To our knowledge, not much research has been per-
formed to improve web search using explicit Social
Signals such as likes and tags. However, related work
has been performed in the field of the Semantic Web.
The Semantic Web aims at adding logic to the World
Wide Web. The idea behind this is that the Web
becomes more readable for machines. This way,
machines would be able to get a better understanding of
how pages are related to each other and where they
have to look for certain information. Furthermore,
the Semantic Web enables machines to aggregate data
from different pages and present this aggregated data
to users in a clear overview (Berners-Lee et al., 2001).
A truly personal approach to Information Re-
trieval on the WWW has been taken by Delicious.
On Delicious people can create an account, add book-
marks to it and retrieve those bookmarks later on
based on tags that can be assigned to bookmarks.
They can also befriend people and search the
bookmarks of their friends. Several studies have
examined whether such an approach could improve
web search, with differing results (Heymann et al.,
2008; Yanbe et al., 2007; Noll and Meinel, 2007).
Bookmarking on Delicious is a form of collabo-
rative tagging. Golder and Huberman performed re-
search in this field of study and they define collabora-
tive tagging as
“the process by which many users add metadata
in the form of keywords to shared content”
(Golder and Huberman, 2006).
During their research they observed that people use
a great variety of tags, but that consensus is
nevertheless reached: stable patterns emerge in tag
proportions with respect to tagged resources. They also
identify the main reason behind tagging, which is per-
sonal use. They conclude that the stable patterns in
tagging can be used to organise and describe how web
resources relate to each other. Tags can be seen as a
form of Social Signals that could be taken into ac-
count in determining the relative importance of Web-
pages. Not all Social Signals assign words to a re-
source. Social Signals can be less complex, such as a
like. A like only indicates positivity with respect to,
for example, a web resource.
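
As a minimal illustration of what "tag proportions" means here, the sketch below computes, for a single resource, the share of each tag among all tags assigned to it. The function name `tag_proportions` and the sample tags are hypothetical and are not taken from Golder and Huberman's study.

```python
from collections import Counter


def tag_proportions(tag_events):
    """Return each tag's share of all tag assignments for one resource.

    tag_events is a list with one entry per (user, tag) assignment,
    so a tag used by three users appears three times.
    """
    counts = Counter(tag_events)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: count / total for tag, count in counts.items()}


# Hypothetical tag assignments for one bookmarked page.
bookmarks = ["python", "programming", "python", "tutorial", "python", "programming"]
proportions = tag_proportions(bookmarks)
```

A pattern is stable in this sense when these proportions change little as more users tag the same resource.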
In 2007, Bao, Wu, Fei, Xue, Su and Yu saw the
potential of social annotations to determine the value
of Webpages (Bao et al., 2007). Although they took
a different approach with their ranking method that
they call SocialPageRank, the idea is rather similar
to the Social Score method proposed in this paper.
One difference between the approaches is that
SocialPageRank makes use of more complex mathe-
matical calculations whereas the Social Score makes
use of simpler math and is easier to understand. Fur-
thermore, the computational complexity of the So-
cial Score method to calculate the Social Score of
one Webpage is O(1) whereas in the SocialPageR-
ank method it is not possible to calculate any individ-
ual Score for a Webpage without calculating the other
scores for the other Webpages. This is because
SocialPageRank, just like PageRank, makes use of
iterative matrix multiplications to converge to a stable
scoring model. In each iteration the computational
complexity is O(|U||W| + |s||W| + |U||s|), where |U|
is the number of users U of the Social Media platform,
|W| is the number of Webpages W in a Corpus C and
|s| is the number of social annotations or Social
Signals. The number of iterations determines the
accuracy of the resulting scores for the Webpages. The last
and most important difference is that SocialPageRank
only makes use of data from one Social Media
platform, which leaves more room for bias. The
Social Score method is more generic and can take into
account as many Social Signals from as many Social
Media Platforms as desired.
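
The per-iteration cost quoted above can be illustrated with a deliberately simplified sketch. It is not the update rule published by Bao et al.; it only shows a generic propagation of scores through three association matrices (users × pages, users × signals, signals × pages), so that each iteration performs |U||W| + |U||s| + |s||W| multiplications and no page score can be obtained in isolation. All function names and the toy matrices are assumptions made for this example.

```python
def mat_vec(matrix, vector):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def normalise(vector):
    """Scale a vector so that its entries sum to 1 (if possible)."""
    total = sum(vector)
    return [x / total for x in vector] if total else vector


def propagate_once(page_scores, user_page, user_signal, signal_page):
    """One simplified iteration: pages -> users -> signals -> pages."""
    # |U| * |W| multiplications
    user_scores = mat_vec(user_page, page_scores)
    # |U| * |s| multiplications
    signal_scores = mat_vec(transpose(user_signal), user_scores)
    # |s| * |W| multiplications
    new_page_scores = mat_vec(transpose(signal_page), signal_scores)
    return normalise(new_page_scores)


def iterate(user_page, user_signal, signal_page, iterations=20):
    """Repeat the propagation from a uniform start vector."""
    n_pages = len(user_page[0])
    scores = [1.0 / n_pages] * n_pages
    for _ in range(iterations):
        scores = propagate_once(scores, user_page, user_signal, signal_page)
    return scores


# Toy data: 2 users, 2 signals, 3 pages (all values hypothetical).
user_page = [[1, 1, 0], [0, 1, 1]]
user_signal = [[1, 0], [1, 1]]
signal_page = [[1, 1, 0], [0, 1, 1]]
scores = iterate(user_page, user_signal, signal_page)
```

Because every new page score is a weighted sum over all users and signals, changing a single annotation requires rerunning the whole loop, which is the contrast with a per-page score drawn above.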
3 SOCIAL SCORE
Just like PageRank, the Social Score is used alongside
existing techniques like TF-IDF. That is what makes
the algorithm so similar to PageRank: it calculates the
value of a Webpage, completely independent of any
query. Additional algorithms are required for both
PageRank and the Social Score to actually use this
information in search engines, because the query
also has to be taken into account. When algorithms
such as TF-IDF return many hits, which is often the case
on the Web, the Social Score can be used to determine
which resources should be returned first. In contrast
to PageRank, the Social Score can be calculated
for every Webpage individually, without having to re-
compute all the other scores for all other resources.
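
The interplay between a query-dependent measure and a query-independent score can be sketched as follows. This is only one possible reading of the text above: hits are first selected by their TF-IDF value, and the precomputed Social Score then decides the order in which they are returned. The `Hit` class, the `rank_hits` function, the threshold parameter and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    url: str
    tfidf: float         # query-dependent relevance
    social_score: float  # query-independent, precomputed per page


def rank_hits(hits, min_tfidf=0.0):
    """Keep hits that match the query, then order them by Social Score.

    Ties in Social Score are broken by the TF-IDF value.
    """
    relevant = [h for h in hits if h.tfidf > min_tfidf]
    return sorted(relevant,
                  key=lambda h: (h.social_score, h.tfidf),
                  reverse=True)


# Hypothetical hits for one query.
hits = [
    Hit("a.example/page", tfidf=0.8, social_score=12.0),
    Hit("b.example/page", tfidf=0.5, social_score=40.0),
    Hit("c.example/page", tfidf=0.0, social_score=99.0),  # does not match the query
]
ranked_urls = [h.url for h in rank_hits(hits)]
```

Since each page carries its own precomputed score, updating the score of one page does not require touching the scores of the others.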
An arbitrary number of Social Signals from different
Social Media Platforms can be taken into account. To
prevent bias towards a certain group of internet users
or a certain domain, it is good practice to take as many
signals from as many Social Media Platforms as
possible into account. To calculate the Social Score S for
a Webpage W, we take into account n Social Signals
related to Webpage W. The Social Score takes into
account a list L of n Social Signal Scores s, where s
is the number of Social Signals from one Social Me-
dia Platform. For example, s could be the number of
shares of a Webpage W on Facebook or the number
of tweets about W on Twitter. Now the Social Score
KDIR 2014 - International Conference on Knowledge Discovery and Information Retrieval