AP      Queries   Proposal   Google
1.0≧    102       0.394153   0.477867
0.9≧    100       0.400983   0.467425
0.8≧    97        0.401148   0.45557
0.7≧    91        0.410052   0.437263
0.6≧    78        0.400914   0.401373
0.5≧    60        0.396087   0.35919
0.4≧    34        0.37525    0.283838
0.3≧    14        0.265936   0.18804
0.2≧    7         0.186868   0.107945
0.1≧    3         0.069683   0.039227
Figure 1: Reranking results.
which Google returns poor results. Each MAP value
at the point 1.0≧ (AP of at most 1.0) represents the
accuracy of each method when using all queries.
Our method exceeds Google's MAP at the points
where the AP is less than 0.6. This means our method
is fairly effective when Google's result is not good.
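
The thresholded comparison described above can be sketched as follows. This is a minimal illustration, not the evaluation code used in the experiments: it assumes that each point keeps only the queries whose Google AP is at most the threshold and then averages each method's AP over that subset. The function name `map_below` and the sample AP values are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def map_below(
    google_ap: Sequence[float],
    method_ap: Sequence[float],
    threshold: float,
) -> Tuple[int, Optional[float]]:
    """Return (number of queries, MAP of the method) over the queries
    whose Google AP is at most `threshold`.

    Assumption: queries are selected by Google's AP, as the text suggests.
    MAP is None when no query falls under the threshold.
    """
    if len(google_ap) != len(method_ap):
        raise ValueError("both AP lists must cover the same queries")
    selected = [m for g, m in zip(google_ap, method_ap) if g <= threshold]
    if not selected:
        return 0, None
    return len(selected), sum(selected) / len(selected)


if __name__ == "__main__":
    # Made-up per-query AP values, for illustration only.
    google = [0.9, 0.5, 0.2, 0.1]
    proposal = [0.6, 0.5, 0.4, 0.3]
    for t in (1.0, 0.5, 0.2):
        n, proposal_map = map_below(google, proposal, t)
        _, google_map = map_below(google, google, t)
        print(t, n, proposal_map, google_map)
```

Passing Google's own AP list as `method_ap` gives the Google column for the same subset, so both methods are always compared on identical queries.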
5 CONCLUSIONS
We have shown that search results can be improved
by reranking them with various methods based on
Wikipedia features. Experimental results so far in-
dicate the following.
• Category expansion methods are more effective
than a method using a set of categories to which a
query originally belongs.
• Reranking results are improved by using deeply
related words, but not by increasing the number of
related words.
• Basically, simpler methods work better. However,
more sophisticated methods that are based on local
weights of outlinks and inlinks, and on global
weights of link co-occurrence and category, also
work significantly well.
• Any Wikipedia feature works fairly well to im-
prove search results.
Moreover, it turned out that outlinks are much better
than inlinks for weighting in our methods.
Interestingly, this is quite contrary to the results
of Chernov et al. (Chernov et al., 2006). When
extracting statistical information from Wikipedia, we
need to choose an effective model carefully. For this,
we think a machine learning technique like that of
Sumida et al. (Sumida et al., 2008) would be promising.
In future research, we plan to extract more useful
data using Wikipedia features and to classify the
data using machine learning.
ACKNOWLEDGEMENTS
This work was supported by JSPS KAKENHI
(21500102).
REFERENCES
Chernov, S., Iofciu, T., Nejdl, W., and Zhou, X. (2006).
Extracting semantic relationships between Wikipedia
categories. In Proc. of Workshop on Semantic Wikis
(SemWiki 2006). Citeseer.
Cilibrasi, R. et al. (2007). The Google similarity distance.
IEEE Transactions on Knowledge and Data Engineering,
pages 370–383.
Finkelstein, L., Gabrilovich, E., Matias, Y., Rivlin, E.,
Solan, Z., Wolfman, G., and Ruppin, E. (2002).
WordSimilarity-353 Test Collection.
Gabrilovich, E. and Markovitch, S. (2007). Computing
semantic relatedness using Wikipedia-based explicit
semantic analysis. In Proc. of IJCAI-07, pages 6–12.
Hori, K., Oishi, T., Mine, T., Hasegawa, R., Fujita, H.,
and Koshimura, M. (2010). Related Word Extraction
from Wikipedia for Web Retrieval Assistance. In
Proc. of ICAART 2010 vol. 2, pages 192–199.
Ito, M., Nakayama, K., Hara, T., and Nishio, S. (2008).
Association thesaurus construction methods based on
link co-occurrence analysis for Wikipedia. In Proc. of
the 17th ACM Conference on Information and Knowledge
Management, pages 817–826. ACM.
Nakatani, M., Jatowt, A., Ohshima, H., and Tanaka,
K. (2009). Quality evaluation of search results
by typicality and speciality of terms extracted from
Wikipedia. In Database Systems for Advanced
Applications, pages 570–584. Springer Berlin/Heidelberg.
Nakayama, K., Hara, T., and Nishio, S. (2007). Wikipedia
mining for an association web thesaurus construction.
Web Information Systems Engineering–WISE 2007,
pages 322–334.
Page, L., Brin, S., Motwani, R., and Winograd, T. (1998).
The PageRank citation ranking: Bringing order to the
web.
Ponzetto, S. and Strube, M. (2006). WikiRelate! Computing
semantic relatedness using Wikipedia. In Proc. AAAI-06,
pages 1419–1424.
Schütze, H. and Pedersen, J. (1997). A cooccurrence-
based thesaurus and two applications to information
retrieval. Information Processing & Management,
33(3):307–318.
Sumida, A., Yoshinaga, N., and Torisawa, K. (2008).
Boosting precision and recall of hyponymy relation
acquisition from hierarchical layouts in Wikipedia.
In Proc. of the LREC 2008.
Witten, I. and Milne, D. (2008). An effective, low-cost
measure of semantic relatedness obtained from Wikipedia
links. In Proc. of AAAI Workshop on Wikipedia and
Artificial Intelligence: an Evolving Synergy, AAAI
Press, Chicago, USA, pages 25–30.
EVALUATING RERANKING METHODS USING WIKIPEDIA FEATURES