XML IR. The proposed approach applies a re-ranking
process to the initially retrieved XML elements by
evidentially combining two scores for each element:
the computed link-based score and the initial score.
This evidential combination is based on the
Dempster–Shafer theory of evidence. Our approach
exploits both internal and external links to build
specific “element-to-element” links. It introduces a
new parameter, called “link weight”, into the link
score computation. Using the theory of evidence, it
combines the scores from both bodies of evidence in
order to re-rank XML retrieval results. Our proposals
were evaluated on the INEX Wikipedia test collection.
The results showed an improvement over the baseline
and the “Topical Pagerank” approach for most topics.
This means that combining link evidence with content
evidence using DS theory outperforms the content-based
approach alone. In future work, we aim to study the
behavior of the proposed approach when alternatives to
Dempster’s rule are applied to the retrieval results
of multiple systems.
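As an illustration of the kind of evidential combination summarized above, the following is a minimal sketch of Dempster's rule applied to two scores per element. It is not the exact formulation used in this paper. The frame of discernment {R, NR} (relevant, not relevant), the mapping from a score in [0, 1] to a mass function through a reliability factor, the default reliability values, and all function names are assumptions made for the example.

```python
from typing import Dict, List, Tuple

# A mass function over the frame {R, NR}; "T" holds the mass left on the
# whole frame (ignorance). This representation is an assumption of the sketch.
Mass = Dict[str, float]


def score_to_mass(score: float, reliability: float) -> Mass:
    """Turn a score in [0, 1] into a mass function (hypothetical mapping).

    The reliability in [0, 1] discounts the source: the unassigned mass
    1 - reliability stays on the whole frame.
    """
    return {
        "R": reliability * score,
        "NR": reliability * (1.0 - score),
        "T": 1.0 - reliability,
    }


def dempster_combine(m1: Mass, m2: Mass) -> Mass:
    """Combine two mass functions over {R, NR} with Dempster's rule."""
    # Conflict: mass assigned to contradictory pairs (R vs. NR).
    conflict = m1["R"] * m2["NR"] + m1["NR"] * m2["R"]
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    norm = 1.0 - conflict
    return {
        "R": (m1["R"] * m2["R"] + m1["R"] * m2["T"] + m1["T"] * m2["R"]) / norm,
        "NR": (m1["NR"] * m2["NR"] + m1["NR"] * m2["T"] + m1["T"] * m2["NR"]) / norm,
        "T": (m1["T"] * m2["T"]) / norm,
    }


def rerank(
    elements: List[Tuple[str, float, float]],
    content_reliability: float = 0.8,  # illustrative value, not from the paper
    link_reliability: float = 0.5,     # illustrative value, not from the paper
) -> List[Tuple[str, float]]:
    """Re-rank (element_id, initial_score, link_score) triples.

    Elements are ordered by the combined belief in "relevant".
    """
    ranked = []
    for element_id, initial_score, link_score in elements:
        combined = dempster_combine(
            score_to_mass(initial_score, content_reliability),
            score_to_mass(link_score, link_reliability),
        )
        ranked.append((element_id, combined["R"]))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Two made-up elements: "a" has the better initial score, "b" the
    # better link score.
    for element_id, belief in rerank([("a", 0.6, 0.1), ("b", 0.5, 0.9)]):
        print(element_id, round(belief, 4))
```

In this sketch a source with reliability 0 is vacuous and leaves the other source's masses unchanged, which is one reason discounting is a convenient way to weight link evidence against content evidence.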
ACKNOWLEDGEMENTS
Special thanks to all the people who supported this
research, particularly the members of the SIG team at
the IRIT Institute, France.
REFERENCES
Wikipedia: The free encyclopedia. 2013.
http://en.wikipedia.org/.
Brin, S. and Page, L. (1998). The anatomy of a large-scale
hypertextual web search engine. Computer networks
and ISDN systems, 30(1):107–117.
Dempster, A. P. (1967). Upper and lower probabilities in-
duced by a multivalued mapping. The annals of math-
ematical statistics, pages 325–339.
Denoyer, L. and Gallinari, P. (2007). The Wikipedia XML
corpus. In Comparative Evaluation of XML Information
Retrieval Systems, pages 12–19. Springer.
Dopichaj, P., Skusa, A., and Heß, A. (2009). Stealing an-
chors to link the wiki. In Advances in Focused Re-
trieval, pages 343–353. Springer.
Fachry, K. N., Kamps, J., Koolen, M., and Zhang, J. (2008).
Using and detecting links in Wikipedia. In Focused
Access to XML Documents, pages 388–403. Springer.
Farahat, A., LoFaro, T., Miller, J. C., Rae, G., and Ward,
L. A. (2006). Authority rankings from HITS, PageRank,
and SALSA: Existence, uniqueness, and effect of
initialization. SIAM Journal on Scientific Computing,
27(4):1181–1201.
Fuhr, N. and Großjohann, K. (2001). XIRQL: A query
language for information retrieval in XML documents. In
Proceedings of the 24th annual international ACM SIGIR
conference on Research and development in information
retrieval, pages 172–180. ACM.
Geva, S., Kamps, J., Lehtonen, M., Schenkel, R., Thom,
J. A., and Trotman, A. (2010). Overview of the INEX
2009 ad hoc track. In Focused retrieval and evaluation,
pages 4–25. Springer.
Geva, S., Trotman, A., and Tang, L.-X. (2009). Link
discovery in the Wikipedia. Pre-Proceedings of INEX 2009.
Gövert, N. and Kazai, G. (2002). Overview of the
initiative for the evaluation of XML retrieval (INEX)
2002. In INEX Workshop, pages 1–17. Citeseer.
Guo, L., Shao, F., Botev, C., and Shanmugasundaram, J.
(2003). XRANK: Ranked keyword search over XML
documents. In Proceedings of the 2003 ACM SIGMOD
international conference on Management of data, pages
16–27. ACM.
Itakura, K. Y., Clarke, C. L., Geva, S., Trotman, A., and
Huang, W. C. (2011). Topical and structural linkage
in Wikipedia. In Advances in Information Retrieval,
pages 460–465. Springer.
Jenkinson, D., Leung, K.-C., and Trotman, A. (2009).
Wikisearching and wikilinking. In Advances in Fo-
cused Retrieval, pages 374–388. Springer.
Kamps, J. and Koolen, M. (2008). The importance of link
evidence in Wikipedia. In Advances in Information
Retrieval, pages 270–282. Springer.
Kimelfeld, B., Kovacs, E., Sagiv, Y., and Yahav, D. (2007).
Using language models and the HITS algorithm for XML
retrieval. In Comparative Evaluation of XML Information
Retrieval Systems, pages 253–260. Springer.
Kleinberg, J. M. (1999). Authoritative sources in a hy-
perlinked environment. Journal of the ACM (JACM),
46(5):604–632.
Lalmas, M. and Ruthven, I. (1998). Representing and
retrieving structured documents using the
Dempster-Shafer theory of evidence: Modelling and
evaluation. Journal of Documentation, 54(5):529–565.
Lempel, R. and Moran, S. (2001). SALSA: The stochastic
approach for link-structure analysis. ACM Transactions
on Information Systems (TOIS), 19(2):131–160.
Mataoui, M., Mezghiche, M., and Boughanem, M. (2010).
Exploiting link evidence to improve XML information
retrieval. In Proceeding de la Conférence
Internationale sur l’Extraction et la Gestion des
Connaissances Maghreb (EGC-M), pages 23–33. ESI.
Pehcevski, J., Vercoustre, A.-M., and Thom, J. A. (2008).
Exploiting locality of Wikipedia links in entity
ranking. In Advances in Information Retrieval, pages
258–269. Springer.
Schocken, S. and Hummel, R. A. (1993). On the use of the
Dempster Shafer model in information indexing and
retrieval applications. International Journal of
Man-Machine Studies, 39(5):843–879.
Shafer, G. (1976). A mathematical theory of evidence,
volume 1. Princeton University Press, Princeton.
Verbyst, D. and Mulhem, P. (2009). Using collectionlinks
and documents as context for INEX 2008. In Advances
in focused retrieval, pages 87–96. Springer.
Zhang, J. and Kamps, J. (2008). Link detection in XML
documents: What about repeated links. In SIGIR 2008
Workshop on Focused Retrieval, pages 59–66.