When comparing the results presented here to our INEX
2005 results (Dopichaj, 2006), note that the implementation
used for the runs we submitted to INEX differs from the
version described in this paper: in the submitted version, the
scores are not updated independently, so patterns applied
later operate on scores already updated by earlier patterns,
which may lead to unintended interactions.
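The following minimal sketch illustrates how the two update strategies can diverge. It is not the implementation used for our runs: the pattern functions, the element names, and the additive adjustments are hypothetical and chosen only to make the effect visible.

```python
from typing import Callable, Dict, List

Scores = Dict[str, float]
# Assumption: a pattern reads a score table and returns additive
# adjustments per element. This is a simplification for illustration.
Pattern = Callable[[Scores], Scores]


def boost_parent(scores: Scores) -> Scores:
    """Hypothetical pattern: raise the section by half its paragraph's score."""
    return {"sec": 0.5 * scores["p"]}


def boost_child(scores: Scores) -> Scores:
    """Hypothetical pattern: raise the paragraph by half its section's score."""
    return {"p": 0.5 * scores["sec"]}


def apply_independently(scores: Scores, patterns: List[Pattern]) -> Scores:
    """Every pattern sees the original scores; adjustments are summed."""
    result = dict(scores)
    for pattern in patterns:
        for element, delta in pattern(scores).items():
            result[element] += delta
    return result


def apply_sequentially(scores: Scores, patterns: List[Pattern]) -> Scores:
    """Each pattern sees the scores already changed by earlier patterns."""
    result = dict(scores)
    for pattern in patterns:
        for element, delta in pattern(result).items():
            result[element] += delta
    return result


scores = {"sec": 1.0, "p": 2.0}
patterns = [boost_parent, boost_child]

print(apply_independently(scores, patterns))  # {'sec': 2.0, 'p': 2.5}
print(apply_sequentially(scores, patterns))   # {'sec': 2.0, 'p': 3.0}
```

In the sequential variant, the second pattern is computed from the section score that the first pattern has already raised, so the paragraph receives a larger adjustment than it would from the original scores. The result also depends on the order in which the patterns are applied.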
Our INEX submissions for both the CO.Thorough
and the CO.Focused sub-tasks outperform the other
organizations’ submissions at high precision (cut-offs
10 and 25) with generalized quantization, although the
differences are not statistically significant. The relative
performance drops dramatically after 50 results, indicating
a problem with recall; as our baseline has the same
problem, we assume that it is not caused by the
patterns.
6 CONCLUSIONS
We have presented a framework that makes it possible
to exploit overlapping results in a post-processing step
that can be applied to (almost) arbitrary retrieval results.
We do this by way of structural patterns, a generic means
of exploiting simple context information. Our evaluation
shows that several patterns can lead to significant
improvements in result quality.
We need to analyze further the reasons for the current
shortcomings of our approach, in particular the lack of
improvement when applying several patterns simultaneously.
We plan to investigate more patterns, and whether
propagating the score changes further up or down the result
tree can lead to improvements. To improve the method itself,
we plan to incorporate additional information, such as
which query terms were matched by the context nodes.
All the patterns we described in this paper were
created from manual inspection of context graphs, but
of course it would be useful to have tools that analyze
a document collection and some example queries to
find candidate patterns.
REFERENCES
Arvola, P., Junkkari, M., and Kekäläinen, J. (2005).
Generalized contextualization method for XML information
retrieval. In Proc. CIKM 2005.
Denoyer, L. and Gallinari, P. (2006). The Wikipedia XML
Corpus. SIGIR Forum, 40(1).
Dopichaj, P. (2006). The University of Kaiserslautern at
INEX 2005. In Fuhr et al. (2006).
Eger, B. (2005). Entwurf und Implementierung einer XML-
Volltext-Suchmaschine. Master’s thesis, University of
Kaiserslautern.
Fuhr, N., Lalmas, M., Malik, S., and Kazai, G., editors
(2006). Proc. INEX 2005. Springer.
Fuhr, N., Lalmas, M., Malik, S., and Szlávik, Z., editors
(2005). Proc. INEX 2004. Springer.
Järvelin, K. and Kekäläinen, J. (2002). Cumulated gain-based
evaluation of IR techniques. ACM Transactions on
Information Systems, 20(4):422–446.
Kamps, J., de Rijke, M., and Sigurbjörnsson, B. (2005). The
importance of length normalization for XML retrieval.
Information Retrieval, 8(4):631–654.
Kazai, G. and Lalmas, M. (2005). Notes on what to mea-
sure in INEX. In Proc. of the INEX 2005 Workshop on
Element Retrieval Methodology.
Kazai, G., Lalmas, M., and de Vries, A. P. (2004). The
overlap problem in content-oriented XML retrieval
evaluation. In Sanderson, M., Järvelin, K., Allan, J., and
Bruza, P., editors, Proc. SIGIR 2004, pages 72–79. ACM.
Kekäläinen, J., Junkkari, M., and Arvola, P. (2005). TRIX
2004 – struggling with the overlap. In Fuhr et al. (2005),
pages 127–139.
Lee, D. L., Chuang, H., and Seamons, K. (1997). Docu-
ment ranking and the vector-space model. IEEE Soft-
ware, 14(2):67–75.
Malik, S., Kazai, G., Lalmas, M., and Fuhr, N. (2006).
Overview of INEX 2005. In Fuhr et al. (2006).
Michalewicz, Z. and Fogel, D. B. (2004). How to Solve
It: Modern Heuristics, chapter 13, pages 367–388.
Springer, 2nd edition.
Ogilvie, P. and Callan, J. (2005). Hierarchical language
models for XML component retrieval. In Fuhr et al.
(2005), pages 224–237.
Ramírez, G., Westerveld, T., and de Vries, A. P. (2006).
Using small XML elements to support relevance. In
Efthimiadis, E. N., Dumais, S. T., Hawking, D., and
Järvelin, K., editors, Proc. SIGIR 2006. ACM.
Salton, G., Allan, J., and Buckley, C. (1993). Approaches
to passage retrieval in full text information systems. In
Proc. SIGIR 1993, pages 49–58.
IMPROVING CONTENT-ORIENTED XML RETRIEVAL BY APPLYING STRUCTURAL PATTERNS