significantly the execution time; it represents less
than 1% of the overall execution time. These
preliminary results validate our approach.
[Figure 4 plot: two panels; x-axis: words; y-axis: Time (ms), 0 to 35; series: DS3 Q02, DS3 Q03, DS2 Q02, DS2 Q03.]
Figure 4: q02 and q03 evaluation for DS2 and DS3.
Figure 4 illustrates the difference in index search
time between q02 and q03. Because of the identifier
ordering scheme, q02 always executes faster than q03,
as fewer elements are considered in the search space.
6 CONCLUSION
In this paper, we have reported on the integration of
XQuery Text in an XML mediator. The main
difficulty is to integrate sources with limited
full-text search capabilities. We propose to use
indexed virtual views to support such sources. The
views are indexed inside the mediator using a kind of
structural dataguide derived from the view
definition, called a viewguide. Node identifiers and
path expressions are encoded through the viewguide,
which yields algorithms that efficiently process the
mediator's basic selection operator involving XPaths
and keywords. A parameterized ranking formula
taking into account the relevance and depth of
elements is proposed to integrate result relevance.
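As an illustration only, a parameterized ranking that blends keyword relevance with element depth could be sketched as follows. The linear form, the parameter names `alpha` and `max_depth`, the `Hit` record, and the choice to favor deeper (more specific) elements are all assumptions made for this sketch; they are not the formula proposed in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical search hit; field names are illustrative."""
    node_id: int      # identifier of the matching element
    relevance: float  # keyword relevance, assumed normalized to [0, 1]
    depth: int        # depth of the element in the view tree (root = 0)


def score(hit: Hit, alpha: float = 0.5, max_depth: int = 10) -> float:
    """Blend relevance and depth; alpha in [0, 1] weights relevance.

    Assumption: deeper elements are more specific and score higher.
    """
    depth_term = min(hit.depth, max_depth) / max_depth
    return alpha * hit.relevance + (1.0 - alpha) * depth_term


def rank(hits, alpha: float = 0.5, max_depth: int = 10):
    """Return hits sorted by decreasing score."""
    return sorted(hits, key=lambda h: score(h, alpha, max_depth), reverse=True)


hits = [Hit(1, 0.9, 1), Hit(2, 0.5, 8), Hit(3, 0.8, 4)]
by_relevance = [h.node_id for h in rank(hits, alpha=1.0)]
by_depth = [h.node_id for h in rank(hits, alpha=0.0)]
blended = [h.node_id for h in rank(hits, alpha=0.5)]
```

Setting `alpha` to 1 ranks on relevance alone and setting it to 0 ranks on depth alone; intermediate values trade one against the other.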
Further work remains to be done. Notably,
better support of source capabilities would be
desirable. When a source can support a subset of
XQuery, we should be able to build limited views at
the wrapper to integrate it in distributed query
processing. Thus, functionalities should be divided
into multiple stages, e.g., concrete local views
combined with global virtual views. Also, local
ranking of results from a view or a capable source
(e.g., Google) seems easy, but global ranking with
appropriate formulas remains to be tested in
detail on real applications. The propagation of
updates must also be studied. Indexing structures
should be automatically updated when objects are
inserted into or deleted from data sources. A basic
approach could be to detect updates at the wrapper
level and propagate them to the different index structures.
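The basic update-propagation approach mentioned above can be sketched as a wrapper that notifies subscribed index structures of each insertion and deletion. This is a minimal sketch under stated assumptions: the `Wrapper` and `KeywordIndex` classes, their method names, and the toy inverted index are hypothetical and do not describe the mediator's actual interfaces.

```python
from collections import defaultdict


class KeywordIndex:
    """Toy inverted index (hypothetical): keyword -> set of object ids."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.words_of = {}  # object id -> words indexed for that object

    def on_insert(self, obj_id, text):
        words = set(text.lower().split())
        self.words_of[obj_id] = words
        for word in words:
            self.postings[word].add(obj_id)

    def on_delete(self, obj_id):
        # Remove the object from every posting list it appears in.
        for word in self.words_of.pop(obj_id, set()):
            self.postings[word].discard(obj_id)
            if not self.postings[word]:
                del self.postings[word]

    def search(self, word):
        return set(self.postings.get(word.lower(), set()))


class Wrapper:
    """Hypothetical wrapper that detects updates and propagates them."""

    def __init__(self):
        self.objects = {}
        self.listeners = []

    def subscribe(self, index):
        self.listeners.append(index)

    def insert(self, obj_id, text):
        if obj_id in self.objects:
            self.delete(obj_id)  # treat re-insertion as a replacement
        self.objects[obj_id] = text
        for index in self.listeners:
            index.on_insert(obj_id, text)

    def delete(self, obj_id):
        if self.objects.pop(obj_id, None) is not None:
            for index in self.listeners:
                index.on_delete(obj_id)


wrapper = Wrapper()
index = KeywordIndex()
wrapper.subscribe(index)
wrapper.insert(1, "XML mediator")
wrapper.insert(2, "XML index")
before_delete = index.search("xml")
wrapper.delete(1)
after_delete = index.search("xml")
```

Several index structures can subscribe to the same wrapper, so each detected update reaches all of them without the source being aware of the indexes.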
EXTENDING AN XML MEDIATOR WITH TEXT QUERY