The index and query schema for XML documents
proposed in this article was evaluated by repeating
the queries 1000 times with an XPath expression
depth of 10, using the experiment data in Table 1;
the results are shown in Figure 4. As Figure 4 shows,
the accuracy remained above 94% even as the number
of primitives, attributes and texts in the
expressions increased. The error rate stayed below
6%. The errors arise because, as documents become
more complex, the number of nodes at each level of
the XML document tree increases. Once the bit
stream that expresses the structural information of
a node exceeds 64 bits, query processing operations
either fail or complete successfully but return
inappropriate results.
Figure 4: Accuracy of retrieval result.
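The 64-bit limit described above can be illustrated with a minimal
sketch. The encoding used here is an assumption made only for
illustration: each tree level is given a fixed-width code just wide
enough to number the nodes at that level, and the codes are
concatenated into one bit stream per node. The names bits_per_level,
stream_length and fits_in_word are hypothetical and do not come from
the paper.

```python
WORD_BITS = 64  # the limit stated in the text


def bits_per_level(nodes_at_level: int) -> int:
    """Bits needed to number nodes 1..nodes_at_level at one level.

    Assumed encoding: a fixed-width code per level, with code 0 unused.
    """
    if nodes_at_level < 1:
        raise ValueError("a level must contain at least one node")
    return nodes_at_level.bit_length()


def stream_length(nodes_by_level: list[int]) -> int:
    """Total length of the concatenated per-level codes for one node."""
    return sum(bits_per_level(n) for n in nodes_by_level)


def fits_in_word(nodes_by_level: list[int]) -> bool:
    """True if the bit stream still fits in a single 64-bit word."""
    return stream_length(nodes_by_level) <= WORD_BITS


# Depth 10, as in the experiment: 63 nodes per level needs 6 bits per
# level, while 64 nodes per level needs 7 bits per level.
small = [63] * 10
large = [64] * 10
print(stream_length(small), fits_in_word(small))
print(stream_length(large), fits_in_word(large))
```

Under this assumed encoding, one extra node per level at depth 10 is
enough to push the stream past 64 bits, which matches the failure mode
described for more complex documents.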
3.2 Response Time Experiment of
Query Processing
To measure the response time of query processing,
the queries and Shakespeare's plays (Bosak
Shakespeare Collection) used in the comparison of
INRIA and XRel in Yoshikawa and Amagasa (2001)
were adopted as the experiment data and queries.
The complexity of the queries increases from Q1
to Q6.
Figure 5: Response time of retrieval result.
As shown in Figure 5, the existing INRIA and XRel
methods perform join operations to find the
results of the given queries. In the query method
proposed in this paper, by contrast, the queries are
converted into optimized forms before the index
file is accessed. The response times therefore
differ greatly from those of INRIA and XRel,
because no joins between tables, which require
database access, take place.
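The contrast between join-based evaluation and a lookup on a
pre-converted query can be sketched as follows. This is only an
illustration under stated assumptions: the toy edge table, the
path-keyed dictionary and the function names are hypothetical, and the
dictionary stands in for the index file, whose actual format is not
shown in this section.

```python
# Hypothetical toy edge table: (parent_id, node_id, tag). Id 0 is a
# virtual root above the document element.
EDGES = [
    (0, 1, "PLAY"),
    (1, 2, "ACT"),
    (1, 3, "ACT"),
    (2, 4, "SCENE"),
    (3, 5, "SCENE"),
    (4, 6, "SPEECH"),
]


def query_with_joins(steps):
    """Resolve a path one step at a time.

    Each step joins the current set of nodes against the edge table,
    so a path of n steps costs n joins.
    """
    frontier = {0}
    joins = 0
    for tag in steps:
        frontier = {node for parent, node, t in EDGES
                    if parent in frontier and t == tag}
        joins += 1
    return sorted(frontier), joins


def build_path_index(edges):
    """Map each full root-to-node path to the ids of matching nodes."""
    parent_of = {node: (parent, tag) for parent, node, tag in edges}
    index = {}
    for node in parent_of:
        tags, cur = [], node
        while cur != 0:
            cur, tag = parent_of[cur]
            tags.append(tag)
        index.setdefault("/" + "/".join(reversed(tags)), []).append(node)
    return index


def query_with_index(index, steps):
    """Convert the query to a single key, then do one lookup, no joins."""
    return sorted(index.get("/" + "/".join(steps), []))


PATH_INDEX = build_path_index(EDGES)
print(query_with_joins(["PLAY", "ACT", "SCENE"]))
print(query_with_index(PATH_INDEX, ["PLAY", "ACT", "SCENE"]))
```

Both functions return the same nodes, but the first needs one join per
path step while the second resolves the whole path with a single
lookup once the index has been built.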
4 CONCLUSIONS
The index and query schema is a method of
searching the structural information and data of
XML documents and storing them in a storage medium,
such as a database, that can manage a large amount
of information.
However, when XML documents are changed,
the numbers assigned to the XML document trees must
be changed too. This leads to two disadvantages:
one is that the bit stream values must be changed in
that case, and the other is that search is possible
only after the bit streams have been completed by
visiting all the nodes of the XML documents. A
further disadvantage is that search time increases
as the index file grows larger. Therefore, future
work is needed on dynamic bit stream models in which
the bit streams do not change even when the XML
trees change, and on search techniques that make it
possible to search structures without visiting all
the nodes of the XML document trees. In addition, a
study on extending the index file structure is
necessary to cope with the problem that the database
table grows large as the structural information
needed for XML document indexing increases.
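The renumbering cost noted above can be seen in a small sketch. It
assumes plain preorder numbering rather than the paper's bit stream
encoding, and the tree contents and function names are hypothetical;
the point is only that a single insertion changes the numbers of every
node that follows it in document order.

```python
def preorder_numbers(tree, root):
    """Assign 1-based preorder numbers by visiting every node.

    tree maps a node name to the ordered list of its child names.
    """
    numbers = {}
    stack = [root]
    while stack:
        node = stack.pop()
        numbers[node] = len(numbers) + 1
        # Push children in reverse so the leftmost child is visited first.
        stack.extend(reversed(tree.get(node, [])))
    return numbers


def renumbered(before, after):
    """Names of pre-existing nodes whose number changed."""
    return sorted(n for n in before if before[n] != after[n])


tree = {
    "PLAY": ["TITLE", "ACT1", "ACT2"],
    "ACT1": ["SCENE1"],
    "ACT2": ["SCENE2"],
}
before = preorder_numbers(tree, "PLAY")

# Insert one new element right after TITLE.
changed_tree = dict(tree, PLAY=["TITLE", "PERSONAE", "ACT1", "ACT2"])
after = preorder_numbers(changed_tree, "PLAY")

print(renumbered(before, after))
```

Here one inserted element forces four of the six existing nodes to be
renumbered, and the numbering itself requires a full traversal of the
tree.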
REFERENCES
Tuong Dao, 1998. An Indexing Model for Structured
Document to Support Queries on Content, Structure
and Attributes. In Proceedings of IEEE ADL. pp. 88-97.
Tova Milo, Dan Suciu, 1999. Index Structures for Path
Expressions. In ICDT. pp. 277-295.
C. Zhang, J. Naughton, D. DeWitt, et al., 2001. On
Supporting Containment Queries in Relational
Database Management Systems. In ACM SIGMOD.
Shu-Yao Chien, Zografoula Vagena, Donghui Zhang, et al.,
2002. Efficient Structural Joins on Indexed XML
Documents. In VLDB. pp. 263-274.
Masatoshi Yoshikawa, Toshiyuki Amagasa, 2001. XRel:
A Path-Based Approach to Storage and Retrieval of
XML Documents Using Relational Databases. In ACM
TOIT. pp. 110-141.
Chin-Wan Chung, Jun-ki Min, Kyu-seok Shim, 2002.
APEX: An Adaptive Path Index for XML Data.
In SIGMOD. pp. 121-132.
D. Florescu, D. Kossmann, 1999. A Performance Evaluation
of Alternative Mapping Schemes for Storing XML
Data in a Relational Database. In Technical Report of
INRIA.
IMPLEMENTATION OF INDEX SCHEMA FOR XML DOCUMENTS BASED ON STRUCTURE OF DATABASE