Towards a Comprehensive XML Benchmark
Fatima Maktoum Al-Sedairi, Mohammed Al-Badawi and Abdallah Al-Hamdani
Department of Computer Science, Sultan Qaboos University, Muscat, Oman
Keywords: XML, XML Benchmark, XQuery Performance, XML/RDBMS Mapping.
Abstract: XML benchmarks are tools for measuring and evaluating the performance of new XML developments
such as XML/RDBMS/OO mapping techniques and XML storage schemes. With XML benchmarks, the evaluation
is done by executing a predefined query-set over the members of the benchmark's dataset, and the
performance of the new development is compared against that of existing techniques. Yet,
none of the existing XML benchmarks seems to have directly investigated the effect of the location of the sought data on
query performance. This research investigates the rationale of adding a Data Dimension (DD)
to the features of the 3D~XBench. To achieve this, a new set of queries was added to the query-set of the
3D~XBench to test the effect of changing the location of the sought records. The preliminary experimental
results show that query execution time is also driven by the location of the sought data in the
underlying XML database; therefore, the Data Dimension can be added to the existing features of the 3D
XML Benchmark.
1 INTRODUCTION
XML databases today offer an effective way
to represent the dramatic increase in Web data
(Roy and Ramanujan, 2000). As a result, developing
efficient XML technologies for representing this
data is crucial, and has recently become the focus of
many researchers. Thus, building efficient and
comprehensive evaluation tools for new developments
has become an increasingly important requirement
in the field of XML database management.
Specifically, XML benchmarks are devised to
mimic and test the capabilities of particular types of
XML database management systems based on a
certain real-world scenario, producing performance
results that are useful and essential for improving
DBMS technologies. Each of these benchmarks
addresses a number of distinct issues related
to the performance testing and evaluation of new
XML developments (i.e., to test either an
application's overall performance or to evaluate
individual XML functionalities of a specific XML
implementation). Therefore, each benchmark is
representative of only some applications of XML
technology. However, most benchmarks focus on
testing data management systems and query
engines (Nicola et al., 2007; Mlynkova, 2008).
In terms of functionality, XML benchmarks
offer a set of queries, each of which is designed to test
a particular primitive of the query processor or
storage engine (Nicola et al., 2007; Mlynkova, 2008;
Sakr, 2010). To this end, researchers have
aimed to use a comprehensive set of queries that
covers the major aspects of query processing, so as to obtain
reliable results. Yet, none of the existing XML
benchmarks seems directly and explicitly concerned
with testing the effect of the location of the
sought data on query performance. This research
extends the 3D XML Benchmark
(3D~XBench (Al-Badawi et al., 2010)) to cover this
aspect.
The rest of this paper is organized as follows.
Section 2 presents a brief review of some of the
existing XML benchmarks, while Section 3 describes
the 3D~XBench in more detail. Section 4 formulates
the 3D~XBench extension, and Section 5 provides
some experimental results testing the effectiveness of
the new extension. The paper is concluded in Section
6.
2 RELATED WORK
The rise of XML's importance and the dramatic
increase in the number of new XML techniques
have raised the need for tools to evaluate those new
developments. XML benchmarks are designed to
simulate and test the capabilities of particular types of
XML data management systems based on a certain
real-world scenario, in order to obtain performance
results that are useful and essential for improving
DBMS technologies.
Several XML benchmarks have been proposed in
the literature to help both researchers and users
compare XML databases independently. Each of
these benchmarks addresses a number of distinct
issues: it tests either an application's overall
performance or individual XML functionalities of a
specific XML implementation. As a result, each
benchmark targets some applications of XML
technology. However, most of them focus on
testing data management systems and query engines.
To this end, the benchmarks offer a set of queries,
each of which is intended to challenge a particular
primitive of the query processor or storage engine.
Thus, researchers aim to use a comprehensive set of
queries covering the major aspects of query
processing. Moreover, the overall workload consists
of scalable XML databases with specific aspects
reflecting the testing parameters of each benchmark.
Generally, XML benchmarks should be simple,
portable and scalable to different workload sizes, and
they should allow objective comparisons of competing
systems and tools (Gray, 1993). Since XML applications
are diverse and complex, developing meaningful and
realistic benchmarks for XML is truly a big
challenge for XML researchers. In addition, XML
processing tools fall into many categories, from
simple storage services to sophisticated query
processors, and this adds to the complexity of
developing relevant and realistic XML benchmarks.
Traditionally, research on XML benchmarks
tends to compare newly proposed XML benchmarks
with existing ones, as well as to analyse the behaviour
of a particular benchmark with various types of data.
The literature also compares some of the existing
XML benchmarks (Nicola et al., 2007; Mlynkova,
2008; Al-Badawi et al., 2010; Sakr, 2010). The
dataset and the query set are the main criteria
considered by all these comparisons, such as those in
(Al-Badawi et al., 2010; Nicola et al., 2007).
Furthermore, they investigate the benchmarks from
different aspects, such as the benchmark type (micro
or application level), the number of users,
applications, schema and the key parameters of the
test data.
Finally, in terms of existing XML benchmarks,
the set includes XMark (Schmidt et al., 2002), XOO7
(Bressan et al., 2003), XBench (Yao et al., 2004),
XMach~1 (Böhme and Rahm, 2001), MBench
(Runapongsa et al., 2006), XPathMark (Franceschet,
2005), MemBeR (Afanasiev et al., 2005) and TPoX
(Nicola et al., 2007). Recently, some new benchmarks
were added to the list, including the 3D~XBench (Al-
Badawi et al., 2010), EXRT (Carey et al., 2011) and
Renda-RX (Zhang et al., 2011).
3 THE 3D XML BENCHMARK
The 3D~XBench was proposed to test the effect of
three aspects of an XML document on XML query
performance: the depth, breadth and size of the
underlying XML database, and their reflection in the
XQuery syntax; hence the name 3D~XBench. The
depth defines the number of levels in the XML tree,
while the breadth represents the average fan-out of
the XML nodes. The size is measured by the number
of nodes in the XML tree and is mainly used in
scalability testing.
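To make the three dimensions concrete, they can be computed from a parsed document roughly as in the following sketch (using Python's standard xml.etree.ElementTree; the exact counting rules the benchmark uses, e.g. whether attributes and text nodes count towards the size, are assumptions here):

```python
import xml.etree.ElementTree as ET

def tree_metrics(path):
    """Compute the three 3D~XBench dimensions of an XML document:
    size (element count), depth (number of levels) and breadth
    (average fan-out of the non-leaf elements)."""
    root = ET.parse(path).getroot()
    size, max_depth = 0, 0
    fanouts = []
    stack = [(root, 1)]               # (element, level)
    while stack:
        node, level = stack.pop()
        size += 1
        max_depth = max(max_depth, level)
        children = list(node)
        if children:                  # only non-leaf nodes contribute a fan-out
            fanouts.append(len(children))
            stack.extend((c, level + 1) for c in children)
    avg_breadth = sum(fanouts) / len(fanouts) if fanouts else 0.0
    return size, max_depth, avg_breadth

# Example: size, depth, breadth = tree_metrics("dblp.xml")  # hypothetical file
```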
Like other benchmarks, the 3D~XBench
framework is based on executing a set of pre-defined
XML queries over a number of XML databases that
are carefully selected to reflect the three XML aspects
mentioned above. The following two subsections
describe the 3D~XBench's dataset and query-set
respectively.
3.1 Dataset
Three databases from different sources, either
real or synthetic, are used in the benchmark. The
DBLP (DBLP, 2014) and TreeBank (PennProj, 2014)
datasets are real databases, while the XMark (Schmidt
et al., 2002) dataset is synthetic (code-generated).
The base members of the dataset (the original XML
databases) are versioned two more times, at 50% and
25% of the base database, to vary the database-size
dimension. The other two dimensions vary naturally
due to the nature of the chosen databases, which were
intentionally and carefully selected to reflect the
depth and breadth dimensions.
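The paper does not state how the 50% and 25% versions were derived; one plausible way, shown purely as an illustration, is to keep a document-order prefix of the root's subtrees until a node budget is reached:

```python
import xml.etree.ElementTree as ET

def count_nodes(elem):
    return 1 + sum(count_nodes(c) for c in elem)

def scale_version(in_path, out_path, fraction):
    """Write a reduced copy of an XML document containing roughly
    `fraction` of the base document's element nodes. Subtrees of the
    root are kept in document order until the budget is exhausted."""
    tree = ET.parse(in_path)
    root = tree.getroot()
    budget = int(count_nodes(root) * fraction)
    kept, used = [], 1                       # the root itself counts as one node
    for child in list(root):
        n = count_nodes(child)
        if used + n > budget:
            break
        kept.append(child)
        used += n
    for child in list(root):                 # drop everything outside the prefix
        if child not in kept:
            root.remove(child)
    tree.write(out_path, encoding="utf-8", xml_declaration=True)

# scale_version("dblp.xml", "dblp_50.xml", 0.50)  # hypothetical file names
```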
Figure 1 and Table 1 depict how the three
dimensions vary over the benchmark's dataset.
Table 1: Characteristics of the 3D~XBench's Dataset.

           Size (Base DB)   Max. Depth   Avg. Breadth
DBLP       2,439,294        6            11
XMark      2,437,669        11           6
TreeBank   2,437,667        36           3
TowardsaComprehensiveXMLBenchamrk
113
Figure 1: The 3D~XBench Architecture Design (adapted
from Al-Badawi et al., 2010).
3.2 Query Set Design
3D~XBench adopts the query-set used by the XMark
benchmark (Schmidt et al., 2002). XMark's 20
queries are grouped into 14 categories, each of which
targets a specific database-querying functionality. Out
of XMark's 20 queries, 3D~XBench adopts only 10,
drawn from the following categories:

- Exact Matching
- Ordered Access
- Path Traversal
- Sorting
- Aggregation
- Regular Path Expressions
- Missing Elements
4 THE EXTENSION
None of the listed benchmarks, including the
3D~XBench, has investigated the effect of the
location of the sought data on query performance.
This research is the first step in that direction.

The 3D~XBench extension adds a new set of
queries to the 3D~XBench's query-set to test the
effect of changing the location of the sought records.
The Data Dimension, when linked with the other
features of the 3D~XBench, is expected to strengthen
the benchmark's testing capabilities and make it a
more comprehensive testing model than any found in
the literature. Figure 2 illustrates the new extension
graphically.
The main idea behind the extension (the Data
Dimension) is to divide the base dataset into three
pre-set zones and to test query performance across
these three different locations. The zones are
determined by specific ranges of nodes measured
from the root node: the first zone of each database is
restricted to the first 30% of the database, the second
zone spans from 45% to 75%, and the third zone
comes after 90%. Each query category is executed
over the three zones (ranges) to test the effect of the
Data Dimension.
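Assuming nodes are numbered 1..N in document order (as is typical when XML is shredded into relations), the three zones translate into node-id ranges roughly as follows; this is a sketch, since the benchmark's exact numbering scheme is not specified in the paper:

```python
def zone_ranges(total_nodes):
    """Map the three pre-set zones to inclusive node-id ranges, assuming
    node ids 1..total_nodes are assigned in document order.
    Zone 1: first 30%; Zone 2: 45%-75%; Zone 3: after 90%."""
    return {
        1: (1, int(total_nodes * 0.30)),
        2: (int(total_nodes * 0.45) + 1, int(total_nodes * 0.75)),
        3: (int(total_nodes * 0.90) + 1, total_nodes),
    }

# Example for the DBLP base database (2,439,294 nodes):
# zone_ranges(2439294)[1] -> (1, 731788)
```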
Adding the Data Dimension leads to almost the
same testing requirements (framework) as for the
original dimensions. For example, the dataset
remained the same, while the query set was altered to
include the new dimension (i.e. the Data Dimension).
As a result, the query-set selection considered only
those queries for which the location of the sought data
matters. In addition, some queries were modified to
fit the new specifications.
Figure 2: Visualization of the New Extension of the 3D
XML Benchmark.
5 EXPERIMENTAL RESULTS
To evaluate the effectiveness of the new
extension on the benchmark's testing environment,
the extended 3D~XBench was used to compare two
representative mapping techniques from the
literature and to observe whether the benchmark's
new extension produces consistent performance over
different versions of the XML queries.
5.1 Mapping Techniques Selection
This research followed the same evaluation process
used by (Al-Badawi et al., 2010) to evaluate the
3D~XBench when it was first introduced. In that
process, the evaluation is based on the XML/RDBMS
mapping environment and selects a set of mapping
techniques representative of the existing ones in the
literature. The selected set includes the Edge
(Florescu and Kossmann, 1999) and XParent (Jiang
et al., 2002) mapping techniques, representing the
single-relation and multiple-relations mapping
approaches respectively; these two techniques were
also used by (Al-Badawi et al., 2010). The relational
schema of each mapping technique was implemented
in the FoxPro database engine, and all dataset members
(9 databases in total; 3 versions of each of DBLP,
XMark and TreeBank) were mapped to both
schemas.
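For orientation, the Edge approach stores the entire document as rows of a single edge table, whereas XParent spreads paths, elements and values over several tables. The sketch below illustrates only the single-relation idea; the (source, ordinal, name, flag, target) column layout follows the Edge paper, but the loader itself is our illustration, not the benchmark's code:

```python
import itertools
import xml.etree.ElementTree as ET

def shred_edge(path):
    """Shred an XML document into Edge-style tuples
    (source, ordinal, name, flag, target): a 'ref' edge points to the
    id of a child element, a 'val' edge carries a text value directly."""
    root = ET.parse(path).getroot()
    ids = itertools.count()
    rows = []

    def walk(elem, elem_id):
        for ordinal, child in enumerate(elem):
            child_id = next(ids)              # ids follow document order
            rows.append((elem_id, ordinal, child.tag, "ref", child_id))
            walk(child, child_id)
        text = (elem.text or "").strip()
        if text:                              # text content becomes a value edge
            rows.append((elem_id, len(elem), "text()", "val", text))

    walk(root, next(ids))                     # the root element gets id 0
    return rows                               # bulk-insert into the single Edge table
```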
5.2 Query-set Selection
The previous evaluation of the base
3D~XBench used 10 queries (expressed in XQuery
syntax, as imported from (Schmidt et al., 2002)),
divided into 7 categories. The new evaluation adopts
a subset of those categories whose workload can be
affected by the location of the sought data. The
experiment therefore considered 2 categories with 2
different queries from each: the exact-matching
category (shallow and deep) and the join-on-value
category (join, and join with filter).

The 4 queries in the query-set were translated over
the 9 dataset members using the 2 selected mapping
techniques, and three versions of each query, each
targeting a specific range in the underlying database,
were produced. The query-set thus grows to 4 queries
× 3 ranges × 9 dataset members × 2 mapping
techniques = 216 query instances in total.
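Given the zones of Section 4, each translated query can be specialised into its three range versions by constraining the node identifiers it touches. A hedged illustration over an Edge-style schema (the table alias, column names and placeholder convention are ours, not the paper's):

```python
def zone_versions(sql_template, total_nodes):
    """Return the three zone-restricted versions of a translated query.
    `sql_template` must contain a {range} placeholder where the node-id
    predicate belongs."""
    zones = {                                  # same zone bounds as in Section 4
        1: (1, int(total_nodes * 0.30)),
        2: (int(total_nodes * 0.45) + 1, int(total_nodes * 0.75)),
        3: (int(total_nodes * 0.90) + 1, total_nodes),
    }
    return {z: sql_template.format(range=f"e.target BETWEEN {lo} AND {hi}")
            for z, (lo, hi) in zones.items()}

# Hypothetical use with an Edge-style exact-matching query:
# q = "SELECT * FROM edge e WHERE e.name = 'author' AND {range}"
# versions = zone_versions(q, 2439294)   # one instance per zone
# 4 queries x 3 zones x 9 databases x 2 mappings = 216 instances in total
```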
5.3 Execution Conditions
The experiments were conducted on a stand-alone PC
(Intel® Xeon® CPU, 2.93 GHz, 6 GB RAM) running
Windows 7 (64-bit). All XML databases and
XQuery queries were translated to the FoxPro
relational environment, and each query was executed
20 times against the database concerned, with the
execution time recorded in milliseconds each time.
For validity, the experiment considered the average
of the middle 18 readings.
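The averaging rule (discard the fastest and slowest of the 20 runs and average the middle 18, i.e. a trimmed mean) can be reproduced with a small harness such as this sketch, where run_query stands in for whatever submits the translated SQL to the FoxPro engine (an assumption here):

```python
import time

def timed_average(run_query, runs=20, drop_each_end=1):
    """Execute `run_query` `runs` times and return the mean elapsed time
    in milliseconds over the middle readings (extremes dropped)."""
    readings = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()                      # stand-in for executing one benchmark query
        readings.append((time.perf_counter() - start) * 1000.0)
    readings.sort()
    middle = readings[drop_each_end:len(readings) - drop_each_end]
    return sum(middle) / len(middle)     # with runs=20, averages the middle 18
```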
5.4 Preliminary Results
Due to space restrictions, this section presents only
a subset of the results obtained from the above
experiment. These results are illustrated by the
diagrams given at the end of this paper, along with a
short discussion as follows.
Figure 3 shows that the Shallow Exact Matching
query was slower in the first range over the shallow
(DBLP) and average (XMark) databases under the
single-relation mapping technique, and slightly faster
(by about one millisecond) in both databases over the
other two ranges. However, the Shallow Exact
Matching query was faster in the first range over the
deep database than in the other two ranges. In general,
it seems that increasing the Data Dimension has the
opposite impact on the Shallow Exact Matching query
as far as the single-relation mapping technique is
concerned.

As in the single-relation mapping technique, DD
had the same effect over the deep and shallow
databases under the multiple-relations mapping
technique (Figure 7). However, DD produced an
inconsistent change over the average database.
Generally, in all cases DD does not seem to affect
the Shallow Exact Matching query much over any of
the databases. More generally, DD seems to have
less effect on the single-relation mapping technique
than on the multiple-relations mapping technique, as
illustrated in Figures 3 and 7.
Concerning the Deep Exact Matching query,
Figure 4 clearly shows that this query gets slower in
later ranges over the wide and deep XML databases,
but faster over the average width/breadth database.
However, the difference in query performance was
much clearer over the deep XML database (i.e. 39-
35=4, 45-39=6), while it was very narrow over the
wide and average-width databases (i.e. only one unit).

On the other hand, Figure 8 shows the effect of DD
on the Deep Exact Matching query over the base
databases (DBLP, XMark, TreeBank) under the
multiple-relations mapping technique (XParent). The
figure shows an inconsistent change over the average
and shallow databases, and a small decrease in
elapsed time over the deep database. In brief, DD has
a minor effect on this query type over all database
categories for both mapping techniques.
In terms of Join on Value queries, Figure 5
presents the effect of DD over the base databases
(DBLP, XMark, TreeBank) under the single-relation
mapping technique (Edge). It shows that DD caused
a consistent increase in elapsed time over the shallow
and average-width databases, while the change in
elapsed time over the deep database was inconsistent,
as shown in the figure.

Similarly, Figure 9 shows the effect of DD on the
Join on Value query over the base databases (DBLP,
XMark, TreeBank) using the multiple-relations
mapping technique (XParent). The effect of DD on
this query was minor over the deep database, while it
produced inconsistent elapsed times over the
average-width database, as illustrated in the figure.
The query time grew by a few units as the search
range advanced over the wide XML database.
Finally, Figures 6 and 10 show the elapsed time of
the "Join on Value with Range Filter" query. The
elapsed time of this query type increased over the
shallow databases under both mapping techniques.
This also held for the average-depth database,
TowardsaComprehensiveXMLBenchamrk
115
but only under the multiple-relations mapping
technique. However, the query performance over the
deep database seems largely unaffected for this query
type under both mapping techniques, as seen in
Figures 6 and 10.
Figure 3: Edge, Base DB, Shallow Exact Matching.
Figure 4: Edge, Base DB, Deep Exact Matching.
Figure 5: Edge, Base DB, Join on Value.
Figure 6: Edge, Base DB, Join on Value with Filter.
Figure 7: XParent, Base DB, Shallow Exact Matching.
Figure 8: XParent, Base DB, Deep Exact Matching.
Figure 9: XParent, Base DB, Join on Value.
Figure 10: XParent, Base DB, Join on Value with Filter.
6 CONCLUSIONS
This paper discussed the rationale of extending the
functionalities of the 3D XML Benchmark (Al-
Badawi et al., 2010) by adding a new feature that
concerns the effect of the location of the sought
data on query performance.
To evaluate the extension, a new experiment was
conducted using the same datasets as in the original
3D~XBench, but with an expanded query set that
includes queries testing the effect of the DD. The
experiment used two representative mapping
techniques (one single-relation and one multiple-
relations mapping technique).
The experimental results show that the Data
Dimension (DD) has a significant influence on the
query elapsed time with respect to the database
structure (depth, breadth, size) and the query
categories. The performance of the different mapping
approaches (single vs. multiple relations) is also
affected by DD. Thus, DD can be included as the
fourth dimension of the 3D~XBench.
Further research can be carried out in different
directions. First, one can expand the evaluation
process to test the effect of the DD on other query
types introduced in (Schmidt et al., 2002), such as
path traversal, ordered access, sorting, aggregation,
missing elements and others. A further evaluation of
the new extension can also consider measuring other
experimental variables such as CPU usage, memory
consumption and I/O operations. Moreover, the
experiment can be conducted over different mapping
techniques, such as PACD, and/or over native XML
databases.
ACKNOWLEDGEMENTS
This research is supported by The Research Council
(Oman) under the Grant RC/SCI/COMP/12/01.
REFERENCES
Afanasiev, L., Manolescu, I., and Michiels, P., 2005.
MemBeR: A Micro-Benchmark Repository for
XQuery. In Proceedings of 3rd International XML
Database Symposium, LNCS. Springer-Verlag.
Al-Badawi, M., North, S., and Eaglestone, B., 2010. The
3D XML Benchmark. In Proceedings of WEBIST'10,
pp. 13-20, Valencia, Spain.
Böhme, T. and Rahm, E., 2001. XMach-1: A Benchmark for
XML Data Management. In Datenbanksysteme in
Büro, Technik und Wissenschaft, 9. GI-Fachtagung,
pp. 264-273, Springer-Verlag, London, UK.
Bressan, S., Lee, M-L., Li, Y.G., Lacroix, Z. and Nambiar,
U., 2003. The XOO7 benchmark. In Proceedings of
VLDB 2002 Workshop EEXTT and CAiSE 2002
Workshop, pp.146–147, London, UK.
Carey, M. J., Ling, L., Nicola, M., and Shao, L., 2011. EXRT:
Towards a Simple Benchmark for XML Readiness Testing.
In Performance Evaluation, Measurement and
Characterization of Complex Systems, pp. 93-109.
Springer Berlin Heidelberg.
DBLP, 2014. The DBLP Website. Available at
http://dblp.uni-trier.de/, [Accessed on: 13/11/2014].
Franceschet, M., 2005. XPathMark: An XPath Benchmark
for XMark Generated Data. In Proceedings of the
International XML Database Symposium (XSYM).
Florescu, D., and Kossmann, D., 1999. A Performance
Evaluation of Alternative Mapping Schemes for Storing
XML Data in a Relational Database. Technical Report,
INRIA Rocquencourt, France.
Gray J., 1993. The Benchmark Handbook for Database and
Transaction Systems. Morgan Kaufmann, San
Francisco, CA, USA, ISBN 1-55860-292-5.
Jiang, H., Lu, H., Wang, W., and Yu, J., 2002. XParent: An
Efficient RDBMS-Based XML Database System. In
Proceedings of the International Conference on Data
Engineering (ICDE), pages 1-2, CA, USA.
Mlynkova, I., 2008. XML Benchmarking. In Proceedings of
the IADIS International Conference Informatics, pages 59-66.
Nicola, M., Kogan, I., and Schiefer, B., 2007. An XML
Transaction Processing Benchmark. In Proceedings of
the ACM SIGMOD International Conference on
Management of Data, pp. 937-948.
PennProj, 2014. The Penn Treebank Project. Available
online at http://www.cis.upenn.edu/~treebank/,
[Accessed on: 13/11/2014].
Roy, J., and Ramanujan, A., 2000. XML: data's universal
language. IT Professional, 2(3), 32-36.
Runapongsa, K., Patel, J. M., Jagadish, H., Chen, Y., and Al-
Khalifa, S., 2006. The Michigan Benchmark: Towards
XML Query Performance Diagnostics. Information
Systems, 31(2), pages 73-97.
Sakr, S., 2010. Towards a comprehensive assessment for
selectivity estimation approaches of XML queries.
International Journal of Web Engineering and
Technology, 6(1), 58-82.
Schmidt, A., Waas, F., Kersten, M., Carey, M. J., Manolescu,
I., and Busse, R., 2002. XMark: A Benchmark for XML
Data Management. In Proceedings of the International
Conference on Very Large Data Bases (VLDB),
Hong Kong, pages 1-12.
Yao, B. B., Ozsu, M. T., and Khandelwal, N., 2004.
XBench Benchmark and Performance Testing of XML
DBMSs. In Proceedings of the 20th IEEE International
Conference on Data Engineering (ICDE), pp. 621-632.
Zhang, X., Liu, K., Zou, L., Du, X., and Wang, S., 2011.
Renda-RX: A Benchmark for Evaluating XML-Relational
Database Systems. LNCS, Volume 6897, pp. 578-589.
TowardsaComprehensiveXMLBenchamrk
117