2 RELATED WORK
We assume a distributed XML processing paradigm,
where intermediate nodes are capable of XML processing.
Prior work supporting this paradigm includes the following.
Active Network (Tennenhouse and Wetherall, 2007) focuses
on processing at intermediate switches. Processing is
assumed to take one of two forms: (type-1) the processing
is encapsulated into a packet, or (type-2) it is assigned
to network switches beforehand. The type-1 manner supports
only simple processing, but executes quickly because it
runs in hardware. The type-2 manner, which is more similar
to our system, is able to process more complex XML tasks.
VNode (Y. Kanada and Nakao, 2012) also provides
a processing environment at intermediate switches.
In that work, the processing function is provided at
customized switches, and the processing environment is
virtualized using a virtual machine. These studies provide
not only transport functions but also processing functions
in the network.
Next, we review current research on specific kinds of
processing in networks. In transcoding (S. H. Kim and
Ro, 2012), a content server delivers data (e.g., video
data) to clients via a transcoding server. The transcoding
server transforms the original data into data that
reflects the user’s demands. For instance, the data may be
transformed from high resolution to low resolution at
the transcoding server to adapt to mobile devices.
Another form of intermediate node processing is the cache
server (S. Nishimura and Ikenaga, 2012; Kalarani and Uma,
2013), a key technology of content delivery networks.
Cache servers are deployed at widely distributed locations
and cache content from other content servers, driven by
user request patterns. Upon a user’s request, data is
delivered from the nearest cache server, leading to lower
network latency. (Fan and Chen, 2012; Solis and Obraczka,
2006) focus on sensor networks. These studies propose
consolidating the large amount of sensing data at some
intermediate nodes before the data reaches data collection
servers. Such an approach can reduce energy consumption
and network load for mobile sensor devices. (M. Shimamura
and Tsuru, 2010) studies the compression of packets near
a sender, expanding the packets near a receiver during
buffer queueing time to achieve better network resource
utilization. In these
works, we see that the network provides special func-
tions such as video transformation, data caching and
so on for specific services.
Finally, few papers have addressed the network routing
problem for efficient XML processing. (Ziyaeva
and Min, 2008) addresses the problem of routing
XML content to appropriate recipients within an En-
terprise Service Bus (ESB), where specific XML pro-
cessing is required. However, in contrast to our work,
XML processing is executed only at the recipient’s
site. In (Wang and Ozsu, 2007), the authors address
the problem of routing XML queries in an efficient
way over a large Peer-to-Peer network. The objec-
tive there is to best satisfy XML queries at destination
nodes, as opposed to XML execution at intermediate
nodes.
3 DISTRIBUTED XML PROCESSING
Distributed XML processing requires some basic
functions to be supported:
• Document Partition: The XML document is di-
vided into fragments, to be processed at process-
ing nodes.
• Document Annotation: Each document frag-
ment is annotated with current processing status
upon leaving a processing node.
• Document Merging: Document fragments are
merged so as to preserve the original document
structure.
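The three functions above can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the function names (partition, annotate, merge), the fragment fields (position, status, tail, xml), and the choice to split the document at the children of the root element are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def partition(xml_text):
    """Split a document into one fragment per child of the root element.

    Splitting at the root's children is an assumed granularity.
    """
    root = ET.fromstring(xml_text)
    # The skeleton keeps what is needed to rebuild the root element later.
    skeleton = {"tag": root.tag, "attrib": dict(root.attrib), "text": root.text}
    fragments = []
    for position, child in enumerate(root):
        # Keep trailing text out of the serialized fragment; store it separately.
        tail, child.tail = child.tail, None
        fragments.append({
            "position": position,     # original order, needed for merging
            "status": "unprocessed",  # annotation carried with the fragment
            "tail": tail,
            "xml": ET.tostring(child, encoding="unicode"),
        })
    return skeleton, fragments


def annotate(fragment, status):
    """Return a copy of the fragment carrying its current processing status."""
    return {**fragment, "status": status}


def merge(skeleton, fragments):
    """Reassemble fragments in their original order under the original root."""
    root = ET.Element(skeleton["tag"], skeleton["attrib"])
    root.text = skeleton["text"]
    for fragment in sorted(fragments, key=lambda f: f["position"]):
        element = ET.fromstring(fragment["xml"])
        element.tail = fragment["tail"]
        root.append(element)
    return ET.tostring(root, encoding="unicode")
```

Because each fragment records its original position, fragments may arrive at the merging node in any order and the original document structure is still preserved.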
XML processing nodes support some of these tasks,
according to their role in the distributed XML system.
XML document processing uses a stack data structure
for tag processing. When a node reads a start tag, it
pushes the tag name onto the stack. When a node reads
an end tag, it pops the top element from the stack and
compares the end tag name with the popped tag name.
If both tag names are the same, the tags match. The XML
document is well-formed when all tags match. In addition,
for validation checking, each node executing grammar
validation reads DTD files and generates grammar rules
from them. Each node checks validity and well-formedness
at the same time, comparing the pushed and popped tags
against the grammar rules. Details of this distributed
node processing are described in (Cavendish and
Candan, 2008; Y. Uratani and Oie, 2012).
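The stack-based check described above can be sketched in Python as follows. This is a simplified illustration, not the processing code of the cited works. The regular-expression tokenizer ignores comments, CDATA sections and attribute values containing angle brackets, and the optional allowed_children table is a hypothetical stand-in for DTD-derived grammar rules: it records only which child tags may appear under a parent, not their order or cardinality.

```python
import re

# Matches start tags, end tags and empty-element tags (simplified tokenizer).
TAG = re.compile(r"<(/?)([A-Za-z_][\w.:-]*)[^<>]*?(/?)>")


def is_well_formed(xml_text, allowed_children=None):
    """Check tag matching with a stack; optionally check parent/child rules."""
    stack = []
    for match in TAG.finditer(xml_text):
        closing, name, self_closing = match.groups()
        if closing:
            # End tag: pop the top element and compare the tag names.
            if not stack or stack.pop() != name:
                return False
            continue
        # Start tag: check it against the (assumed) rules for its parent.
        if allowed_children is not None and stack:
            if name not in allowed_children.get(stack[-1], set()):
                return False
        # An empty-element tag such as <c/> opens and closes at once.
        if not self_closing:
            stack.append(name)
    # Every start tag must have been matched by an end tag.
    return not stack
```

For example, is_well_formed("<a><b></a></b>") returns False because the end tag a does not match the popped tag b, while is_well_formed("<a><c/></a>", {"a": {"b"}}) returns False because c is not an allowed child of a under the assumed rules.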
3.1 XML Routing
An overlay XML network presents networking nodes
with various XML processing capacities. One way
to capture node processing capabilities is to define a
node processing capacity of X XML tags per second
(C = X/sec). That being the case, a routing problem
may be defined as follows. For each XML document
WEBIST 2019 - 15th International Conference on Web Information Systems and Technologies