that need most of or even the entire document during
processing, e.g., queries using the descendant-or-self axis
on the document root, the a priori analysis of the query
yields little or no benefit, since all necessary parts of the
document need to be loaded into memory as a whole. Our
solution allows DOM subtrees to be loaded and unloaded
dynamically during XPath evaluation. Hence, even if the
parts of the document necessary for evaluating a certain
XPath expression do not fit into memory as a whole, we are
still able to process the document correctly within the
available memory limits.
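The idea of loading and unloading subtrees during evaluation can be illustrated with a minimal Python sketch. It is not the LazyDOM implementation: the class LazySubtree, its loader callback, and the helper count_matches are hypothetical names, and plain XML strings stand in for encoded fragments.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Optional


class LazySubtree:
    """Hypothetical placeholder for a subtree that is parsed only on access."""

    def __init__(self, loader: Callable[[], str]):
        self._loader = loader  # returns the serialized fragment on demand
        self._root: Optional[ET.Element] = None
        self.loads = 0  # how often the fragment has been parsed

    @property
    def loaded(self) -> bool:
        return self._root is not None

    def root(self) -> ET.Element:
        # Parse the fragment the first time it is needed.
        if self._root is None:
            self._root = ET.fromstring(self._loader())
            self.loads += 1
        return self._root

    def unload(self) -> None:
        # Drop the parsed tree so that it can be garbage-collected.
        self._root = None


def count_matches(subtrees, path):
    """Evaluate a path on one subtree at a time, unloading each afterwards."""
    total = 0
    for subtree in subtrees:
        total += len(subtree.root().findall(path))
        subtree.unload()
    return total


fragments = ["<chapter><sec/><sec/></chapter>", "<chapter><sec/></chapter>"]
subtrees = [LazySubtree(lambda text=text: text) for text in fragments]
total = count_matches(subtrees, ".//sec")
```

In this sketch at most one fragment is held in parsed form at any time, so peak memory is governed by the largest fragment rather than by the whole document.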
Streaming Transformations for XML (STX) (Cimprich et al.,
2007) allow the transformation of large, theoretically
infinite XML documents or XML data streams with bounded
memory requirements. STX processes the XML data in a
streaming fashion. Memory consumption during the
processing of XPath expressions is bounded because STX
restricts the allowed XPath expressions to a subset of XPath
that is suitable for streaming evaluation. Thus, the
evaluation does not require buffering a possibly infinitely
large internal state. In contrast to STX, our approach aims
to support the full functionality of any DOM-based
application.
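The general principle of streaming evaluation with a small internal state can be sketched as follows. This is not STX; it is a Python illustration using the standard-library iterparse function, and it assumes that the elements of interest are direct children of the root. The function name count_entries and the sample document are made up for the example.

```python
import io
import xml.etree.ElementTree as ET


def count_entries(stream, tag):
    """Count elements named `tag` while discarding processed parts of the tree."""
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    count = 0
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            count += 1
            root.clear()  # drop children of the root that were already seen
    return count


sample = io.BytesIO(b"<log><entry/><entry/><other/><entry/></log>")
result = count_entries(sample, "entry")
```

Only a counter survives between events, which is why such forward-only queries need no large buffer. Expressions that look backwards or across siblings would not fit this pattern.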
Other optimization approaches in the context of XPath
processing focus on processing performance instead of
memory consumption. A particular problem in this context is
the fast and efficient evaluation of a large set of XPath
expressions on a sequence of XML documents, as needed in
XML-based publish/subscribe systems.
8 CONCLUSIONS & OUTLOOK
In this paper, we introduced the LazyDOM as an approach to
limit the memory requirements of DOM-based XML processing
and to potentially increase the performance of DOM loading.
The LazyDOM uses the concept of selfContained elements
defined in the Efficient XML Interchange (EXI) format to
divide a DOM into fragments and to load or unload these
fragments on demand during DOM processing. The approach is
transparent to DOM-based applications, i.e., no changes need
to be made to applications to use the LazyDOM instead of a
traditional DOM. Any DOM-based application, such as XPath
processors, XQuery processors, XSLT processors, or XML
Schema parsers and validators, can be used with the LazyDOM.
An indexing mechanism makes it possible to jump efficiently
to the parts of the EXI-encoded XML document that need to be
loaded.
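How an index lets a loader jump directly to a fragment can be sketched in Python under simplifying assumptions: fragments are stored as plain XML bytes rather than EXI, and the index is a dictionary mapping a fragment id to an (offset, length) pair. The names build_index and load_fragment are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def build_index(fragments):
    """Concatenate fragments and record (offset, length) for each fragment id."""
    store = io.BytesIO()
    index = {}
    for frag_id, data in fragments.items():
        index[frag_id] = (store.tell(), len(data))
        store.write(data)
    store.seek(0)
    return store, index


def load_fragment(store, index, frag_id):
    """Seek to the recorded offset and parse only the requested fragment."""
    offset, length = index[frag_id]
    store.seek(offset)
    return ET.fromstring(store.read(length))


store, index = build_index({"a": b"<a><x/></a>", "b": b"<b>hi</b>"})
elem = load_fragment(store, index, "b")
```

Loading fragment "b" reads nine bytes starting at offset 11 and never touches fragment "a".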
Our measurement results show that the LazyDOM is able to
drastically reduce memory consumption during DOM-based XML
processing. The amount of memory needed for processing and
the processing performance depend highly on the use case and
the configuration of the LazyDOM and the DOM-based
applications. We have outlined generic design guidelines
that promise to yield good results when followed in
practice.
Topics for future work include investigating various cache
replacement strategies and their applicability for partial
DOM unloading, further investigations concerning suitable
LazyDOM configurations for specific use cases, especially
with respect to the identification of suitable selfContained
elements in the DOM, and exploiting schema knowledge.
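As one example of a cache replacement strategy that could be studied for partial unloading, the following Python sketch implements least-recently-used (LRU) eviction over fragments. It is an illustration only and not a strategy evaluated in this paper; the class FragmentCache, its loader callback, and the fragment ids are hypothetical.

```python
from collections import OrderedDict


class FragmentCache:
    """Hypothetical LRU cache that keeps at most `capacity` loaded fragments."""

    def __init__(self, capacity, loader):
        self.capacity = capacity
        self.loader = loader  # callable that loads a fragment by id
        self._items = OrderedDict()
        self.evicted = []  # ids of unloaded fragments, in eviction order

    def get(self, frag_id):
        if frag_id in self._items:
            # Mark the fragment as most recently used.
            self._items.move_to_end(frag_id)
            return self._items[frag_id]
        value = self.loader(frag_id)
        self._items[frag_id] = value
        if len(self._items) > self.capacity:
            # Unload the least recently used fragment.
            old_id, _ = self._items.popitem(last=False)
            self.evicted.append(old_id)
        return value

    def loaded_ids(self):
        return list(self._items)


cache = FragmentCache(2, loader=lambda fid: f"<fragment id='{fid}'/>")
for fid in ["a", "b", "a", "c"]:
    cache.get(fid)
```

After the accesses a, b, a, c with a capacity of two, fragment b is the one that gets unloaded, since a was touched more recently.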
REFERENCES
Bournez, C. (2009). Efficient XML Interchange Evaluation.
http://www.w3.org/TR/2009/WD-exi-evaluation-20090407/.
W3C Working Draft.
Busatto, G., Lohrey, M., and Maneth, S. (2005). Efficient
memory representation of XML documents. In Database
Programming Languages, 10th International Symposium,
Trondheim, Norway, pages 199–216. Springer.
Cimprich, P., Becker, O., Nentwich, C., Jiroušek, H.,
Batsis, M., Brown, P., and Kay, M. (2007). Streaming
Transformations for XML (STX) Version 1.0.
http://stx.sourceforge.net/documents/spec-stx-20070427.html.
Working Draft.
Clark, J. and DeRose, S. (1999). XML Path Language (XPath)
Version 1.0. http://www.w3.org/TR/xpath. W3C
Recommendation.
Cowan, J. and Tobin, R. (2004). XML Information Set (Second
Edition). http://www.w3.org/TR/xml-infoset/. W3C
Recommendation.
Ferraiolo, J., Jun, F., and Jackson, D. (2003). Scalable
Vector Graphics (SVG) 1.1 Specification.
http://www.w3.org/TR/2003/REC-SVG11-20030114/. W3C
Recommendation.
Hors, A. L. and Hégaret, P. L. (2004). Document Object
Model (DOM) Level 3 Core Specification.
http://www.w3.org/TR/DOM-Level-3-Core/. W3C
Recommendation.
Kim, S. M., Yoo, S. I., Hong, E., Kim, T. G., and Kim,
I. K. (2007). A document object modeling method to
retrieve data from a very large XML document. In King,
P. R. and Simske, S. J., editors, ACM Symposium on
Document Engineering, pages 59–68. ACM.
Marian, A. and Siméon, J. (2003). Projecting XML documents.
In Proceedings of the 29th International Conference on
Very Large Data Bases (VLDB), pages 213–224, Berlin,
Germany.
Paparizos, S., Patel, J. M., and Jagadish, H. V. (2007).
Sigopt: Using schema to optimize XML query processing.
In ICDE, pages 1456–1460. IEEE.
Schneider, J. and Kamiya, T. (2009). Efficient XML
Interchange (EXI) Format 1.0.
http://www.w3.org/TR/2009/CR-exi-20091208/. W3C Candidate
Recommendation 08 December 2009.
LazyDOM - Transparent Partial DOM Loading and Unloading for Memory Restricted Environments