PROTECTING LEGACY APPLICATIONS FROM UNICODE

Erik Wilde

Swiss Federal Institute of Technology (ETH)

Zürich, Switzerland

Keywords:

XML, CRVX, XML validation, Unicode, DSDL

Abstract:

While XML-based Web Service architectures are successfully turning the Web into an infrastructure for

cooperating applications, not all interoperability problems have yet been solved.

XML-based data exchange has the ability to carry the full Unicode character repertoire, which is approaching

100’000 characters. Many legacy applications are being Web-Service-enabled rather than being re-built from

scratch, and therefore still have the same limitations. A frequently seen limitation is the inability to handle the

full Unicode character repertoire. We describe an architectural approach and a schema language to address this

issue. The architectural approach proposes to establish validation as basic Web Service functionality, which

should be built into a Web Services architecture rather than into applications. Based on this vision of modular and

infrastructure-based validation, we propose a schema language for character repertoire validation. Lessons

learned from the ﬁrst implementation and possible improvements of the schema language conclude the paper.

1 INTRODUCTION

For many applications today, the Extensible Markup

Language (XML) (Bray et al., 2000) is the way to ex-

change data, either through proprietary mechanisms,

or using some XML-based standards such as Web Services.

In either case, the actual data is exchanged

using Unicode (Unicode Consortium, 2000), because

XML is based on Unicode. XML allows different

character encodings, but the only character encodings

that must be supported by all XML implementations

are UTF-8 and UTF-16. These two encodings are ca-

pable of encoding the full Unicode character reper-

toire, so that XML documents may contain any Uni-

code character.

In many cases, the applications implement-

ing XML-based interfaces are not fully Unicode-

compliant and require some sort of protection from

receiving documents using the full Unicode charac-

ter repertoire. In these cases, it is necessary to only

forward documents using supported character ranges

to the application, while XML documents containing

unsupported characters should be rejected or at least

trigger some kind of exception in the workﬂow.
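The kind of protection described above can be illustrated with a minimal Python sketch. It assumes a hypothetical legacy application that can only store ISO 8859-1 (Latin-1) text; the function names and the choice of Latin-1 are illustrative assumptions and do not come from the paper.

```python
def unsupported_characters(text, encoding="latin-1"):
    """Return the characters of `text` that `encoding` cannot represent."""
    unsupported = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            unsupported.append(ch)
    return unsupported


def accept_for_legacy_app(text, encoding="latin-1"):
    """Forward a document only if the (assumed) legacy repertoire covers every character."""
    return not unsupported_characters(text, encoding)
```

A document for which `accept_for_legacy_app` returns `False` would be rejected or routed to an exception handler instead of being passed to the application.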

In principle, ﬁltering out unwanted documents in

an XML-based environment is the job of schema

languages. While XML’s built-in Document Type

Deﬁnition (DTD) schema language does not support

datatypes, the more recent XML Schema (Fallside,

2001) supports the concept of datatypes, including

regular expressions and Unicode character classes.

Thus, Web Services based on XML Schema may de-

ﬁne some restrictions of character repertoires. How-

ever, there are two major problems with this ap-

proach:

• Monolithic Design: XML Schema is a heavy

schema language integrating many different con-

cepts such as a type system for element and at-

tribute types, identity constraints for specialized

co-constraints, a library of simple datatypes, and

defaulting mechanisms for elements and attributes.

Application developers requiring only character

repertoire capabilities may be overwhelmed by

XML Schema’s complexity.

• Incomplete Support: XML Schema provides

datatype support for attribute values and element

content (excluding mixed content). This means that

character repertoire restrictions cannot be deﬁned

for mixed content, element and attribute names,

comments, and processing instructions.

In this paper, we present an approach overcom-

ing these two limitations. We present a specialized

Wilde E. (2004).

PROTECTING LEGACY APPLICATIONS FROM UNICODE.

In Proceedings of the First International Conference on E-Business and Telecommunication Networks, pages 144-151

DOI: 10.5220/0001388701440151

 SciTePress

schema language which is speciﬁcally designed for

character repertoire validation. This makes it easier

to use this schema language, in particular in scenarios

where complex schema support is not yet required.

The approach of modular validation presented in Section

2.1 is based on a perspective of XML validation

as a sequence of processing steps, possibly performed

at different nodes of an XML-based workflow.

In Section 2.2 we give a more general description of

our perspective on XML processing and how the Web

Service world evolves towards a network-like infras-

tructure.

In Section 3 we describe the schema language for

character repertoire validation in greater detail. The

schema language is purely declarative and can be im-

plemented on top of different XML technologies.

2 XML VALIDATION AND WEB

SERVICES

According to (Booth et al., 2004), “a Web Service is

a software system designed to support interoperable

machine-to-machine interaction over a network. It

has an interface described in a machine-processable

format. Other systems interact with the Web service

in a manner prescribed by its description using SOAP

messages, typically conveyed using HTTP with an

XML serialization in conjunction with other Web-

related standards.”

For our purposes, the most important observations

are that Web Services are based on the exchange of

XML messages, and that the architecture is in no

way restricted to a certain topology. Speciﬁcally, the

well-known architectural concepts of computer net-

works, using concepts such as bridges, routers, gate-

ways, ﬁrewalls, and proxies, can be employed to de-

scribe Web Service architectures. In Section 2.1, we

discuss how this networked exchange of XML-based

messages can be used to process XML in a modular

and distributed way. Section 2.2 goes one step further

and explains how this distributed processing of XML-

based messages can be used to establish the concept

of Web Services networks.

2.1 Modular Validation

For many XML users today, validation is a one-step

process. Using a schema (in most cases a DTD or an

XML Schema), an XML document is validated and is

either successfully validated or classiﬁed as invalid.

While this view of validation often is true, it is possi-

ble to look at validation in a modular way. In this sce-

nario, validation is a modular process involving dif-

ferent schemas, which are checking different facets

of an XML document.

For example, in the Web Service architecture of a

company it might make sense to first check all incoming

documents against a schema that restricts the allowed

character repertoire, because it is known that

characters outside this repertoire are not supported

within the company’s workﬂow and thus should be

rejected. If a Web Service request is valid with re-

spect to the character repertoire, it is validated against

the XML Schema for SOAP messages in general, and

then against the XML Schema for the speciﬁc SOAP

message type. Finally, before the SOAP message

is passed to the application, it is validated against a

message-speciﬁc schema, for example checking that

some identiﬁcation number inside the SOAP mes-

sage corresponds to an existing item in the company’s

database.
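The sequence of validation steps in this scenario can be sketched as a small pipeline. The following Python sketch is an illustration only. The step functions, the Latin-1 limit, the `item` element, and the `KNOWN_ITEMS` set (standing in for the company's database) are all assumptions. Because the Python standard library has no XML Schema validator, a well-formedness check stands in for the two schema validation steps.

```python
import xml.etree.ElementTree as ET

KNOWN_ITEMS = {"4711", "0815"}  # hypothetical stand-in for a database lookup


def check_repertoire(message):
    """Generic first step: reject characters outside Latin-1 (an assumed limit)."""
    bad = sorted({ch for ch in message if ord(ch) > 0xFF})
    return f"unsupported characters: {bad}" if bad else None


def check_well_formed(message):
    """Stand-in for schema validation: only checks that the message parses."""
    try:
        ET.fromstring(message)
    except ET.ParseError as error:
        return f"not well-formed: {error}"
    return None


def check_item_exists(message):
    """Application-specific last step: the referenced item must be known."""
    item = ET.fromstring(message).findtext(".//item")
    return None if item in KNOWN_ITEMS else f"unknown item: {item!r}"


def validate(message, steps):
    """Run the steps in order and stop at the first one that reports an error."""
    for step in steps:
        error = step(message)
        if error is not None:
            return False, error
    return True, None


PIPELINE = [check_repertoire, check_well_formed, check_item_exists]
```

Each step returns either `None` or an error message, so steps can be reordered, or run on different nodes, without changing the others.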

The scenario discussed in the previous paragraph

enables us to make some interesting observations and,

in particular, generalizations regarding XML valida-

tion:

• Distributed Validation: The validation steps de-

scribed above could be performed at different

points within the company, for example the charac-

ter repertoire and XML Schema validation could be

performed on a central SOAP intermediary, while

the application-speciﬁc validation takes place on

the system implementing the SOAP service.

• Increasing Speciﬁcity: Typically, multiple valida-

tion steps increase the data quality by ﬁltering

out unwanted documents. While character reper-

toire validation is rather generic, the lookup in a

database for some identiﬁer is very speciﬁc and can

only be performed by an application-speciﬁc vali-

dation step.

• Application-Speciﬁc Validation: As mentioned al-

ready, validation does not always have to be

generic. Application-speciﬁc validation provides

the beneﬁts of describing assertions declaratively,

making it easier to maintain and modify the

application-speciﬁc schema. Application-speciﬁc

validation also makes it easier for developers to

encapsulate assertions and separate them from the

code processing SOAP calls. (Nentwich et al.,

2003) discuss this in depth from a software engi-

neering point of view.

Based on these observations, it can be concluded

that a more modular way of dealing with validation

can be advantageous compared to the monolithic approach

of traditional one-step XML validation.

The Document Schema Definition

Languages (DSDL) currently under development

by ISO is one example of a modular approach to-

wards XML validation. It is based on a collection

of schema languages, and a framework for specifying

how to validate documents against possibly multiple

schemas. However, DSDL will probably not provide

built-in support for distributed validation.

It would be possible to look at validation as a Web

Service itself, but this would make validation a rather

expensive task. Instead, validation should be viewed

as some kind of processing that is applied to SOAP

messages in transit. This perspective brings us to the

view of Web Services as something very similar to

computer networks, an analogy we explore in the fol-

lowing section.

2.2 Web Services Networks

For computer networks, a number of standard net-

working devices are well-known and established, and

composing networks out of bridges, routers, gate-

ways, ﬁrewalls, and proxies is common knowledge.

For Web Services, such a standard set of technologies

and devices has not yet evolved, and we argue that

many of the metaphors known from computer net-

working can be reused in the ﬁeld of Web Services.

(Wilde and Steiner, 2004; Jeckle and Wilde, 2004)

give a more detailed discussion of the similarities of

traditional (i.e., network-oriented) and Web Service

protocol stacks, and describe how some of these simi-

larities can be exploited to use existing experience and

design patterns to build Web Service architectures.

As a concrete example, (Riggs, 2003) describes

the concept of Data Quality Firewalls, which are de-

vices that should be placed strategically to prevent or

at least detect the inevitable loss of data quality in a

workﬂow where many different peers are collaborat-

ing. From the XML and Web Services point of view,

these ﬁrewalls could be implemented as devices capa-

ble of performing XML validation. They could then

be conﬁgured to ﬁlter out SOAP messages which are

violating the data quality standards (which are deﬁned

by schemas). The analogy between regular (i.e., computer

network level) and Web Services firewalls is evident:

both devices have built-in knowledge of the data

to be processed, and both devices can be conﬁgured to

use this knowledge for classifying passing data.

In the current version of SOAP (Gudgin et al.,

2003), the notion of SOAP Intermediaries has already

been established. This is a big step forward towards

the vision of Web Services Networks. However, there

still remain many gaps to be ﬁlled. An interesting

approach is described by (Melzer and Jeckle, 2003).

They have implemented a gateway that signs SOAP

messages (encryption support is planned to follow).

This could be compared to the setup of a Virtual Pri-

vate Network (VPN), where the intranet is a trusted

network, but network trafﬁc leaving the intranet is en-

crypted.

The concept of modular validation could be built

into different devices in a Web Services Network.

SOAP gateways could be conﬁgured to perform val-

idation (thus becoming “validating intermediaries”),

and validation could also be integrated into development

frameworks, providing a clear separation between

the validation part of a Web Service (the assertions

to be checked before the actual service invocation),

and the service implementation. Modular

validation is one facet of looking at Web Services

from the perspective of modeling them after network-

ing protocol stacks, and the interesting similarity be-

tween “validating intermediaries” and traditional ﬁre-

walls is that both components should be conﬁgured

rather than programmed, so that re-conﬁguration can

be done rather quickly.

3 CRVX

In the preceding section, we described why and how

modular validation can be useful. As one step in a val-

idation pipeline, checking for character repertoires is

a useful functionality (it is listed as one of the areas of

the DSDL framework), and the Character Repertoire

Validation for XML (CRVX) (Wilde, 2003a; Wilde,

2003b) language described here can be used to per-

form this kind of validation.

3.1 XML Information Models

The set of XML information models is constantly

growing, with diverse members such as XML 1.0

itself, the XML Information Set (Cowan and To-

bin, 2001), various versions of the Document Object

Model (DOM), and the XPath 1.0 (Clark and DeRose,

1999) and 2.0 (Fernández et al., 2003) information

models. Some of these models have rather subtle dif-

ferences, others have differences that may be very im-

portant for some applications. One example for this is

the lack of support for CDATA sections in the XPath

information models.

To avoid this diversity of different perspectives on

XML, CRVX remains close to the XML 1.0 model,

but completely ignores the physical structures of an

XML document. CRVX works on top of XML’s log-

ical structures, and consequently does not have any

means of speciﬁcally addressing the entity structure

of an XML document.

However, there is one additional aspect outside of

XML 1.0 that is supported by CRVX, and this is

the issue of XML Namespaces (Bray et al., 2004a).

Namespaces are very popular with XML applications,

and the usage of Namespaces implies some additional

constraints for XML documents. Since Namespaces

introduce a new perspective on XML documents,

most notably by structuring names into prefixes

and local names, and by introducing Namespace declarations

as a special kind of attribute, they are explicitly

supported by CRVX, making it possible to

interpret a document either as pure XML (it must

be well-formed) or as Namespace XML (it must be

namespace-well-formed).

ICETE 2004 - GLOBAL COMMUNICATION INFORMATION SYSTEMS AND SERVICES

Listing 1:

<crvx structures="namespaceXML" version="1.0"
      xmlns="http://dret.net/xmlns/crvx10">
  ...
</crvx>

Listing 2:

<crvx structures="namespaceXML" version="1.0"
      xmlns="http://dret.net/xmlns/crvx10">
</crvx>

Listing 3:

<crvx structures="namespaceXML" version="1.0"
      xmlns="http://dret.net/xmlns/crvx10">
</crvx>

Listing 4:

<crvx structures="namespaceXML" version="1.0"
      xmlns="http://dret.net/xmlns/crvx10">
  <restrict structure="elementLocalName attributeLocalName PITarget"
            charrep="\p{IsBasicLatin}"/>
  <restrict structure="elementContent"
            charrep="\p{IsBasicLatin} \p{IsLatin-1Supplement}"/>
</crvx>

Listing 5:

<crvx structures="namespaceXML" version="1.0"
      xmlns="http://dret.net/xmlns/crvx10">
</context>
</crvx>

Listing 6:

<crvx structures="namespaceXML" version="1.0"
      xmlns="http://dret.net/xmlns/crvx10">
</context>
</crvx>

Figure 1: CRVX Examples

Listing 1 shows how the information model to be

used for validation is selected in CRVX. It also shows

the general skeleton of a CRVX schema, which is

an XML document. The three attributes of the docu-

ment element specify the XML information model to

be used for the CRVX schema, the version of CRVX,

and declare the CRVX namespace (in this case using

a default namespace declaration).
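As an illustration of how a processor might read this skeleton, the following Python sketch parses a CRVX document element with the standard `xml.etree.ElementTree` module. The attribute names and the namespace name are taken from Listing 1. The function name and the error handling are assumptions and not part of the CRVX specification.

```python
import xml.etree.ElementTree as ET

CRVX_NS = "http://dret.net/xmlns/crvx10"  # namespace name as given in Listing 1

SKELETON = """<crvx structures="namespaceXML" version="1.0"
      xmlns="http://dret.net/xmlns/crvx10">
</crvx>"""


def read_schema_header(schema_text):
    """Return the information model and version declared on the crvx element."""
    root = ET.fromstring(schema_text)
    # ElementTree reports namespaced element names as "{namespace}local-name".
    if root.tag != f"{{{CRVX_NS}}}crvx":
        raise ValueError(f"not a CRVX schema: {root.tag}")
    return {"structures": root.get("structures"), "version": root.get("version")}
```

For the skeleton above, the function reports the information model `namespaceXML` and the version `1.0`.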

3.2 Restrictions

The purpose of CRVX is to specify character reper-

toire restrictions, which are then used to validate

XML documents. Restrictions in CRVX have four

properties: the character repertoire, the minimum

and/or maximum length of character sequences, the

structural parts of an XML document the restrictions

should apply to, and the context in which these re-

strictions should be validated:

• Character Repertoire: The character repertoire of

a CRVX restriction uses mechanisms from XML

Schema Datatypes (Biron and Malhotra, 2001) to

specify character classes. These character classes

can have two forms, which are (1) a character class

expression (for example ‘[b-y]’ for the character

set from b to y), and (2) a category escape (for

example ‘\p{Ll}’ for all lowercase letters). Cat-

egory escapes may also be used inside character

class expressions, but since category escapes will

probably be used very often for specifying CRVX

constraints, they have been made a top-level con-

struct in CRVX.

The category escapes use values from the Unicode

Character Database (UCD), in particular values

from the general category and blocks. General Cat-

egories are used for classifying characters, such as

letters, numbers, symbols, or punctuation characters.

Blocks are arbitrary names for ranges of code

points and are often used for grouping characters

according to their source, for example a certain lan-

guage (or family of languages) or application area.

Listing 2 shows how blocks can be used in CRVX.

In this example, it can also be seen that character

repertoires can be combined by using XML

Schema’s list type, thus separating different character

repertoire restrictions into tokens separated

by whitespace. If the charrep attribute specifies

a list (i.e., contains multiple tokens separated by

whitespace), then these are combined using a log-

ical “or”, so that the resulting character repertoire

effectively is the union of the character repertoires

speciﬁed by the list tokens.

• Lengths: It may be a validation requirement closely

related to character repertoires to limit the length

of certain structures of an XML document. Conse-

quently, it is possible to restrict the minimum and

maximum lengths. In most cases, this feature will

be used in combination with limiting the restric-

tion to certain structures, which is why the example

shown in Listing 3 combines these two features.

• Structure: In many cases, restrictions do not ap-

ply to all characters appearing in an XML docu-

ment, but only to certain structural parts, such as

element or attribute names, attribute values, or element

content. As described in Section 3.1, CRVX

supports two XML information models, which are

pure XML and Namespace XML. The structural

parts of an XML document that are accessible

[Footnote: Character class expressions may contain single character escapes (for example ‘\-’ for a hyphen), and multi-character escapes (for example ‘\i’ for the initial name characters).]

in both models are element content, CDATA sec-

tions, attribute values, processing instruction tar-

gets, processing instruction contents, and com-

ments. For pure XML, the additional structures are

element names and attribute names. For Name-

space XML, the additional structures are element

local names, attribute local names, namespace

names, and namespace preﬁxes. Listing 3 shows

how to use structures for a restriction.

In this example, the actual character repertoire re-

mains unrestricted, but the maximum length of ele-

ment and attribute local names is restricted to eight

characters. This example also demonstrates an ad-

vantage over XML Schema, which can apply sim-

ple type restrictions only to typed parts of an XML

document (i.e., attribute values and character-only

element content), but not to other structural parts.

• Context: In some cases, it may be desirable to

limit the restrictions to certain parts of an XML

document, for example to check only the character

repertoire of text that appears as a descendant of

some given element. CRVX supports this kind of

application as context, but since there are several

ways of using contexts, they are described sepa-

rately in Section 3.3.
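The repertoire and length properties described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the CRVX implementation. It supports only category escapes (no character class expressions), its block table lists only the two blocks used in Listing 4, and the field names `min_length` and `max_length` are hypothetical.

```python
import unicodedata
from dataclasses import dataclass
from typing import Optional

# Code point ranges of the two Unicode blocks that appear in Listing 4.
BLOCKS = {
    "BasicLatin": (0x0000, 0x007F),
    "Latin-1Supplement": (0x0080, 0x00FF),
}


def token_matches(token, ch):
    """Test one character against one category escape such as \\p{Ll}."""
    if not (token.startswith("\\p{") and token.endswith("}")):
        raise ValueError(f"unsupported charrep token: {token}")
    name = token[3:-1]
    if name.startswith("Is"):  # block escape, e.g. IsBasicLatin
        low, high = BLOCKS[name[2:]]
        return low <= ord(ch) <= high
    # General category: "L" covers Lu, Ll, ...; "Ll" covers only Ll.
    return unicodedata.category(ch).startswith(name)


@dataclass
class Restriction:
    charrep: Optional[str] = None     # whitespace-separated list of tokens
    min_length: Optional[int] = None  # hypothetical field name
    max_length: Optional[int] = None  # hypothetical field name

    def accepts(self, text):
        if self.min_length is not None and len(text) < self.min_length:
            return False
        if self.max_length is not None and len(text) > self.max_length:
            return False
        if self.charrep is None:
            return True
        tokens = self.charrep.split()
        # List tokens are combined with a logical "or" (union of repertoires).
        return all(any(token_matches(t, ch) for t in tokens) for ch in text)
```

A restriction with only `max_length=8` leaves the repertoire unrestricted, as in the length example discussed for Listing 3, while a `charrep` of `\p{IsBasicLatin} \p{IsLatin-1Supplement}` accepts the union of both blocks.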

It is thus possible to use different orthogonal di-

mensions for deﬁning restrictions. Since users in

many cases want to combine these dimensions dif-

ferently, restrictions can be combined to yield more

complex CRVX schemas, as shown in Listing 4.

It should be noted that this CRVX schema deﬁnes

two restrictions for processing instruction targets, one

(together with element and attribute names) for restricting

targets to Basic Latin (i.e., ASCII) characters,

and the other one for requiring that they must have

three characters. However, even when multiple restrictions

are used, each of them still applies to the entire document,

and the following section describes how

this can be changed by using CRVX contexts.

3.3 Contexts

A context in CRVX is deﬁned by an XSLT pattern.

Consequently, contexts may only be used with Namespace

XML, since the whole set of XSLT/XPath recommendations

is based on an information model that

requires namespace-well-formed XML. Contexts are

deﬁned using a special CRVX element, as shown in

Listing 5.

This example shows various features of context

deﬁnitions. Context deﬁnitions may contain restric-

tions, in which case the restriction only applies to

the context selected by the context’s path attribute.

Contexts may be nested, in which case the nested con-

text is evaluated in the context in which it is speci-

ﬁed. To make contexts reusable, they may also carry

a name attribute, which can then be referenced from

restrictions or contexts with a within attribute, thus

allowing context hierarchies to be directed acyclic

graphs instead of trees.
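The effect of a context can be approximated with a short Python sketch. CRVX contexts are XSLT patterns. The sketch instead uses the much smaller path language of `xml.etree.ElementTree` as a stand-in, and the function names and the sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def text_in_context(document, path):
    """Yield the text found below every element selected by `path`."""
    root = ET.fromstring(document)
    for element in root.findall(path):
        # itertext() walks the element and all of its descendants.
        yield "".join(element.itertext())


def context_is_valid(document, path, allowed):
    """Apply a character predicate only inside the selected context."""
    return all(allowed(ch)
               for text in text_in_context(document, path)
               for ch in text)


def is_ascii(ch):
    return ord(ch) < 0x80


SAMPLE = "<doc><title>Zürich</title><code>abc<b>def</b></code></doc>"
```

With this sample, an ASCII-only restriction holds in the context `.//code` but fails in the context `.//title`, although both belong to the same document.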

Since contexts are based on Namespace XML and

use XSLT patterns which include element and/or at-

tribute names, they may also use namespaces. Name-

spaces are declared using a special element, which is

described in the following section.

3.4 Namespace Declarations

Namespace declarations are speciﬁed using a spe-

cial CRVX element. They may only appear as direct

children of the crvx element. All namespace pre-

ﬁxes used in path attributes of context elements

must be declared using a special namespace ele-

ment (i.e., it is not sufﬁcient or required that they are

declared as namespaces in the CRVX schema docu-

ment). Listing 6 shows how a namespace is declared

and used in CRVX.
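The separation between prefixes used in paths and the namespace declarations of the schema document itself can be illustrated in Python. The sketch below is only an analogy, since `xml.etree.ElementTree` also takes its prefix bindings from an explicit mapping rather than from the surrounding document. The prefix `x`, the mapping, and the sample document are assumptions.

```python
import xml.etree.ElementTree as ET

# Explicit prefix bindings, playing the role of CRVX namespace declarations.
PREFIXES = {"x": "http://www.w3.org/1999/xhtml"}

# The instance document uses a default namespace and no prefix at all.
DOCUMENT = ('<html xmlns="http://www.w3.org/1999/xhtml">'
            '<body><p>Zürich</p></body></html>')


def select(document, path, prefixes):
    """Select elements with a path whose prefixes are resolved via `prefixes`."""
    root = ET.fromstring(document)
    return root.findall(path, prefixes)
```

The path `.//x:p` selects the paragraph even though the document never uses the prefix `x`, because only the namespace name bound to the prefix matters.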

4 CRVX IMPLEMENTATION

CRVX is a rather simple schema language, and its

main advantage over performing character repertoire

validation within program code is that it is declarative.

This means that a CRVX author only specifies what

should be checked for, but not how it should be done.

The actual validation is the task of a CRVX implementation.

Similarly to Schematron, a CRVX implementation

can be programmed rather easily by transforming

a CRVX schema into an XSLT stylesheet.

This approach is described in Section 4.1. However,

the XSLT-based approach has some disadvantages,

such as performance issues and the inability to deal

with non-namespace-compliant XML. Thus, alterna-

tive implementations could be based on existing XML

parsers, and this second approach is described in Section 4.2.

4.1 Based on XSLT

CRVX uses XML as its notation and thus can be processed

using XSLT. It is thus possible to write an implementation

of CRVX that uses XSLT as a tool to

transform a CRVX schema into an XSLT stylesheet,

and then uses this stylesheet to perform CRVX val-

idation. Such an implementation is described in the

CRVX speciﬁcation (Wilde, 2003a). The advantage

of this approach is that it does not require any special

software; the only thing that is required is an

XSLT processor. One disadvantage is that an XSLT

2.0 (Kay, 2003) processor is required, because only

XSLT 2.0 supports the regular expression matching

that is required to perform CRVX validation. Other

disadvantages are that, due to the information model

of XSLT 2.0, there

• is no way to process non-namespace-compliant

XML documents with a CRVX implementation

based on XSLT, and

• it is impossible to support the structural part

of CDATA sections, because XSLT’s information

model does not support CDATA sections.

Given these limitations, an XSLT-based implemen-

tation of CRVX is rather simple. Contexts can be im-

plemented using template rules and nested contexts

can be implemented using looping. The restrictions

themselves can be implemented using regular expres-

sion matching, with lengths being mapped to XML

Schema regular expression quantiﬁers. The differ-

ent structural parts can be accessed using XQuery

1.0 and XPath 2.0 Functions and Operators (Mal-

hotra et al., 2003), which provides functions such as

local-name() for evaluating the local name of an

element or attribute.
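The mapping of lengths to regular expression quantifiers can be sketched outside XSLT as well. The following Python sketch is an assumption-laden illustration. Python's standard `re` module does not support `\p{...}` category escapes, so the character class is given as an explicit code point range, and the function name is hypothetical.

```python
import re


def restriction_regex(char_class, min_length=0, max_length=None):
    """Combine a character class and length bounds into one anchored-by-use regex."""
    upper = "" if max_length is None else str(max_length)
    # e.g. "[a-z]" with bounds 0 and 8 becomes "[a-z]{0,8}"
    return re.compile(f"{char_class}{{{min_length},{upper}}}")


BASIC_LATIN = r"[\u0000-\u007F]"  # explicit range for the Basic Latin block

max_eight = restriction_regex(BASIC_LATIN, max_length=8)
exactly_three = restriction_regex(BASIC_LATIN, 3, 3)
```

The compiled expression is meant to be used with `fullmatch`, so that the quantifier bounds the length of the whole string rather than of a substring.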

4.2 Based on XML Processor

While an XSLT-based implementation is sufﬁcient for

applications that can live with the rather poor per-

formance of an XSLT stylesheet and the limitations

that are caused by the underlying XSLT, more ef-

ﬁcient and feature-complete CRVX implementations

will need to use another foundation. Most likely, an

XML processor will be used. Popular interfaces for

writing software based on XML processors are the

Document Object Model (DOM) and the Simple API

for XML (SAX). The DOM3 XPath Module (Whitmer,

2003) would make it rather easy to implement con-

texts, while SAX does not support this kind of func-

tionality and thus would require more programming

effort to implement contexts.
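A SAX-based check without contexts can be sketched with Python's standard `xml.sax` module. This is a minimal illustration, not the implementation discussed in the paper. The structure labels `elementLocalName`, `attributeLocalName`, `PITarget`, and `elementContent` follow Listing 4, whereas the label `attributeValue`, the class name, and the predicates are assumptions.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class RepertoireHandler(ContentHandler):
    """Collect (structure, character) pairs that violate a character predicate."""

    def __init__(self, allowed):
        super().__init__()
        self.allowed = allowed
        self.violations = []

    def _check(self, structure, text):
        for ch in text:
            if not self.allowed(ch):
                self.violations.append((structure, ch))

    def startElementNS(self, name, qname, attrs):
        _uri, local_name = name
        self._check("elementLocalName", local_name)
        for (_attr_uri, attr_local), value in attrs.items():
            self._check("attributeLocalName", attr_local)
            self._check("attributeValue", value)  # assumed label

    def characters(self, content):
        self._check("elementContent", content)

    def processingInstruction(self, target, data):
        self._check("PITarget", target)


def find_violations(document, allowed):
    """Parse `document` (bytes) in namespace mode and return all violations."""
    handler = RepertoireHandler(allowed)
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)  # Namespace XML model
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(document))
    return handler.violations
```

Because SAX reports events in document order, violations are collected in the order in which the offending characters appear. Implementing contexts on top of this would require tracking the element stack by hand.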

5 FURTHER WORK

CRVX has been designed as a small schema language

and as a starting point. It is useful for basic validation

tasks for character repertoires, but is also limited in

a number of ways. The following points have been

identiﬁed as possible extensions of CRVX:

• Treatment of Nested Restrictions: Currently, nested

restrictions are treated as a logical union, so that the

restrictions of a nested part of an XML document

[Footnote: It should be noted that XSLT 2.0 is in working draft status and thus may still change. Also, there are only a few implementations, which also may change.]

are the union of all restrictions applying to nodes further

up the XML document tree structure. This implicit

way of joining nested restrictions could be

made explicit and variable, for example allowing

overwriting of restrictions.

• Character Reference Restrictions: Characters in

XML may appear literally or as character references.

Most XML information models do not distinguish

between these two forms and pass the character

to the application. However, it may be required

to disallow the use of character references or, on

the contrary, force characters from certain charac-

ter ranges to appear as character references only.

• XPath 2.0: The XPath expressions supported in

CRVX are currently based on XPath 1.0. XPath

2.0 deﬁnes a more powerful language for address-

ing parts of an XML document (which may include

type information, if the type information for a doc-

ument can be inferred from some schema).

• Character Normalization: Unicode deﬁnes canon-

ical and compatibility equivalences between char-

acters or sequences of characters. For processing

purposes, it can be useful to normalize a document.

XML (including the emerging XML 1.1 (Bray

et al., 2004b)) does not require character normal-

ization, so that normalization checks are necessary

for all documents which are not certiﬁed as being

in normalized form (Dürst et al., 2003).
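A normalization check of the kind mentioned in the last item can be sketched with Python's standard `unicodedata` module. This is not a CRVX feature; it only illustrates the possible extension. The function name and the choice of Normalization Form C as the default are assumptions.

```python
import unicodedata


def is_normalized(text, form="NFC"):
    """Return True if `text` is unchanged by normalization to `form`."""
    return unicodedata.normalize(form, text) == text


COMPOSED = "Z\u00fcrich"     # "ü" as the single code point U+00FC
DECOMPOSED = "Zu\u0308rich"  # "u" followed by U+0308 COMBINING DIAERESIS
```

Both strings are canonically equivalent and render identically, but only the composed one passes the check for Normalization Form C.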

While these improvements are targeting CRVX, the

surrounding infrastructure also needs to evolve before

visions such as distributed and transparent validation

in a Web Services network can be implemented in an

easy way. In particular, the ideas of validation and

SOAP intermediaries need to be merged to yield val-

idating intermediaries, ideally conﬁgurable in a fully

declarative way.

6 CONCLUSIONS

Web Services and the deployment of existing services

through Web Service interfaces still are young disci-

plines. The schema language and the architectural

approach to Web Service networks presented in this

paper are contributions which hopefully help to increase

the success of Web Service technologies. The

major hurdle for Web Service deployment today is se-

curity, and if the Web Service standards evolve to a

point where using an encrypting SOAP intermediary

is as easy and natural as setting up a VPN, then Web

Services will become a commodity as computer net-

works are today.

ACKNOWLEDGEMENTS

I would like to thank Diederik Gerth van Wijk and

Martin Bryan for their comments regarding CRVX

and possible improvements.

REFERENCES

Biron, P. V. and Malhotra, A. (2001). XML Schema Part

2: Datatypes. World Wide Web Consortium, Recom-

mendation REC-xmlschema-2-20010502.

Booth, D., Haas, H., McCabe, F., Newcomer, E., Cham-

pion, M., Ferris, C., and Orchard, D. (2004). Web

Services Architecture. World Wide Web Consortium,

Note NOTE-ws-arch-20040211.

Bray, T., Hollander, D., Layman, A., and Tobin, R.

(2004a). Namespaces in XML 1.1. World Wide Web

Consortium, Recommendation REC-xml-names11-

20040204.

Bray, T., Paoli, J., Sperberg-McQueen, C. M., and Maler,

E. (2000). Extensible Markup Language (XML) 1.0

(Second Edition). World Wide Web Consortium, Rec-

ommendation REC-xml-20001006.

Bray, T., Paoli, J., Sperberg-McQueen, C. M., Maler,

E., Yergeau, F., and Cowan, J. (2004b). XML

1.1. World Wide Web Consortium, Recommendation

REC-xml11-20040204.

Clark, J. and DeRose, S. J. (1999). XML Path Language

(XPath) Version 1.0. World Wide Web Consortium,

Recommendation REC-xpath-19991116.

Cowan, J. and Tobin, R. (2001). XML Information

Set. World Wide Web Consortium, Recommendation

REC-xml-infoset-20011024.

Dürst, M. J., Yergeau, F., Ishida, R., Wolf, M., and Texin,

T. (2003). Character Model for the World Wide Web

1.0. World Wide Web Consortium, Working Draft

WD-charmod-20030822.

Fallside, D. C. (2001). XML Schema Part 0: Primer.

World Wide Web Consortium, Recommendation

REC-xmlschema-0-20010502.

Fernández, M. F., Malhotra, A., Marsh, J., Nagy, M., and

Walsh, N. (2003). XQuery 1.0 and XPath 2.0 Data

Model. World Wide Web Consortium, Working Draft

WD-xpath-datamodel-20031112.

Gudgin, M., Hadley, M., Mendelsohn, N., Moreau, J.-

J., and Frystyk Nielsen, H. (2003). SOAP Version

1.2 Part 1: Messaging Framework. World Wide

Web Consortium, Recommendation REC-soap12-

part1-20030624.

Jeckle, M. and Wilde, E. (2004). Identical Principles,

Higher Layers — Modeling Web Services as Protocol

Stack. In Proceedings of XML Europe 2004, Amster-

dam, Netherlands.

Kay, M. (2003). XSL Transformations (XSLT) Version 2.0.

World Wide Web Consortium, Working Draft WD-

xslt20-20031112.

Malhotra, A., Melton, J., and Walsh, N. (2003). XQuery

1.0 and XPath 2.0 Functions and Operators. World

Wide Web Consortium, Working Draft WD-xpath-

functions-20031112.

Melzer, I. and Jeckle, M. (2003). A Signing Proxy for

Web Services Security. In Tolksdorf, R. and Eckstein,

R., editors, Berliner XML Tage 2003, pages 292–304,

Berlin, Germany.

Nentwich, C., Emmerich, W., Finkelstein, A., and Ellmer,

E. (2003). Flexible Consistency Checking. ACM

Transactions on Software Engineering and Methodol-

ogy, 12(1):28–63.

Riggs, S. (2003). Data Quality and XML Validation. In

Proceedings of XML Europe 2003, London, UK.

Unicode Consortium (2000). The Unicode Standard: Ver-

sion 3.0. Addison Wesley, Reading, Massachusetts.

Whitmer, R. (2003). Document Object Model (DOM) Level

3 XPath Speciﬁcation. World Wide Web Consor-

tium, Candidate Recommendation CR-DOM-Level-

3-XPath-20030331.

Wilde, E. (2003a). Character Repertoire Validation for

XML (CRVX) Version 1.0. Technical Report TIK-

Report No. 172, Computer Engineering and Networks

Laboratory, Swiss Federal Institute of Technology,

Zürich, Switzerland.

Wilde, E. (2003b). Validation of Character Repertoires

for XML Documents. In Proceedings of the Twenty-

fourth Internationalization and Unicode Conference,

Atlanta, Georgia.

Wilde, E. and Steiner, A. (2004). Networking Metaphors

for E-Commerce. Technical Report TIK-Report No.

190, Computer Engineering and Networks Laboratory,

Swiss Federal Institute of Technology, Zürich,

Switzerland.
