
 
Unlike XML Schema and Schematron, XincaML 
is designed as a constraint specification language 
rather than a schema language. Its constraint 
expressions are more descriptive and declarative 
than those of Schematron, so the business rules that 
applications need to check can be mapped to XML 
data constraints more easily. XincaML concentrates 
on descriptively expressing the inter-node constraints 
that XML Schema cannot express. Hence, it can be 
regarded as a useful supplement to XML 
Schema. 
As a constraint specification language, XincaML 
places its emphasis on descriptiveness. This emphasis 
not only makes XincaML read more like a natural 
language but also enables XML developers to write 
more optimized code for efficient constraint handling 
and to adjust the constraint definition structure itself 
when needed. XincaML also gives users the 
flexibility to use XPath within XincaML to whatever 
extent they like, so that they can balance a 
concise expression against a descriptive one. 
A XincaML Processor reference implementation 
is already available for download from IBM 
Alphaworks [Ying Nan Zuo, 2002]. It is a Java 
package that provides APIs for constraint parsing 
and checking. Applications can concentrate 
on data processing by delegating the data validation 
work to the processor. The violation handling 
mechanism of the processor, which calls back 
into application-specific code when a violation 
occurs, helps application developers 
create cleaner program logic. 
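The callback style of violation handling described above can be illustrated with a minimal Java sketch. Every name in it (`ViolationHandler`, `Constraint`, `check`) is hypothetical and invented for this illustration; none of it is the XincaML Processor API, and the data is modelled as a flat map rather than an XML document to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Illustrative only: none of these types belong to the XincaML Processor.
class ViolationCallbackSketch {

    /** Hypothetical callback, invoked once for each violated constraint. */
    interface ViolationHandler {
        void onViolation(String constraintName, Map<String, String> record);
    }

    /** Hypothetical constraint: a name plus a rule that valid records satisfy. */
    static final class Constraint {
        final String name;
        final Predicate<Map<String, String>> rule;

        Constraint(String name, Predicate<Map<String, String>> rule) {
            this.name = name;
            this.rule = rule;
        }
    }

    /** Checks every constraint, reports failures to the handler, returns their count. */
    static int check(List<Constraint> constraints,
                     Map<String, String> record,
                     ViolationHandler handler) {
        int violations = 0;
        for (Constraint c : constraints) {
            if (!c.rule.test(record)) {
                handler.onViolation(c.name, record);
                violations++;
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        List<Constraint> constraints = new ArrayList<>();
        // Assumed example rule: title "Mr" implies gender "Male".
        constraints.add(new Constraint("mr-implies-male",
                r -> !"Mr".equals(r.get("title")) || "Male".equals(r.get("gender"))));

        Map<String, String> record = Map.of("title", "Mr", "gender", "Female");

        // Application-specific handling lives in the callback, not in the checker.
        int count = check(constraints, record,
                (name, rec) -> System.out.println("violated: " + name));
        System.out.println(count + " violation(s)");
    }
}
```

The point of the design is that the checking loop knows nothing about what the application does with a violation, so logging, rejecting, or repairing the data can be swapped in without touching the validation code.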
In the rest of this paper, we first introduce 
the basic concepts of XML data constraints, and then 
discuss how XincaML expresses inter-node 
constraints and what advantages it offers. The reference 
implementation of the XincaML Processor and several 
usage scenarios are also introduced to give a 
basic idea of how XML developers can integrate 
XincaML into their applications. Future work 
is presented at the end of the paper. 
2 XML DATA CONSTRAINTS 
The handling of data constraints has been studied for quite 
some time. In a database, data constraints are mostly 
part of the database schema. The schema serves 
two purposes. First, it describes the structure or type 
of the data; second, it describes certain constraints, 
including assertions of keys and inclusion 
dependencies. In general, all constraints on data can 
be divided into two groups: integrity constraints and 
data validity constraints. Integrity constraints (type 
constraints, path constraints, etc.) describe the semantic 
integrity of data. Data validity constraints describe 
the conditions under which data is valid [Ekaterina Pavlova, 
2000]. 
Semi-structured data is a generalization of structured 
data in a sense, so it has integrity constraints and 
data validity constraints similar to those of structured 
data. XML data is usually treated as semi-structured 
data, so the constraints on semi-structured data can 
mostly be applied to XML data. In practice, most 
real-world logical constraints on data are very 
complex and are not purely integrity constraints or 
data validity constraints. It is impossible to build a 
complete taxonomy of all these constraints. However, 
some kinds of constraints are commonly used 
by many XML applications, and it is more valuable to 
investigate these kinds. 
In general, the commonly used XML data 
constraints can be classified into the following four 
categories: 
i.  Containment structural constraint 
(structures): This kind of constraint describes the 
basic structure of XML documents, such as element 
hierarchies, the attributes of an element, inheritance for 
elements and attributes, the cardinality of elements, and 
so on. 
ii.  Lexical structural constraint (data types): 
This kind of constraint describes data types and data 
formats in order to check the value ranges 
of elements or attributes and to ensure that they 
follow certain formats. 
iii.  Integrity constraint (identity constraint): 
This kind of constraint describes reference 
relationships between elements or attributes, like the 
key/foreign key mechanism in a relational 
database. 
iv.  Inter-node constraint (co-constraint): This 
kind of constraint describes the presence/value 
dependencies between elements or attributes 
belonging to the same or different sub-branches of 
an XML document tree. It is usually the most 
fundamental part of data semantics. 
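To make the fourth category concrete, the following Java sketch checks a single inter-node constraint using only the standard JAXP DOM and XPath APIs. The rule itself (a Person whose title attribute is "Mr" must have a Gender of "Male"), the element names, and the sample document are assumptions made for illustration. The sketch does not use XincaML syntax or the XincaML Processor; it only shows the kind of value dependency between nodes that this category covers.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CoConstraintSketch {

    /**
     * Assumed rule: every Person with title="Mr" must have Gender "Male".
     * Returns the names of the Person elements that break the rule.
     * A Person titled "Mr" with no Gender child is also reported, because
     * normalize-space() of an empty node-set is the empty string.
     */
    static List<String> findViolations(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Select the offending Person elements directly with one XPath query.
        NodeList offenders = (NodeList) xpath.evaluate(
                "/Contacts/Person[@title='Mr'][normalize-space(Gender) != 'Male']",
                doc, XPathConstants.NODESET);

        List<String> names = new ArrayList<>();
        for (int i = 0; i < offenders.getLength(); i++) {
            Element person = (Element) offenders.item(i);
            names.add(xpath.evaluate("normalize-space(Name)", person));
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Sample data invented for this sketch.
        String xml = "<Contacts>"
                + "<Person title=\"Mr\"><Name> John Smith </Name><Gender>Male </Gender></Person>"
                + "<Person title=\"Mr\"><Name> Pat Jones </Name><Gender>Female</Gender></Person>"
                + "</Contacts>";
        System.out.println(findViolations(xml));
    }
}
```

Note that the dependency spans an attribute (title) and a child element (Gender) of the same parent, which is why a per-element type or structure check alone is not enough to express it.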
XML Schema today already covers 
the first three kinds of constraint, but it lacks the 
capability to express the inter-node constraints in 
an XML document. XincaML is proposed to 
complement it. Before going into detail about 
XincaML, let us take a closer look at inter-node 
constraints. 
First, a small piece of XML data is presented 
below as an example of XML data with 
inter-node constraints. 
<Contacts>
  <Person title="Mr">
    <Name> John Smith </Name>
    <Gender>Male </Gender>
  </Person>
  <Person title="Ms">
    <Name> Joan Smith </Name>
ICEIS 2004 - INFORMATION SYSTEMS ANALYSIS AND SPECIFICATION