database, and then transform it into an XML schema.
• Data layer, which is a legacy relational database that
stores all the data to be analyzed and converted into
XML.
• Reverse engineering layer, which extracts an ER model
from the input database.
• Transformation layer, which transforms the ER model
into an XML schema.
• Graphical output layer, which shows the result of each
step (e.g., the foreign keys table, candidate keys table,
primary keys table, RID graph, and XML schema).
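To make the role of the reverse engineering layer concrete, the following is a minimal sketch of how candidate keys and foreign keys might be guessed from the data alone, without catalog information. It is not the algorithm used in the framework; the sample tables, the column names, and the functions `candidate_keys` and `foreign_keys` are hypothetical and chosen only for illustration.

```python
from itertools import permutations

# Hypothetical sample data standing in for the data layer (not from the paper).
TABLES = {
    "department": [
        {"dept_id": 1, "name": "Sales"},
        {"dept_id": 2, "name": "Research"},
    ],
    "employee": [
        {"emp_id": 10, "dept": 1},
        {"emp_id": 11, "dept": 1},
        {"emp_id": 12, "dept": 2},
    ],
}


def candidate_keys(rows):
    """Return single columns whose values are non-null and unique."""
    keys = []
    for col in rows[0]:
        values = [row[col] for row in rows]
        if None not in values and len(set(values)) == len(values):
            keys.append(col)
    return keys


def foreign_keys(tables):
    """Guess foreign keys: a non-key column whose values all appear in a
    candidate-key column of another table (an inclusion dependency)."""
    found = []
    for child, parent in permutations(tables, 2):
        child_keys = candidate_keys(tables[child])
        for col in tables[child][0]:
            if col in child_keys:
                continue
            child_values = {row[col] for row in tables[child]}
            for key in candidate_keys(tables[parent]):
                parent_values = {row[key] for row in tables[parent]}
                if child_values <= parent_values:
                    found.append((child, col, parent, key))
    return found
```

On such a small sample the inference over-reports: `name` also looks like a candidate key of `department` simply because no two rows share a value. This is one place where the user's knowledge would be needed to confirm or reject a suggestion.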
Undoubtedly, reconstructing an ER model from a
legacy database and writing an XML schema file
are both heavy and tedious jobs, especially for a large
real application. Our framework relieves users of
most of this load. The users' knowledge can still be
brought into the process where it is needed; even so,
compared to reconstructing an ER model and writing a
long XML schema file from scratch, the mental workload
is greatly reduced with our framework.
Kleiner and Lipeck (2001) show how to obtain a DTD
for data whose structure is described by a conceptual
data model. In brief, they present the translation of all
constructs of the ER model to DTDs and integrate them
into an algorithm. Compared to that work, the framework
presented in this paper has the following advantages:
• Our framework can be used not only for an ordinary
relational database system, but also for a legacy relational
database system.
• We choose XML schema instead of DTD because XML schema
provides a more flexible and powerful mechanism than
DTD. Each entity in the ER model can be represented
by an XML complexType, and “key” and “keyref” can be
used to declare attribute uniqueness, composite keys,
and referential constraints.
• Our prototype gives users a direct visualization of the
output obtained from each phase of the process.
• The expected human workload is considerably reduced
compared to the approach described in (Kleiner and
Lipeck 2001).
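As an illustration of the second point, the sketch below builds a small XML schema in which each table becomes an element with its own complexType, a “key” declares uniqueness, and a “keyref” declares a referential constraint. It is a sketch under stated assumptions rather than the output of our prototype: the tables `department` and `employee`, their columns, the root element `company`, and the constraint names are all hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def xs(tag):
    """Qualify a tag name with the XML Schema namespace."""
    return f"{{{XS}}}{tag}"


def build_schema():
    schema = ET.Element(xs("schema"))
    # Root element that holds one child element per table row.
    root = ET.SubElement(schema, xs("element"), name="company")
    root_type = ET.SubElement(root, xs("complexType"))
    seq = ET.SubElement(root_type, xs("sequence"))

    # Hypothetical tables: the column lists are assumptions for illustration.
    tables = {
        "department": ["dept_id", "name"],
        "employee": ["emp_id", "dept"],
    }
    for table, columns in tables.items():
        row = ET.SubElement(seq, xs("element"), name=table,
                            minOccurs="0", maxOccurs="unbounded")
        row_type = ET.SubElement(row, xs("complexType"))
        for col in columns:
            ET.SubElement(row_type, xs("attribute"), name=col,
                          type="xs:string", use="required")

    # xs:key: dept_id must be present and unique among department elements.
    key = ET.SubElement(root, xs("key"), name="departmentPK")
    ET.SubElement(key, xs("selector"), xpath="department")
    ET.SubElement(key, xs("field"), xpath="@dept_id")

    # xs:keyref: employee/@dept must match some departmentPK value.
    ref = ET.SubElement(root, xs("keyref"), name="employeeDeptFK",
                        refer="departmentPK")
    ET.SubElement(ref, xs("selector"), xpath="employee")
    ET.SubElement(ref, xs("field"), xpath="@dept")
    return schema


xsd_text = ET.tostring(build_schema(), encoding="unicode")
```

A composite key would be declared the same way by adding a second `xs:field` child to the `xs:key` element, which is something “ID” in a DTD cannot express.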
6 CONCLUSIONS
In this paper, we presented a novel approach to
extract an ER model from a legacy relational database
and then convert the ER model to a corresponding
XML schema, i.e., by applying reverse engineering
followed by forward engineering. We preserve as
much information as we can from the given relational
schema in the XML schema. Our approach works not
only for commercial relational databases but also for
legacy relational databases. We use XML schema
instead of DTD; the advantage is that we can use a
complexType to represent each relational table, and
that we can use “key” and “keyref”, features introduced
in XML schema that replace and extend the capability
of “ID”, “IDREF”, and “IDREFS” in DTD. We use “key”
and “keyref” to specify the relationships between tables,
the scope of uniqueness, and composite keys built from
multiple attributes. We can also determine M:N and
n-ary relationships, so we can produce XML schemas
and XML documents for the data stored in databases
without knowing anything about the catalog information.
Currently, we are working on improving the prototype
to provide a flexible visual querying facility by allowing
the user to choose from the displayed RID graph the
tables, and even the attributes, to be displayed in XML
format.
REFERENCES
R. Alhajj, “Extracting the Extended Entity-
Relationship Model from a legacy Relational
Database,” Information Systems, Vol. 28, No. 6,
pp. 597-618, 2003.
M. Carey, et al., “XPERANTO: Publishing Object-Relational
Data as XML,” Proc. of the International
Workshop on Web and Databases, May 2000.
J. Cheng and J. Xu, IBM DB2 XML Extender, IBM
Silicon Valley, February 2000.
M.F. Fernandez, W.C. Tan, and D. Suciu, “SilkRoute:
Trading between Relational and XML,” Proc.
of the International Conference on World Wide
Web, May 2000.
J. Fong, F. Pang, and C. Bloor, “Converting Rela-
tional Database into XML Document,” Proc. of
the International Workshop on Electronic Business
Hubs, pp. 61-65, Sep. 2001.
G. Kappel, et al., “X-Ray - Towards Integrating XML
and Relational Database Systems,” Proc. of the
International Conference on Conceptual Model-
ing, pp. 339-353, Salt Lake City, UT, Oct. 2000.
C. Kleiner and U.W. Lipeck, “Automatic Genera-
tion of XML DTDs from Conceptual Database
Schemas,” University of Hannover, Germany,
Sept 2001.
D. Lee, et al., “Nesting based Relational-to-XML
Schema Translation,” Proc. of the International
Workshop on Web and Databases, May 2001.
M. Mani, D. Lee, and R. Muntz, “Semantic Data
Modeling using XML Schemas,” Department of
Computer Science, University of California, Los
Angeles, 2001.
V. Turau, “Making Legacy Data Accessible for XML
applications,” 1999,
http://www.informatik.fh-wiesbaden.de/~turau/ps/legacy.pdf.