A TOOL FOR USER-GUIDED DATABASE APPLICATION DEVELOPMENT

Automatic Design of XML Models using CBD

Carlos Rossi, Antonio Guevara, Manuel Enciso, José Luis Caro

Dpto. Lenguajes y CC de la Computación, Universidad de Málaga, Málaga, Spain

Angel Mora, Pablo Cordero

Dpto. Matemática Aplicada, Universidad de Málaga, Málaga, Spain

Keywords:

Software development tools, Requirements elicitation and speciﬁcation, Analysis and design, Functional de-

pendencies, Logic.

Abstract:

Beyond the database normalization process, much work has been done on the use of functional dependencies

(FDs), their discovery using mining techniques, their use in query optimization and in the design of algorithms

dealing with the implication problem etc. Nevertheless, although much research expounds the beneﬁts of using

functional dependencies, only a few modeling tools actually use them. In this work we present CBD, a new

software development tool which allows end users to specify their requirements. CBD allows the user to design

his/her own GUI for the application using forms and interface elements and it builds a meta-data dictionary

with information on functional dependencies. This data dictionary will be used to generate the uniﬁed data

model and a behavior model.

1 INTRODUCTION

Database design is not only a matter of dealing with

the data that has to be stored. Since E.F. Codd in-

troduced the Relational Model (Codd, 1970), experts

have agreed on the importance of storing both the data

and the semantics related to it.

Most relational database management systems

(DBMS) and modeling tools use dependencies in a

simple way, relying on the specification of primary keys. The reason is that the number of keys is just equal to the number of tables in a database model, whereas the number of functional dependencies (FDs) is greater and the dependencies are more complex, because they may

establish a relation among attributes that are in differ-

ent tables (using the inclusion dependencies). Thus,

the main obstacle is not the inclusion of FDs but their

efﬁcient treatment.
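To make the difference between a key and a general FD concrete, the following minimal Python sketch checks whether an FD holds in a small relation instance. The function name `fd_holds`, the sample table and its attribute names are illustrative assumptions and are not taken from CBD.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a key is remembered; any later
        # disagreement for the same key violates the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data: order_id is the key of the table.
orders = [
    {"order_id": 1, "customer": "ana", "city": "Malaga"},
    {"order_id": 2, "customer": "ana", "city": "Malaga"},
    {"order_id": 3, "customer": "luis", "city": "Sevilla"},
]

key_fd = fd_holds(orders, ["order_id"], ["customer", "city"])
non_key_fd = fd_holds(orders, ["customer"], ["city"])
violated_fd = fd_holds(orders, ["city"], ["order_id"])
```

In this instance the key dependency holds, but so does customer → city, which a primary key declaration alone does not record; city → order_id does not hold.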

In (Evaluation, 1000) a new architecture which

promotes the use of FDs was presented. The work

focussed on the problem of view integration and pro-

posed FDs as a key tool to discover knowledge and to

facilitate the integration task. In addition to this ar-

chitecture, a set of efﬁcient methods to manage func-

tional dependencies should be considered. In the lit-

erature, functional dependencies are dealt with using

ad hoc methods designed for speciﬁc purposes which

are difﬁcult to extend. An alternative approach is to

use a proper subset of classical logic to reason about

FDs. There are many equivalent logics in the litera-

ture which follow this approach and all of them are

very similar to that proposed by (Armstrong, 1974).

As we will discuss later, however, these logics are not well suited to automated reasoning. In (Cordero et al., 2002) a new kind of FD logic which allows the design of automated methods is presented.

Rossi C., Guevara A., Enciso M., Luis Caro J., Mora A. and Cordero P. (2010).
A TOOL FOR USER-GUIDED DATABASE APPLICATION DEVELOPMENT - Automatic Design of XML Models using CBD.
In Proceedings of the 5th International Conference on Software and Data Technologies, pages 195-200
DOI: 10.5220/0003009401950200
SciTePress

In this work we present CBD (the Spanish acronym for Cooperation in Databases), a new modeling tool which allows the user to participate directly in the design process. Unlike the classical tools, CBD provides an interface to design each user view of the application directly. The information is captured using GUI elements (forms, buttons, etc.) and is stored in a meta-data dictionary. We have used an XML database instead of a relational database because we need flexible data storage that is easy to share with other applications. The FDs deduced from the GUI element connections provide some valuable information.

CBD works by using the XML dictionary to gene-

rate the uniﬁed data model and the application code

to cover user requirements. In (Nelson et al., 2005)

the quality of data models is studied and redundancy

elimination is cited as one of the important issues for

increasing software quality. We will use an automated

method based on FD Logic to reduce redundancy in

the CBD dictionary.

The paper is organized as follows: in section 2 CBD is introduced and compared with other commercial tools. Section 3 presents some efficient methods to reason with FDs, and section 4 explains how these methods are used to reduce redundancy in the CBD XML data dictionary. Section 5 draws the conclusions and outlines plans for future work, and an appendix illustrates how CBD works.

2 SOFTWARE DEVELOPMENT TOOLS

Nowadays, software development can be approached

in two ways depending on who leads it:

1. Software development by professional teams,

who gather user requirements by means of inter-

views, document inspection, etc. In this case, de-

velopers apply a direct engineering process, using

a variety of CASE tools, such as (UML) model-

ing tools, integrated development environments,

project management tools, etc.

2. Software development by the end user by means

of application generators. These tools allow users

to edit the application interface and to generate

the end application (usually with a web architec-

ture). This low-cost approach, although adequate

for individuals and small businesses, is not op-

timal, since these tools are very limited in the

management of non-trivial data relations and they

have other drawbacks that reduce software qual-

ity.

CBD combines both types of development and

also improves software quality using automated rea-

soning techniques. As we shall see in the following

section, we apply an efﬁcient transformation to the

CBD data model and we reduce redundancy in the

speciﬁcation of the FDs. The simpliﬁcation method

produces a reﬁned data model and consequently the

corresponding relational database will be easier to

manage.

2.1 CBD Overview

CBD was developed in order to solve one of the

main problems in professional software development:

vagueness and incorrectness in gathering user require-

ments, which leads to serious ﬂaws in the ﬁnal prod-

uct. This problem, widely detailed in Software En-

gineering literature (Pressman, 2006), has been tra-

ditionally approached by introducing new modeling

techniques or applying process models that increase

client interaction, such as iterative models (Jacobson

et al., 2000) or agile processes (Martin, 2003). Never-

theless, in these approaches a human (the analyst) is

still needed to translate the information provided by

the client into requirements and models. This translation usually causes mistakes generated by errors in

analyst interpretation or by ambiguities and omissions

in the client information.

CBD aims to solve this problem by avoiding the

leading role of the analyst in requirement gathering:

in CBD the end user records directly the input for

requirements speciﬁcation by means of collaborative

techniques. More specifically, the user intuitively designs the GUI they wish to have in their application, and CBD then generates a catalogue

of functional and information requirements (Guevara

et al., 2007; Carrillo et al., 2008). The CBD engine

processes this catalogue and it creates a relational

database for the user application, as well as structural

(classes), use case and behaviour (interaction) models

in UML notation. These models are stored in XMI

format. In this way, models can be imported in widely

used CASE tools (MagicDraw, Enterprise Architect,

Rational, etc), where an analyst could optimize the

results automatically generated by CBD.

CBD manages semistructured data using eXist-

db, a native XML database management system. In

particular, requirements metadata, as well as user in-

terface speciﬁcations are stored in XML documents.

This architecture allows the application of reason-

ing methods for the treatment of dependencies in

semistructured data and to optimize the relational

database generated by CBD. At the moment, this task

is carried out by a different application not integrated

inside the CBD tool.

2.2 CBD Versus other Similar Tools

To explore the benefits of CBD, we believe it is necessary to compare it with other modeling tools. Because of the nature of CBD, we have chosen for this comparison tools geared to the management, generation and creation of forms. For this study we have therefore considered the following applications: Wufoo (Infinity Box Inc., 2010), FormSpring (FormSpring LLC, 2010), FormAssembly (Veer West LLC, 2010), JotForm (Interlogy LLC, 2010), FormLogix (FormLogix.com, 2010) and Leonardi (Groupe W4 S.A, 2010), all of which take an approach analogous to that of CBD. The main feature common to all these tools is the automatic construction of a database model from forms designed by users who may have no previous knowledge of databases.

Figure 1: Definition of relations.

ICSOFT 2010 - 5th International Conference on Software and Data Technologies

We have organized the comparison according to

the double use which may be made of these tools: ei-

ther by users as a tool to create their own application

or as a tool which captures requirements. First we will

present the characteristics necessary for each of these

uses, indicating which of the tools of the comparison

have these particular features. Then, to close the sec-

tion, we will present a comprehensive table of all the

tools and their respective properties.

Firstly we compare the design and execution of

the forms designed by the users. In this comparison,

the following characteristics should be observed:

1. Most of the applications studied implement systems with only one form or, if several are possible, the forms are unrelated. Only Wufoo and Leonardi offer a multi-form mode, and in the latter case it is extremely complicated

to use. CBD addresses this issue and implements

a solution based on the creation of a multi-form

project, with the possibility of establishing navi-

gation ﬂows.

2. Secondly, it is important to study the expressive-

ness of the data models that can be designed. As

can be observed in the comparative table in Fig-

ure 2, most of the tools only allow relations 1:n.

Furthermore, the tools studied treat these relations as simple sets which can be selected and displayed; they are not tables. In this respect, both CBD and Leonardi allow 1:n and n:m relations, including parameterized grids (see Figure 1). It should be noted that CBD can define reflexive 1:n relations, a characteristic which distinguishes it from the other tools.

3. One characteristic which is common to all the

tools is the management of end users in the ap-

plication generated. However, what makes CBD

different is that it contemplates the participation

of various users in the design, allowing user col-

laboration to take place. Furthermore, in keeping

with this approach, CBD also includes a system

that allows users to control and validate different

versions.

As previously mentioned, as well as generating

the application, it is possible to use these tools to ex-

tract requirements and to elaborate models from these

requirements. In this second part of the comparison,

we wish to highlight the following characteristics:

1. CBD allows data model generation in SQL. Other

tools such as FormSpring or FormAssembly also

allow the data model to be exported, but in this

case it is only possible to do so using CSV ﬁles,

which must be processed manually in order to be

able to create the relational schema on a database

management system.

2. Another important feature is that CBD offers the analyst a control panel for the forms and the relations that the users have designed, thereby allowing analysts to obtain better knowledge of the system requirements. This characteristic is exclusive to CBD.

3. As we mentioned previously, our intention is for

CBD to be useful also to professional develop-

ers. For this reason, CBD allows knowledge ac-

quired in requirements gathering to be exported

to other modeling environments (MagicDraw or

Enterprise Architect for example). This is done

in the form of UML class models, use cases and

sequence diagrams, using standard XMI format,

in such a way that this information can be inte-

grated with other information obtained with these

professional modeling tools.

Figure 2 illustrates all the aspects considered in

this study. The table shows in great detail each func-

tionality for all the tools included in this work:

Figure 2: Tools comparison.

From this study, it can be concluded that CBD covers more of the features considered than the other tools. Its main advantages are the following:

• Collaborative design of forms and project man-

agement.

• Expressiveness in the models: it supports 1:n and n:m relations, as well as reflexive 1:n relations.

• Export of the data model in SQL.

• Advanced control panel for expert users (system

analysts).

• Possibility of modeling the navigation between

forms.

• Exportation of models in XMI format for the inte-

gration with team-based modeling environments.

These characteristics illustrate how CBD is a tool

that allows both form design and the generation of ap-

plications directly by non-expert users. However, un-

like the other tools in this ﬁeld, CBD is not intended

only for this use and it can also be used in profes-

sional environments, in the same way as commercial

modeling tools, and integrated with these tools. In the

appendix at the end of this work, the CBD work inter-

face and its most important functions are presented in

more detail.

3 FUNCTIONAL DEPENDENCIES MANAGEMENT

The normalization theory highlights the importance

of the semantic information collected in the FDs of a

schema. In (Armstrong, 1974) the author presents the well-known Armstrong's Axioms, a sound and complete inference system for FDs which can be considered the forerunner of other equivalent FD logics (Fagin, 1977; Paredaens, 1982). All of them follow a similar pattern: their axiomatic systems rely heavily on the transitivity rule, which hinders the development of automated methods based directly on the logic.
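For reference, Armstrong's Axioms are usually stated as the following three rules. This is the standard textbook formulation, written here with the convention that $XZ$ denotes the union of the attribute sets $X$ and $Z$; it is not quoted from (Armstrong, 1974).

```latex
\begin{align*}
\text{Reflexivity:}  &\quad \text{if } Y \subseteq X \text{ then } \vdash X \to Y \\
\text{Augmentation:} &\quad X \to Y \;\vdash\; XZ \to YZ \\
\text{Transitivity:} &\quad X \to Y,\; Y \to Z \;\vdash\; X \to Z
\end{align*}
```

The third rule is the transitivity rule referred to in the preceding paragraph.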

In (Cordero et al., 2002) a novel inference rule for FDs was presented. This rule is the core of a new sound and complete FD logic named Simplification logic for FDs ($SL_{FD}$). The new inference system does not include the transitivity rule as a primitive rule. This property allows us to consider Simplification logic as an executable logic and to develop efficient deduction methods directly based on the FD inference system. In the following we summarize $SL_{FD}$.

Definition 1. We define the $SL_{FD}$ logic as the pair $(L_{FD}, S_{FDS})$, where $L_{FD}$ is the language $\{X \to Y \mid X, Y \in 2^{\Omega}\}$, with $\Omega$ being the set of attributes, and $S_{FDS}$ is the following axiomatic system:

$\lfloor\mathrm{Ax}\rfloor$ Axiom scheme: $\vdash X \to Y$, if $Y \subseteq X$

$\lfloor\mathrm{FR}\rfloor$ Fragmentation rule: $X \to Y \vdash X \to Y'$, if $Y' \subseteq Y$

$\lfloor\mathrm{CR}\rfloor$ Composition rule: $X \to Y,\ U \to V \vdash XU \to YV$

$\lfloor\mathrm{SR}\rfloor$ Simplification rule: $X \to Y,\ U \to V \vdash (U - Y) \to (V - Y)$, if $X \subseteq U$ and $X \cap Y = \emptyset$

As usual, $XY$ is used as the union of the sets $X$ and $Y$; $X \subseteq Y$ as $X$ included in $Y$; $Y - X$ are the elements in $Y$ that are not in $X$ (difference); and $\to$ is the FD.

Moreover, we have the following derived rule:

$\lfloor\mathrm{rSR}\rfloor$ r-Simplification rule: $X \to Y,\ U \to V \vdash U \to (V - Y)$, if $X \subseteq UV$ and $X \cap Y = \emptyset$

The main characteristic of $SL_{FD}$ logic is the simplification rule, which was originally developed to be applied to the redundancy elimination problem. The inference rules can be rewritten as equivalences that allow the specification to be reduced:

$\{X \to Y\} \equiv \{X \to (Y - X)\}$

$\{X \to Y,\ X \to U\} \equiv \{X \to YU\}$

$\{X \to Y,\ U \to V\} \equiv \{X \to Y,\ (U - Y) \to (V - Y)\}$, if $X \subseteq U$ and $X \cap Y = \emptyset$

$\{X \to Y,\ U \to V\} \equiv \{X \to Y,\ U \to (V - Y)\}$, if $X \subseteq UV$ and $X \cap Y = \emptyset$
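The four equivalences can be read as rewriting steps over a set of FDs. The following Python sketch applies them naively until nothing changes. It is only an illustration under stated assumptions: attributes are single characters, FDs are pairs of sets, and the function names `normalize`, `find_step` and `simplify` are hypothetical. It is not the algorithm of (Cordero et al., 2002) and makes no claim about its cost or about reaching a minimal specification.

```python
def normalize(fds):
    """Apply the first two equivalences: X->Y becomes X->(Y-X), FDs with an
    empty right-hand side are dropped, and FDs sharing a left-hand side merge."""
    merged = {}
    for lhs, rhs in fds:
        lhs = frozenset(lhs)
        rhs = frozenset(rhs) - lhs
        if rhs:
            merged[lhs] = merged.get(lhs, frozenset()) | rhs
    return merged  # maps left-hand side -> right-hand side


def find_step(fds):
    """Find one FD U->V that X->Y can shrink; return (U, new FD) or None."""
    for x, y in fds.items():
        for u, v in fds.items():
            if x == u:
                continue
            if x <= u:            # third equivalence (X and Y are disjoint)
                new = (u - y, v - y)
            elif x <= (u | v):    # fourth equivalence
                new = (u, v - y)
            else:
                continue
            if new != (u, v):
                return u, new
    return None


def simplify(fds):
    """Rewrite the FD set with the equivalences until no step applies."""
    current = normalize(fds)
    while True:
        step = find_step(current)
        if step is None:
            return set(current.items())
        old_lhs, new_fd = step
        rest = [(l, r) for l, r in current.items() if l != old_lhs]
        current = normalize(rest + [new_fd])


merged_example = simplify([("a", "b"), ("ab", "c")])
redundant_example = simplify([("a", "b"), ("b", "c"), ("a", "c")])
```

In the first example the pair a → b, ab → c is rewritten into the single FD a → bc. In the second, a → c is dropped because it follows from a → b and b → c. Every step strictly shrinks the specification, so the loop terminates.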

Therefore, by applying the $SL_{FD}$ rules, redundancy can be eliminated (Cordero et al., 2002), closures can be calculated (Mora et al., 2006) and the implication problem can be solved (Mora et al., 2004). This means that by simply applying the inference rules of the logic, these problems can be solved at a cost similar to that of the best algorithms in the literature. In addition, using the logic directly allows us to analyze the reasoning process.
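As a point of comparison for the closure and implication problems mentioned above, the following Python sketch shows the classical attribute-closure procedure found in database textbooks. It is explicitly not the $SL_{FD}$ closure of (Mora et al., 2006); the function names and the single-character attributes are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Classical closure: all attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD when its left-hand side is already covered
            # and it still contributes something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    """The FD lhs -> rhs follows from fds iff rhs lies in the closure of lhs."""
    return set(rhs) <= closure(lhs, fds)


fds = [("a", "b"), ("b", "c")]
a_closure = closure("a", fds)
a_implies_c = implies(fds, "a", "c")
c_implies_a = implies(fds, "c", "a")
```

This procedure answers whether an FD is implied, but it does not record which rules were applied, which is the kind of trace that working directly with the logic is meant to provide.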

4 FUNCTIONAL DEPENDENCIES IN ACTION

From the point of view of the practical implementa-

tion of FDs in relational database management sys-

tems, it is clear that FD technology has begun to be

incorporated in commercial tools. Currently, almost

all of them implement the concept of primary key and

candidate key, which is stronger than that of FDs.

At present, it is assumed that relational database designers should normalize their tables while debugging the original design, which limits the automation of the design process. If management systems directly incorporated the concept of FDs, these dependencies could be handled automatically, and in this way intelligent debugging of the DB could be carried out.

The technology for performing this process has

already been developed, but as yet it has not been

included in the management systems. In our opin-

ion the reasons for this are two-fold: ﬁrstly there is

no formal framework for the validation and manipu-

lation of FDs using the logic (Cordero et al., 2002)

and secondly there are no tools existing to gather re-

quirements which allow the user or designer to in-

clude their knowledge on FDs.

In this work we present an approach to this problem, using the formal framework provided by the $SL_{FD}$ logic and the information found in the XML dictionary of CBD. In this way, we use CBD as

a tool to capture the requirements with the form being

the main element for the user. We wish to point out

that in CBD each form is not moved directly to a ta-

ble, as is the case for the rest of the form design tools.

As we have mentioned, CBD allows a connection be-

tween various forms belonging to the same user, al-

lowing greater information control and more possi-

bilities as regards restrictions than if there were only

one key per table. This change in the approach allows

the existence of FDs among the attributes input by the

users to be deduced.

Our approach has been to apply $SL_{FD}$-based methods for FDs to the information contained in the CBD data dictionary. Specifically, we applied the algorithm presented in (Cordero et al., 2002) to eliminate redundancy in the information contained in the XML database. Currently this debugging of the restrictions stored in CBD is done outside CBD, and it is executed before CBD generates the unified relational database model. The algorithm queries the metadata specification in the CBD database and extracts the information on the FDs. This information is debugged and returned to CBD so that a model with less redundancy is generated. This debugging process is efficient and, as presented in (Cordero et al., 2002), it has a lower cost than the other algorithms in the literature that deal with the same problem.
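The extract, debug and return cycle described above can be sketched in Python as follows. The XML layout (`dictionary`, `fd`, `lhs`, `rhs` elements) is a hypothetical stand-in, since the actual schema of the CBD dictionary is not given here, and access to eXist-db is not shown. The redundancy check uses the classical closure test rather than the $SL_{FD}$ algorithm of (Cordero et al., 2002).

```python
import xml.etree.ElementTree as ET

# Hypothetical dictionary fragment; element names are assumptions.
SAMPLE = """
<dictionary>
  <fd><lhs>customer_id</lhs><rhs>name</rhs><rhs>city</rhs></fd>
  <fd><lhs>customer_id</lhs><lhs>name</lhs><rhs>city</rhs></fd>
</dictionary>
"""


def read_fds(xml_text):
    """Extract FDs as (lhs, rhs) pairs of frozensets from the XML text."""
    root = ET.fromstring(xml_text)
    return [
        (frozenset(e.text.strip() for e in fd.findall("lhs")),
         frozenset(e.text.strip() for e in fd.findall("rhs")))
        for fd in root.iter("fd")
    ]


def closure(attrs, fds):
    """Classical attribute closure (not the SL_FD closure)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def drop_redundant(fds):
    """Remove every FD that is implied by the FDs that remain."""
    kept = list(fds)
    for fd in list(kept):
        others = [f for f in kept if f is not fd]
        if fd[1] <= closure(fd[0], others):
            kept = others
    return kept


def write_fds(fds):
    """Serialize the FDs back to the same hypothetical XML layout."""
    root = ET.Element("dictionary")
    for lhs, rhs in fds:
        fd = ET.SubElement(root, "fd")
        for attr in sorted(lhs):
            ET.SubElement(fd, "lhs").text = attr
        for attr in sorted(rhs):
            ET.SubElement(fd, "rhs").text = attr
    return ET.tostring(root, encoding="unicode")


extracted = read_fds(SAMPLE)
cleaned = drop_redundant(extracted)
returned_xml = write_fds(cleaned)
```

With the sample above, the second FD is removed because its right-hand side is already determined by `customer_id` alone, and the cleaned set is written back in the same layout.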

5 CONCLUSIONS AND FUTURE WORK

In this work, we present CBD, a tool which allows

the direct participation of the users. The tool itself

has been presented and a comparison made with other

tools created for the design of forms and the generation of applications based on this design. In this comparison, it can be seen that CBD is more capable than the other tools with respect to the features considered. Nevertheless, our intention is not to bind the use of CBD to

the problem of form design, but rather to open it up to

a greater use in the area of modeling.

The prototyping and generation of applications from an interface design created by the user is a functionality offered by some solutions. Nevertheless, as

far as we know none of them manages FDs nor do

they generate UML models. In this way, the user can

generate better quality applications and we overcome

some of the limitations of other automated application

generators. This differentiates CBD from the other

tools available, as by using the forms designed by the

user, systems analysts can obtain requirements in the

initial development phases of an information system.

Therefore, CBD is extremely useful as a tool for gath-

ering requirements as it allows extracting knowledge

and exporting it for analysis and design processes.

The second contribution of this work is the use of $SL_{FD}$ logic to eliminate redundancy in the XML database, which works as the data dictionary of CBD. This efficient use is possible thanks to an inference system which not only provides soundness but also allows us to explain the reasoning followed in the execution of the application. Our goal when designing the $SL_{FD}$ logic was to clear the way for the future construction of an automatic technique that systematizes the use of the rules of the axiomatic system. In our opinion, it is not enough to know simply whether an FD is redundant or not; we also need to be able to report which FDs allow that deduction and which $SL_{FD}$ rules were used, something that is not possible with the indirect methods that are commonplace in the literature.

As regards future work, we have begun work in

two areas:

• We are working to produce a second version of

CBD that will generate web applications from its

model. This application can be deployed in the

user servers, or hosted in CBD servers in a com-

bination of PaaS (Platform as a Service) and SaaS

(Software as a Service) models.

• We wish to incorporate the debugging algorithms of the $SL_{FD}$ logic into CBD. The objective of

this integration is not only to make the tool easier

to use, but to be able to use the information the

algorithms provide on the reasoning to help the

analyst explain the overlaps in the different views

of the model.

REFERENCES

Armstrong, W. W. (1974). Dependency structures of data

base relationships. In IFIP Congress, pages 580–583.

Carrillo, A. L., Falgueras, J., Dianes, J. A., and Guevara, A.

(2008). A guided interface for web interaction. In Pro-

ceedings of ICEIS 2008 - 10TH International Confer-

ence On Enterprise Information Systems., pages 70–

77.

Codd, E. F. (1970). A relational model of data for large

shared data banks. Commun. ACM, 13(6):377–387.

Cordero, P., Enciso, M., Guzmán, I. P., and Mora, A.

(2002). Slfd logic: Elimination of data redundancy in

knowledge representation. Lecture Notes in Artiﬁcial

Intelligence, 2527:141–150.

Evaluation, B. (1000). This is a paper written by one of

the authors of this paper. In We have remove this data

following the submission guidelines for authors. Pub-

lished.

Fagin, R. (1977). Functional dependencies in a relational

database and propositional logic. IBM. Journal of re-

search and development, 21 (6):534–544.

FormLogix.com (2010). http://www.formlogix.com.

FormSpring LLC (2010). www.formspring.com.

Groupe W4 S.A (2010). http://www.lyria.com.

Guevara, A., Caro, J. L., Leiva, J. L., and Gómez, J. L.

(2007). I. comis: Cooperative methodology for in-

formation systems. In Proceedings of ENC’2007 Ad-

vances in Computer Science, pages 81–87.

Inﬁnity Box Inc. (2010). http://wufoo.com/.

Interlogy LLC (2010). http://www.jotform.com/.

Jacobson, I., Booch, G., and Rumbaugh, J. (2000). El proceso unificado de desarrollo de software. Addison Wesley.

Martin, R. (2003). Agile software development : principles,

patterns, and practices. Prentice Hall.

Mora, A., Aguilera, G., Enciso, M., Cordero, P., and

Guzmán, I. P. (2006). A new closure algorithm based

in logic: Slfd-closure versus classical closures. In-

teligencia Artiﬁcial. Revista Iberoamericana de IA,

31:31–40.

Mora, A., Enciso, M., Cordero, P., and Guzmán, I. P.

(2004). The functional dependence implication prob-

lem: optimality and minimality. An efﬁcient pre-

processing transformation based on the substitution

paradigm. Lect. Notes in Artiﬁcial Intelligence,

3040:136–146.

Nelson, H. J., Poels, G., Genero, M., and Piattini, M.

(2005). Quality in conceptual modeling: ﬁve ex-

amples of the state of the art. Data Knowl. Eng.,

55(3):237–242.

Paredaens, J. (1982). A universal formalism to express de-

compositions, functional dependencies and other con-

straints in a relational database. Theoretical Computer

Science, 19 (2):143–160.

Pressman, R. (2006). Ingenieria del Software. McGraw

Hill.

Veer West LLC (2010). www.formassembly.com.
