
Table 2: Intervals estimated for the attributes

Parameter   CRCCRT      GENERAL
ADMPPS      -           [0.5, 1.0]
ASIMDY      [5, 15]     [10, 120]
DEVPRT      -           [0.5, 0.95]
HIREDY      [5, 10]     [5, 40]
INUDST      -           [0.2, 1.0]
MXSCDX      [1, 1.2]    -
TRNSDY      [5, 10]     -
TRPNHR      -           [0.05, 0.4]
UNDEST      -           [0.05, 0.6]

put variables: necessary effort to carry out a project
(technician-days), development time (days) and quality
(average number of mistakes per task). Specifically, the
attributes whose values we want to know are: average delay
in hiring, average delay in the adaptation of new
technicians, average delay in carrying out a dismissal and
the maximum percentage of delay allowed in delivery time.
5.1 Description of the Databases
To carry out a study with GAR, we have simulated an
already finished project, whose initial values are given
in table 1, following two strategies; we have therefore
generated two databases. The first one, which we will
call CRCCRT, has been generated by establishing a fast
hiring policy with initial restrictions on delivery time
(see footnote 5).
The second one, which we will call GENERAL, has been
generated with a less restrictive policy: we have included
in the simulation a greater number of attributes to be
estimated and we have expanded the ranges of the values of
the output variables that indicate that a project is good.
In table 1, we show the input attributes and the out-
put variables used in the simulation together with a
brief description, the unit in which they are measured
and the initial value. In table 2, we show the intervals
used for each one of the attributes with some level
of uncertainty, for both databases. As can be seen,
in GENERAL the intervals for attributes ASIMDY and HIREDY
cover the entire range possible in the development
organization. This is because, with the GENERAL database,
we intend to analyze the influence that a greater number
of attributes has on the project.
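As an illustration of how the intervals in table 2 could drive the generation of a database, the following minimal sketch draws random input scenarios for GENERAL. It assumes uniform sampling within each interval, which the text does not state; the names `GENERAL_INTERVALS` and `sample_scenario` are hypothetical, and the project simulator itself is not shown.

```python
import random

# Intervals for the GENERAL database, copied from table 2.
GENERAL_INTERVALS = {
    "ADMPPS": (0.5, 1.0),
    "ASIMDY": (10, 120),
    "DEVPRT": (0.5, 0.95),
    "HIREDY": (5, 40),
    "INUDST": (0.2, 1.0),
    "TRPNHR": (0.05, 0.4),
    "UNDEST": (0.05, 0.6),
}


def sample_scenario(intervals, rng):
    """Draw one value per attribute (uniform sampling is an assumption)."""
    return {name: rng.uniform(low, high) for name, (low, high) in intervals.items()}


rng = random.Random(0)  # fixed seed so the sketch is reproducible
scenarios = [sample_scenario(GENERAL_INTERVALS, rng) for _ in range(5)]
```

Each sampled scenario would then be fed to the simulator to obtain one record of the database.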
In table 3 we show the cut values, that is, the values
considered good by the project manager for both databases,
CRCCRT and GENERAL. We have defined one cut for CRCCRT and
two cuts or percentages for GENERAL, which generate certain
values for the output variables. These cuts are applied
only to some of the output variables (except for cut 1 in
GENERAL, which is applied to the three variables). Each of
these cuts establishes different correct scenarios. The cut
values indicate the goals we intend to achieve. For
example, the goal of the cut in CRCCRT is to obtain
management rules that keep the delivery time and the
quality of the project below the indicated values,
independently of the value obtained for the effort
necessary to carry out the project.

Footnote 5: Fast hiring implies that hiring (HIREDY),
dismissal (TRNSDY) and adaptation of new technicians
(ASIMDY) have to be carried out quickly (MXSCDX), that is
to say, in a short period of time (see table 2).

Together with the cut value, we show the maximum
percentage over the initial value estimated by the project
manager that the variable must not exceed for the project
to conform to the initial estimations. For example, in the
cut of CRCCRT we consider good values for delivery time to
be those between the initial estimation (320 days) and a
permitted margin of 10% over that estimation (352 days),
independently of the value obtained for the cost of the
project. The same table also gives the number of cases that
have been categorised as acceptable. As can be seen, and as
could be expected from the outset, the number of cases
decreases as the restrictions on the project increase. We
can deduce that too many restrictions could lead to a low,
or even zero, probability of carrying out an acceptable
development project. For example, imposing a restriction on
effort (JBSZMD) for the CRCCRT database would reduce the
number of cases to practically none.
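The arithmetic behind a cut, and the way extra restrictions shrink the set of acceptable cases, can be sketched as follows. This is a minimal illustration, not the procedure used in the study: only the 320-day estimate and the 10% margin come from the text, while the sample cases, the effort limit and the function names are hypothetical. For simplicity the sketch checks only the upper limit of each variable.

```python
def cut_value(initial, margin_pct):
    """Upper limit for a variable: initial estimate plus a margin in percent."""
    return initial * (100 + margin_pct) / 100


def is_acceptable(case, cuts):
    """A case is acceptable when every restricted variable is at or below its cut."""
    return all(case[var] <= limit for var, limit in cuts.items())


# Delivery time cut for CRCCRT: 320 days with a 10% margin (values from the text).
time_cut = cut_value(320, 10)

# Hypothetical simulated cases; the figures are illustrative only.
cases = [
    {"SCHCDT": 340, "JBSZMD": 2500},
    {"SCHCDT": 351, "JBSZMD": 1900},
    {"SCHCDT": 360, "JBSZMD": 1800},
]

only_time = {"SCHCDT": time_cut}
# The effort limit below is hypothetical; the text gives no value for it.
time_and_effort = {"SCHCDT": time_cut, "JBSZMD": 2000}

n_time = sum(is_acceptable(c, only_time) for c in cases)
n_both = sum(is_acceptable(c, time_and_effort) for c in cases)
```

Adding a restriction can only keep or reduce the number of acceptable cases, since every case that satisfies both limits also satisfies the delivery time limit alone.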
5.2 Analysis of CRCCRT
The database CRCCRT has been generated by imposing, in
the simulation, restrictions on the attributes related to
personnel hiring, so that hiring is fast, and also by
imposing strong initial restrictions on delivery time.
With the strategy followed in the generation of this
database, we intend to find the values of the attributes
related to personnel management that make it possible to
obtain good results for delivery time and to maintain
acceptable levels of project quality, independently of the
value obtained for the effort necessary to carry out the
project.
5.2.1 Time and Quality
This cut induces a set of rules on the input attributes
for this database that fulfil only the restrictions on
delivery time (SCHCDT) and quality (ANERPT), according to
the cuts established in table 3.
The association rules discovered, where the con-
sequent is formed by the intervals of the variables
APPLYING DATA MINING TO SOFTWARE DEVELOPMENT PROJECTS: A CASE STUDY