the IS project; the locale; and an incidence of the evolution path in each of the development decisions.
Thus, we found 213 development decisions in which ISPI evolution paths were present. The data set was then converted
into a data matrix based on the presence of a specific
feature. For a single development decision, called a sample, the maximum number of ISPI evolution paths was four. The data consisted of 26 binary
variables: 14 variables for ISPI evolution paths
(“wall technique/wall picture and entity analysis” to
“operating environment tools (different tools for different environments)”), three variables for three
locales, three variables for four time generations,
four variables for the four ISPI categories, and one
variable for internally or externally developed
ISPIs. The presence of a feature was denoted by 1 and its absence by 0 (cf. Ein-Dor and Segev, 1993).
(ISPI time generation one was left out due to lack of
data).
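The conversion into a binary presence/absence matrix can be illustrated with a small sketch (the feature names and decisions below are hypothetical toy values; the study's full matrix has 213 samples and 26 variables):

```python
import numpy as np

# Hypothetical feature variables and development decisions (samples).
# Each decision is the set of features observed in it.
features = ["EVO1_wall_technique", "EVO2_process_modelling", "locale_A"]
decisions = [
    {"EVO1_wall_technique", "locale_A"},
    {"EVO2_process_modelling"},
]

# 1 denotes the presence of a feature, 0 its absence.
matrix = np.array(
    [[1 if f in d else 0 for f in features] for d in decisions]
)
print(matrix)
# → [[1 0 1]
#    [0 1 0]]
```

Each row is one sample (development decision) and each column one binary variable, exactly the layout a SOM can be trained on.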
From these 26 variables, 14 were selected as independent variables, which were used to explain the remaining 12 dependent variables. The
independent variables were (1) Description methods
EVO1: wall technique/wall picture and entity
analysis, (2) Description methods EVO2: methods
for strategic development, denoted as process
modelling approaches, (3) Description methods
EVO3: design methods and techniques, such as
OMT, OO etc., (4) Project management and control
procedures EVO1: phase models, (5) Project
management and control procedures EVO2: project
instructions and management, (6) Project
management and control procedures EVO3:
standards and instructions, (7) Development tools
EVO1: Carelia, Visual Basic, Carel, etc., (8)
Development tools EVO2: ADW, S-designer,
power-designer, (9) Development tools EVO3: data
communications tools, (10) Development tools
EVO4: database handling tools and databases, (11)
Technology innovations EVO1: programming
languages, (12) Technology innovations EVO2:
query languages, (13) Technology innovations
EVO3: modular computing (programming
procedures and techniques), and (14) Technology innovations EVO4: operating environment tools. The selection of the independent and dependent variables was based on our research question.
The variation in the dependencies in the ISPI
evolution paths was modelled with the component
plane and the U-matrix (unified distance matrix)
representations of the Self-Organizing Map (SOM)
(Kohonen, 1989, 1995; Ultsch and Siemon, 1990).
The SOM is a vector quantisation method that maps patterns from an input space V_I onto a typically lower-dimensional map space V_M such that the topological relationships between the inputs are preserved. This means that inputs which are close to each other in the input space tend to be represented by units (codebooks) close to each other in the map space, which is typically a one- or two-dimensional discrete lattice of codebooks. The
codebooks consist of the weight vectors with the
same dimensionality as the input vectors. The
training of the SOM is based on unsupervised
learning, meaning that the learning set does not
contain any information about the desired output for
the given input; instead, the learning scheme tries to capture emergent collective properties and regularities in the learning set. This makes the SOM
especially suitable for our type of data where the
main characteristics emerging from the data are of
interest, and the topology-preserving tendency of the
map allows easy visualisation and analysis of the
data.
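As an illustration, a SOM of this kind can be sketched from scratch in NumPy (a toy version on hypothetical binary data with a 1-D map lattice, a Gaussian neighbourhood, and the standard iterative update rule; this is not the implementation used in the study):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary input data (samples x dimensions) and a small 1-D map.
X = rng.integers(0, 2, size=(50, 5)).astype(float)
n_units, dim = 10, X.shape[1]
W = rng.random((n_units, dim))          # codebook weight vectors
coords = np.arange(n_units)             # unit positions on the map lattice

alpha, sigma, n_steps = 0.5, 2.0, 500   # gain, neighbourhood width, steps
for n in range(n_steps):
    x = X[rng.integers(len(X))]         # pick a sample x(n)
    b = np.argmin(np.linalg.norm(x - W, axis=1))   # best matching unit
    # Gaussian neighbourhood h_{i,b}(n) on the map lattice.
    h = np.exp(-((coords - coords[b]) ** 2) / (2 * sigma ** 2))
    # Update rule: w_i(n+1) = w_i(n) + alpha(n) h_{i,b}(n) (x(n) - w_i(n)).
    W += alpha * h[:, None] * (x - W)
    alpha *= 0.995                      # decaying adaptation gain
    sigma = max(0.5, sigma * 0.995)     # shrinking neighbourhood
```

Because each update pulls the best matching unit and its lattice neighbours toward the same input, neighbouring units end up with similar weight vectors, which is the topology-preserving tendency that makes component-plane and U-matrix visualisations readable.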
Training of the SOM can be either iterative or batch based. In the iterative approach a sample, input vector x(n) at step n, from the input space V_I is picked and compared against the weight vector w_i of the codebook with index i in the map V_M. The best matching unit b (bmu) for the input pattern x(n) is selected using some metric-based criterion, such as ||x(n) - w_b|| = min_i ||x(n) - w_i||, where the double vertical bars denote the Euclidean vector norm. The weights of the best matching unit and the units in its topological neighbourhood are then updated towards x(n) with the rule w_i(n+1) = w_i(n) + α(n) h_i,b(n) (x(n) - w_i(n)), where i ∈ V_M and 0 ≤ α(n) ≤ 1 is a scalar-valued adaptation gain. The neighbourhood function h_i,b(n) gives the excitation of unit i when the best matching unit is b. A typical choice for h_i,b(n) is a
Gaussian function. In batch training the gradient is
computed for the entire input set and the map is
updated toward the estimated optimum for the set.
Unlike with the iterative training scheme, the map
can reach an equilibrium state where all units are
exactly at the centroids of their regions of activity
(Kohonen, 1995). In practice, batch training can be realised with a two-step iterative process. First, each input sample is assigned its best matching unit. Second,
the weights are updated with
w_i = Σ_x h_i,b(x) x / Σ_x h_i,b(x), where the sums run over all input samples x and b(x) is the best matching unit of x. When using batch training, usually a few iterations over the training set are sufficient for convergence. In our experience we
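The two-step batch update described above can be sketched as follows (continuing the toy NumPy setting with hypothetical binary data; not the study's actual code):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(40, 4)).astype(float)  # toy binary samples
n_units = 8
W = rng.random((n_units, X.shape[1]))   # codebook weight vectors
coords = np.arange(n_units)             # 1-D map lattice
sigma = 1.5                             # neighbourhood width

for _ in range(10):                     # a few passes usually suffice
    # Step 1: assign each sample its best matching unit b(x).
    d = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2)
    bmu = np.argmin(d, axis=1)
    # Step 2: w_i = sum_x h_{i,b(x)} x / sum_x h_{i,b(x)}.
    h = np.exp(-((coords[:, None] - coords[bmu][None, :]) ** 2)
               / (2 * sigma ** 2))      # shape (units, samples)
    W = (h @ X) / h.sum(axis=1, keepdims=True)
```

After step 2 every unit sits at the neighbourhood-weighted centroid of the samples assigned near it, which is the equilibrium property noted above for batch training.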