output is the finding that, to achieve better results with normalized data, the output values used for learning (initially 0 and 1) must be reduced by the reduction factor. Its appropriate setting, found by a genetic algorithm, was 0.2 or 0.3 (see the Reduction factor row in Table 1).
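A minimal sketch of one plausible reading of this reduction is given below; the rule of scaling the unit target down and the function name reduce_targets are assumptions, since the text only states that the 0/1 teacher values are reduced by the factor.

```python
import numpy as np

def reduce_targets(targets: np.ndarray, reduction_factor: float) -> np.ndarray:
    # One plausible reading of "reduced by the reduction factor": scale the
    # nominal 0/1 teacher outputs down, so with factor 0.2 a target of 1
    # becomes 0.8 while 0 stays 0. The exact rule is an assumption.
    return targets * (1.0 - reduction_factor)

targets = np.array([0.0, 1.0, 1.0, 0.0])
print(reduce_targets(targets, 0.2))  # -> [0.  0.8 0.8 0. ]
```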
The different numbers of 2D cells across scenarios also show that the model with the extended input set achieved its best results with fewer cells in the 2D layer than the model with the standard input. The learning and overall performance of the model with extended input were thus more efficient.
The sample visual outputs of the 2D network shown in Figure 4 demonstrate the effect of the modified learning. Cells representing patterns rated 0 are light grey; those representing patterns rated 1 are black. Cells containing input patterns of only one category get a "clean" color, while cells containing patterns of different categories get a mixed color reflecting the number of patterns of each output category in the cell. The influence of the additional output information on the final network configuration is quite apparent (right): the fragmentation produced by the classical learning algorithm (left) almost did not occur.
Figure 4: Influence of modified learning algorithm (Jelínek,
2018).
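The color-mixing rule just described can be sketched as follows; the concrete grey levels (0.8 for light grey, 0.0 for black, 1.0 for an empty cell) and the helper name cell_grey_level are illustrative assumptions, not values taken from the paper.

```python
def cell_grey_level(count_cat0: int, count_cat1: int,
                    grey_cat0: float = 0.8, grey_cat1: float = 0.0) -> float:
    # Mix the cell color in proportion to the category counts: a cell
    # holding only category-0 patterns gets the light-grey level, a cell
    # holding only category-1 patterns gets black, and mixed cells fall
    # in between according to the share of each category.
    total = count_cat0 + count_cat1
    if total == 0:
        return 1.0  # empty cell rendered white (illustrative choice)
    w1 = count_cat1 / total
    return (1.0 - w1) * grey_cat0 + w1 * grey_cat1

print(cell_grey_level(3, 1))  # mostly category 0 -> light grey (0.6)
```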
The results can still be improved by using a hierarchical modification of the model; the added value, however, is not high in this case (a quality improvement of 1.71%). The question is whether the data used was appropriate to maximize it: the training set was generated with evenly represented categories, but the benefit could be more significant for data with unequally represented output categories or different a priori category probabilities.
The second group of experiments was aimed at comparing the model's results with other approaches on standard datasets obtained from the UCI Machine Learning Repository (Dheeru and Taniskidou, 2017). Three sets were selected, differing in focus and in the number of input and output attributes.
The first set is the Letter Recognition Data Set (Lett) presented in (Frey and Slate, 1991). The input data derive from black-and-white images of the 26 letters of the alphabet written in 20 different fonts; additional images were obtained by inserting random noise into the existing ones. Each input vector contains 16 values computed from the one-bit raster image; the output is a single value - the letter depicted in the image. The set contains a total of 20 000 images. A rule-based classification method is also presented in (Frey and Slate, 1991); its best rate of correctly classified patterns is 82.7%.
To verify the presented model, two disjoint sets were created - one for training (16000 patterns) and one for testing (4000 patterns). The output was recoded into 26 single-bit attributes (for each letter, exactly one output value is one and the others are zero). Since the output is, unlike in the model description, not $B^1$ but $B^{26}$, it was necessary to select a winning output attribute; the attribute with the highest output value wins. Similar procedures were also used in the experiments described below.
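A minimal sketch of this encoding and winner selection is given below, assuming NumPy arrays; the function names encode_letter and winning_letter are illustrative, not from the paper.

```python
import numpy as np

def encode_letter(letter: str) -> np.ndarray:
    # One-hot encode a letter 'A'..'Z' into 26 single-bit attributes:
    # exactly one value is one, the others are zero.
    code = np.zeros(26)
    code[ord(letter) - ord('A')] = 1.0
    return code

def winning_letter(network_output: np.ndarray) -> str:
    # The winning attribute is the one with the highest output value.
    return chr(ord('A') + int(np.argmax(network_output)))

print(encode_letter('C'))                 # 1.0 at index 2, zeros elsewhere
output = np.random.rand(26)               # stand-in for the 26 model outputs
print(winning_letter(output))
```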
Experiments with this dataset did not reach the expected values and did not achieve the reference value reported in (Frey and Slate, 1991). The main reason is difficult to identify, but it may lie in the limits of the model's parameters, especially in conjunction with the nonlinear $R^{16} \rightarrow B^{26}$ transformation. The inputs were pre-processed here, which reduced their number but also increased the demands on the generalization capabilities of the model.
The second dataset was the Semeion Handwritten Digit Data Set (Sem) described and used in (Buscema, 1998). The set contains 1593 digital black-and-white images at a 16x16 resolution. The output is a single value identifying the digit. The article also reports the best classification rate, achieved by a combination of several neural networks, at 93.09%.
The input for our model consisted of 256 binary attributes (the 16x16 picture) and the output of ten binary values with the same meaning as in the previous set, using the same criterion for selecting the best output. For training the model, 1200 patterns were randomly selected; the remaining 393 were used for testing (as recommended).
The results obtained with this dataset met the expectations. The model was able to exceed the reference classification quality (Buscema, 1998), and the influence of the extended input data on the learning outcomes was also positive.
The last dataset used was designed for testing the accurate detection of room occupancy from data obtained from several types of sensors. The set was