Table 3: Results on DS1.
measures    approach1           approach2
            train     test      train     test
Err1        0.132     0.201     0.172     0.227
Err2        0.059     0.109     0.081     0.142
Err         0.096     0.156     0.126     0.185
OC          0.904     0.844     0.874     0.815
BC          0.868     0.799     0.828     0.773
BPC         0.937     0.879     0.911     0.847
WE          0.86      0.77      0.812     0.73
Table 4: Results on DS2.
measures    approach1           approach2
            train     test      train     test
Err1        0.184     0.247     0.22      0.314
Err2        0.038     0.069     0.04      0.066
Err         0.091     0.134     0.105     0.154
OC          0.909     0.866     0.895     0.846
BC          0.816     0.753     0.78      0.686
BPC         0.924     0.860     0.917     0.855
WE          0.83      0.749     0.801     0.704
eters influencing the classifier performance. The results of different map sizes on DS1 are given in Table 6. As the map grid is enlarged from 4 x 3 to 14 x 11, the classifier improves on both the training and the generalization error. When even more neurons (27 x 23) are used, the training error improves slightly while the generalization error degrades significantly. The results on DS2 and DS3 are shown in Table 7 and Table 8, respectively. Empirically, a medium-sized map grid is the most suitable for model construction: too few units are inadequate to represent the patterns, while too many units lead to overfitting.
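As an illustration of such a map-size sweep, the following minimal sketch trains a basic LVQ1 codebook (a simplification, not the hybrid LVQ of this paper) with a small, a medium, and a large number of units, and reports training and test errors. The data set, grid sizes, and hyperparameters are placeholders chosen only for illustration.

# Minimal LVQ1 sketch (not the paper's hybrid LVQ): train a codebook of
# rows*cols prototypes and compare training vs. test error for three grids.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

def train_lvq1(X, y, n_units, n_epochs=30, lr0=0.3, seed=0):
    """Basic LVQ1: prototypes initialised on random training samples."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=n_units, replace=False)
    protos, proto_labels = X[idx].copy(), y[idx].copy()
    for epoch in range(n_epochs):
        lr = lr0 * (1.0 - epoch / n_epochs)          # linearly decaying rate
        for i in rng.permutation(len(X)):
            w = np.argmin(np.linalg.norm(protos - X[i], axis=1))
            sign = 1.0 if proto_labels[w] == y[i] else -1.0
            protos[w] += sign * lr * (X[i] - protos[w])  # attract or repel
    return protos, proto_labels

def lvq_error(X, y, protos, proto_labels):
    """Error rate of the nearest-prototype classification rule."""
    d = np.linalg.norm(X[:, None, :] - protos[None, :, :], axis=2)
    return float(np.mean(proto_labels[np.argmin(d, axis=1)] != y))

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for rows, cols in [(4, 3), (14, 11), (27, 23)]:      # small / middle / big
    protos, labels = train_lvq1(X_tr, y_tr, n_units=rows * cols)
    print(f"{rows}x{cols}: train={lvq_error(X_tr, y_tr, protos, labels):.3f}, "
          f"test={lvq_error(X_te, y_te, protos, labels):.3f}")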
Table 9 shows the results obtained on balanced and unbalanced data sets for the hybrid LVQ and several well-known classification methods: ZeroR (a baseline that simply predicts the majority class of the training data), VFI (voting feature intervals classifier), SMO (the sequential minimal optimization algorithm implementing a support vector machine), k-nearest neighbors (KNN, with the best value of k chosen between 1 and 10), Naive Bayes, and the C4.5 decision tree. Among the seven algorithms, VFI and SMO perform poorly on all data sets, only slightly better than ZeroR, followed by KNN and Naive Bayes. The hybrid LVQ outperforms these five algorithms and comes close to C4.5 in terms of the error and accuracy measures. However, LVQ is a projection method as well as a classification approach: its ability to reveal class structure through map visualization makes it a useful tool in data mining tasks.
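As a rough illustration of such a comparison, the sketch below uses scikit-learn stand-ins for the classifiers named above (DummyClassifier for ZeroR, SVC for SMO, a grid-searched KNN for k between 1 and 10, GaussianNB for Naive Bayes, and an entropy-based DecisionTreeClassifier in place of C4.5; VFI has no direct scikit-learn counterpart) on a synthetic data set rather than DS1-DS3.

# Hedged sketch of a baseline comparison with scikit-learn stand-ins.
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "ZeroR": DummyClassifier(strategy="most_frequent"),
    "SMO (SVM)": SVC(kernel="rbf"),
    "KNN (k in 1..10)": GridSearchCV(KNeighborsClassifier(),
                                     {"n_neighbors": list(range(1, 11))}, cv=5),
    "Naive Bayes": GaussianNB(),
    "C4.5-like tree": DecisionTreeClassifier(criterion="entropy"),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(f"{name}: test accuracy = {model.score(X_te, y_te):.3f}")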
Table 5: Results on DS3.
measures    approach1           approach2
            train     test      train     test
Err1        0.209     0.297     0.275     0.417
Err2        0.024     0.033     0.028     0.432
Err         0.076     0.108     0.097     0.152
OC          0.924     0.892     0.903     0.848
BC          0.791     0.703     0.725     0.583
BPC         0.929     0.894     0.909     0.845
WE          0.82      0.748     0.771     0.647
Table 6: Results of different map sizes on DS1.
measures    small: 4x3          middle: 14x11       big: 27x23
            train     test      train     test      train     test
Err1        0.3       0.29      0.13      0.2       0.07      0.23
Err2        0.06      0.08      0.06      0.11      0.07      0.18
Err         0.18      0.19      0.1       0.16      0.07      0.2
OC          0.82      0.81      0.9       0.84      0.93      0.78
BC          0.7       0.71      0.87      0.8       0.93      0.77
BPC         0.92      0.91      0.94      0.88      0.93      0.82
WE          0.73      0.73      0.86      0.77      0.9       0.71
Table 7: Results of different map sizes on DS2.
measures    small: 8x6          middle: 15x13       big: 30x26
            train     test      train     test      train     test
Err1        0.33      0.37      0.18      0.25      0.1       0.27
Err2        0.03      0.04      0.04      0.07      0.04      0.11
Err         0.14      0.16      0.09      0.13      0.06      0.17
OC          0.87      0.84      0.91      0.87      0.94      0.83
BC          0.67      0.63      0.82      0.75      0.9       0.73
BPC         0.94      0.92      0.92      0.86      0.93      0.78
WE          0.74      0.7       0.83      0.75      0.89      0.69
Table 8: Results of different map sizes on DS3.
measures    small: 8x7          middle: 16x13       big: 31x27
            train     test      train     test      train     test
Err1        0.36      0.4       0.21      0.3       0.11      0.31
Err2        0.02      0.03      0.02      0.03      0.02      0.06
Err         0.12      0.14      0.08      0.11      0.05      0.14
OC          0.88      0.86      0.92      0.89      0.95      0.87
BC          0.64      0.6       0.79      0.7       0.89      0.69
BPC         0.92      0.87      0.93      0.89      0.93      0.81
WE          0.72      0.67      0.82      0.75      0.89      0.69
Figure 6 presents a generated map of the LVQ, with the class labels shown on the left and the histogram of the class distribution on the right. In this visualization, the healthy companies are projected onto neurons in the middle of the map grid, while the bankrupt companies are projected onto the surrounding neurons.
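A minimal sketch of this kind of map view, assuming a rows x cols grid of unit labels (synthetic here, not the map of Figure 6), could look as follows.

# Sketch of a label map (left) and class-distribution histogram (right),
# mimicking the layout described for Figure 6; the label grid is synthetic.
import numpy as np
import matplotlib.pyplot as plt

rows, cols = 14, 11
yy, xx = np.mgrid[0:rows, 0:cols]
border = np.minimum.reduce([yy, xx, rows - 1 - yy, cols - 1 - xx])
labels = (border < 2).astype(int)        # 0 = healthy (centre), 1 = bankrupt (border)

fig, (ax_map, ax_hist) = plt.subplots(1, 2, figsize=(8, 4))
ax_map.imshow(labels, cmap="coolwarm")   # unit labels on the left
ax_map.set_title("unit labels (0 = healthy, 1 = bankrupt)")
ax_hist.bar(["healthy", "bankrupt"], np.bincount(labels.ravel(), minlength=2))
ax_hist.set_title("class distribution")  # histogram on the right
plt.tight_layout()
plt.show()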
5 CONCLUSIONS
In this paper, a hybrid LVQ algorithm is presented to solve the bankruptcy prediction problem. To alleviate the curse of dimensionality, ICA is used as a preprocessing tool to eliminate the dimensions