Table 1: Average accuracy of the linear SVM and 3-NN classifiers for RUFS on discrete features obtained by UFD (Δ = 0.05 range(X_i), q = 8). L is the cumulative relevance threshold as in (1). The best accuracy is shown in bold, and the symbol * signals multi-class problems.

                     (Meyer et al., 2008)               Our Approach
                     EIB               EFB              UFD + RUFS
Dataset              SVM     3-NN      SVM     3-NN     L      SVM
SRBCT*               83.13   90.36     79.52   84.34    0.8    100.00
Leukemia1*           91.67   97.22     88.89   90.28    0.8     98.41
DLBCL                90.91   87.01     94.81   93.51    0.7     95.67
9-Tumors*            10.00   16.67     15.00   23.33    0.7     84.89
Brain Tumor1*        65.00   65.00     65.00   66.67    0.7     96.67
11-Tumors*           60.32   50.57     53.45   55.17    0.7     94.55
14-Tumors*           19.48   16.56     22.40   29.87    0.7     76.20
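The L parameter in Table 1 controls how many features RUFS retains. As a hedged sketch of how a cumulative relevance threshold can determine the selected subset (per-feature variance is used here purely as a stand-in relevance score, and `select_by_cumulative_relevance` is a hypothetical helper; the actual RUFS relevance measure is defined in the full paper), one might write:

```python
import numpy as np

def select_by_cumulative_relevance(X, L=0.8, relevance=None):
    """Return indices of the top-ranked features whose normalized
    cumulative relevance first reaches the threshold L (cf. (1))."""
    # Stand-in relevance score: per-feature variance (the actual
    # RUFS relevance measure is defined in the full paper).
    r = np.var(X, axis=0) if relevance is None else np.asarray(relevance)
    order = np.argsort(r)[::-1]           # features by decreasing relevance
    cum = np.cumsum(r[order]) / r.sum()   # normalized cumulative relevance
    m = int(np.searchsorted(cum, L)) + 1  # smallest m with cum[m-1] >= L
    return order[:m]

# Toy example: 100 samples, 50 features with decaying scale.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50)) * np.linspace(5.0, 0.1, 50)
idx = select_by_cumulative_relevance(X, L=0.8)
print(f"{len(idx)} of {X.shape[1]} features selected")
```

Raising L toward 1 admits more features; Table 1 uses L in {0.7, 0.8}.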
Figure 1 plots the accuracy (average over ten runs with different random train/test partitions) for the RUFS and RSUFS algorithms on UFD-discretized features, as functions of the average number of features m (computed by assigning values in the interval [0.6, 0.9] to the L and η parameters, respectively). The horizontal dashed lines represent the average accuracy on the original features, without and with discretization (blue and green lines, respectively).

Figure 1: Average accuracy of the linear SVM classifier (ten runs, with different random train/test partitions) for the RUFS and RSUFS algorithms on features discretized by UFD and on the original features. [Plot omitted: accuracy [%] (72 to 84) versus the number of features m (0 to 6000) on the 9-Tumors dataset; curves: Original, UFD, UFD + RUFS, UFD + RSUFS.]

The use of UFD shows an improvement of about 9% compared with the use of the original features; the use of RUFS and RSUFS further improves these results while requiring only small subsets of features.
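The UFD discretizer itself is specified in (Ferreira and Figueiredo, 2011). As a rough, hedged sketch of unsupervised per-feature discretization into q levels (a uniform equal-width quantizer stands in for the actual UFD quantizer, and `discretize_uniform` is a hypothetical helper name):

```python
import numpy as np

def discretize_uniform(X, q=8):
    """Per-feature unsupervised discretization into q equal-width bins.
    This uniform quantizer is only a stand-in sketch; the actual UFD
    quantizer is described in (Ferreira and Figueiredo, 2011)."""
    X = np.asarray(X, dtype=float)
    Xd = np.empty(X.shape, dtype=int)
    for i in range(X.shape[1]):
        col = X[:, i]
        lo, hi = col.min(), col.max()
        if hi > lo:
            # Map [lo, hi] onto the integer levels 0 .. q-1.
            Xd[:, i] = np.minimum(((col - lo) / (hi - lo) * q).astype(int),
                                  q - 1)
        else:
            Xd[:, i] = 0  # constant feature: a single level
    return Xd

# Toy example: discretize 20 samples of 5 continuous features.
rng = np.random.default_rng(1)
Xd = discretize_uniform(rng.normal(size=(20, 5)), q=8)
print(Xd.min(), Xd.max())
```

Any classifier that accepts integer-valued inputs (e.g. the linear SVM and 3-NN used above) can then be trained on the discretized matrix.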
5 CONCLUSIONS
In this paper, we have proposed unsupervised methods for feature discretization and feature selection, suited to microarray gene expression datasets. The proposed methods follow a filter approach based on relevance and relevance/similarity analysis, and are computationally efficient in both time and space. Moreover, they are equally applicable to binary and multi-class problems, in contrast with many previous approaches, which perform poorly on multi-class problems. Our experimental results on public-domain datasets show that our techniques are competitive with previous discretization approaches. As future work, we plan to devise supervised versions of the proposed discretization and selection methods.
REFERENCES
Bolon-Canedo, V., Seth, S., Sanchez-Marono, N., Alonso-Betanzos, A., and Principe, J. (2011). Statistical dependence measure for feature selection in microarray datasets. In 19th European Symposium on Artificial Neural Networks (ESANN 2011), pages 23–28, Belgium.
Dougherty, J., Kohavi, R., and Sahami, M. (1995). Supervised and unsupervised discretization of continuous features. In International Conference on Machine Learning (ICML'95), pages 194–202. Morgan Kaufmann.
Escolano, F., Suau, P., and Bonev, B. (2009). Information
Theory in Computer Vision and Pattern Recognition.
Springer.
Ferreira, A. and Figueiredo, M. (2011). Unsupervised joint feature discretization and selection. In 5th Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA 2011), LNCS 6669, pages 200–207, Las Palmas de Gran Canaria, Spain.
Guyon, I. and Elisseeff, A. (2003). An introduction to vari-
able and feature selection. Journal of Machine Learn-
ing Research, 3:1157–1182.
Guyon, I., Gunn, S., Nikravesh, M., and Zadeh, L., editors (2006). Feature Extraction: Foundations and Applications. Springer.
Guyon, I., Weston, J., and Barnhill, S. (2002). Gene se-
lection for cancer classification using support vector
machines. Machine Learning, 46:389–422.
Meyer, P., Schretter, C., and Bontempi, G. (2008).
Information-theoretic feature selection in microarray
data using variable complementarity. IEEE Journal
of Selected Topics in Signal Processing (Special Is-
sue on Genomic and Proteomic Signal Processing),
2(3):261–274.
Peng, H., Long, F., and Ding, C. (2005). Feature selec-
tion based on mutual information: Criteria of max-
dependency, max-relevance, and min-redundancy.
IEEE Transactions on Pattern Analysis and Machine
Intelligence, 27(8):1226–1238.
Saeys, Y., Inza, I., and Larrañaga, P. (2007). A review of feature selection techniques in bioinformatics. Bioinformatics, 23(19):2507–2517.
KDIR 2011 - International Conference on Knowledge Discovery and Information Retrieval