ing a classifier, since this seems to vary with every
small change in the training set. Second, more
features mean a longer feature extraction time. Despite
the smaller number of selected features, the
classification performance was not significantly affected.
The performance of a feature selection algorithm
can also be described in terms of the total computa-
tional time that an algorithm needs to find the optimal
feature set. Here, we only analyze the time needed by
the feature selection itself. The feature extraction step
is not taken into account. The mahal algorithm is more
than 9 times faster than SFS (984 s versus 9205 s).
By design, on each iteration SFS must repeat the entire
classification step on the training set for each
candidate feature before choosing which one to add to the
feature set. The time increases approximately exponentially
with each new feature added to the total feature set.
In contrast, the computational time of the mahal algo-
rithm increases approximately linearly as the perfor-
mance calculations are only performed on the selected
feature subsets (step 5 of the algorithm). In addition,
the algorithm automatically restricts the list of
features that have to be tested during selection by
evaluating their statistical power in advance, using
the Mahalanobis distance and Spearman’s rank-order
correlation.
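The per-iteration cost of SFS described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the function name `sequential_forward_selection`, the `score` callback and the toy weights are assumptions made for illustration, with `score` standing in for training and evaluating the classifier on the training set.

```python
from typing import Callable, List, Sequence, Tuple


def sequential_forward_selection(
    n_features: int,
    score: Callable[[Sequence[int]], float],
    n_select: int,
) -> Tuple[List[int], int]:
    """Greedy SFS sketch: returns the selected indices and the number of score calls."""
    selected: List[int] = []
    remaining = list(range(n_features))
    evaluations = 0
    while remaining and len(selected) < n_select:
        best_feature, best_score = remaining[0], float("-inf")
        # Every remaining candidate is evaluated together with the current set,
        # i.e. one full classification step per candidate and per iteration.
        for candidate in remaining:
            candidate_score = score(selected + [candidate])
            evaluations += 1
            if candidate_score > best_score:
                best_feature, best_score = candidate, candidate_score
        selected.append(best_feature)
        remaining.remove(best_feature)
    return selected, evaluations


# Hypothetical stand-in for classifier performance: a sum of per-feature weights.
toy_weights = [0.1, 0.5, 0.3, 0.2]


def toy_score(subset: Sequence[int]) -> float:
    return sum(toy_weights[i] for i in subset)


selected, evaluations = sequential_forward_selection(4, toy_score, n_select=2)
```

With four candidate features and two to select, the sketch calls `score` 4 + 3 = 7 times, each call corresponding to one full classification step.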
Our classifier achieves a performance of κ = 0.62
in distinguishing sleep from wake, which is at least as
high as that of most work published so far, while using
fewer features during classification. However, the differences
in performance obtained for different subjects are too
large to be ignored. It seems from the learning curves
that this classifier is approaching its maximum per-
formance with the currently extracted features. In
order to further improve it, new approaches seem to
be needed. These could take into account, or bet-
ter yet, compensate for subject-specific differences
in the physiological expressions of different sleep
stages. Nevertheless, the feature selection algorithm
mahal described in this paper seems well-suited for
this problem since it is stable enough to be integrated
in a cross-validation procedure, even when the data
sets are small.
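For reference, the κ statistic reported above (Cohen, 1960) compares the observed agreement p_o with the agreement p_e expected by chance: κ = (p_o − p_e)/(1 − p_e). A minimal Python sketch follows; the function name `cohen_kappa` and the toy labels are assumptions for illustration and are not data from this study.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(reference: Sequence[Hashable], predicted: Sequence[Hashable]) -> float:
    """Cohen's kappa for two label sequences of equal length."""
    if len(reference) != len(predicted) or len(reference) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(reference)
    # Observed agreement: fraction of epochs on which both labelings agree.
    p_o = sum(r == p for r, p in zip(reference, predicted)) / n
    # Chance agreement: product of the marginal class frequencies, summed over classes.
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    p_e = sum(ref_counts[c] * pred_counts[c] for c in ref_counts) / (n * n)
    if p_e == 1.0:
        # Degenerate case (a single class in both labelings); returning 0.0
        # is a convention chosen for this sketch.
        return 0.0
    return (p_o - p_e) / (1.0 - p_e)


# Toy labels, purely illustrative.
reference = ["sleep", "sleep", "wake", "wake"]
predicted = ["sleep", "wake", "wake", "wake"]
kappa = cohen_kappa(reference, predicted)
```

In this toy case p_o = 0.75 and p_e = 0.5, so κ = 0.5.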
ACKNOWLEDGEMENTS
The authors thank Dr. Reinder Haakma and Sandrine
Devot, as well as Prof. Ronald Aarts for their com-
ments and careful reading of the manuscript.
REFERENCES
Abdullah, M. (1990). On a robust correlation coefficient.
The Statistician, 39(4):455–460.
Cohen, J. (1960). A Coefficient of Agreement for Nominal
Scales. Educational and Psychological Measurement,
20(1):37–46.
Davis, J. and Goadrich, M. (2006). The relationship
between Precision-Recall and ROC curves. In Proceedings
of the 23rd International Conference on Machine
Learning, volume 10 of ICML ’06, pages 233–240,
Pittsburgh (USA). ACM Press.
Devot, S., Bianchi, A. M., Naujokat, E., Mendez, M.,
Brauers, A., and Cerutti, S. (2007). Sleep monitoring
through a textile recording system. In IEEE Engineer-
ing in Medicine and Biology Society, volume 2007,
pages 2560–2563.
Devot, S., Dratwa, R., and Naujokat, E. (2010). Sleep/wake
detection based on cardiorespiratory signals and actig-
raphy. In Annual International Conference of the
IEEE Engineering in Medicine and Biology Society
(EMBS), pages 5089–5092. IEEE.
Duda, R. O., Hart, P. E., and Stork, D. G. (2001). Pattern
Classification. Wiley, 2nd edition.
Fawcett, T. (2004). ROC Graphs: Notes and Practical
Considerations for Researchers. Technical Report
HPL-2003-4, HP Laboratories, pages 1–38.
Friedman, J. H. (2012). Regularized Discriminant Analy-
sis. Journal of the American Statistical Association,
84(405):165–175.
Haibo, H. and Garcia, E. (2009). Learning from Imbalanced
Data. IEEE Transactions on Knowledge and Data En-
gineering, 21(9):1263–1284.
Long, X., Fonseca, P., Foussier, J., Haakma, R., and Aarts,
R. (2012). Using Dynamic Time Warping for Sleep
and Wake Discrimination. In IEEE Engineering in
Medicine and Biology Society - International Confer-
ence on Biomedical and Health Informatics (BHI),
volume 25, pages 886–889, Hong Kong/Shenzhen
(China).
Provost, F., Fawcett, T., and Kohavi, R. (1998). The case
against accuracy estimation for comparing induction
algorithms. In Proceedings of the 15th International
Conference on Machine Learning, volume 445. JS-
TOR.
Rechtschaffen, A. and Bergmann, B. (1995). Sleep depri-
vation in the rat by the disk-over-water method. Be-
havioural Brain Research, 69(1-2):55–63.
Redmond, S. J., de Chazal, P., O’Brien, C., Ryan, S., Mc-
Nicholas, W. T., and Heneghan, C. (2007). Sleep
staging using cardiorespiratory signals. Somnologie,
11(4):245–256.
Whitney, A. (1971). A Direct Method of Nonparametric
Measurement Selection. IEEE Transactions on Com-
puters, C-20(9):1100–1103.
Zoubek, L., Charbonnier, S., Lesecq, S., Buguet, A.,
and Chapotot, F. (2007). Feature selection for
sleep/wake stages classification using data driven
methods. Biomedical Signal Processing and Control,
2(3):171–179.
BIOINFORMATICS 2013 - International Conference on Bioinformatics Models, Methods and Algorithms