variables were selected for all optimal subsets; all
the discriminations reached less than 7.5% LOOR
and were very significant. The 1st and 2nd cases show
that this approach is applicable to lower-dimensional
variable spaces (p=40; N=80) as well as
high-dimensional ones (3rd and 4th cases). As shown
in figure 1 for p=79 and in figure 3, the LOOR is much
larger when the 2nd step of VSS is applied alone
(green line) to all original variables than the LOOR
achieved when both steps are applied jointly (blue
line). Although the LOOR increase in the absence of
the 1st step is less evident for p=40 (figure 1), the
optimal solution is still achieved when both steps are
applied jointly. Therefore, it can be concluded from
these results that the proposed algorithm reduces the
number of variables efficiently as well as decreases
the discrimination error.
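The comparison above depends on estimating a leave-one-out error for each candidate variable subset. The following is a minimal sketch of such an estimate, assuming that LOOR denotes the leave-one-out error rate of a two-class linear discriminant. It is not the authors' VSS implementation: the function name `loo_error_rate`, the synthetic data, and the choice of the first two columns as the "relevant" subset are hypothetical and serve only as illustration.

```python
import numpy as np

def loo_error_rate(X, y):
    """Leave-one-out error rate of a two-class linear discriminant.

    X: (n, p) data matrix; y: (n,) labels in {0, 1}.
    Hypothetical helper, not taken from the paper.
    """
    n = X.shape[0]
    errors = 0
    for i in range(n):
        train = np.arange(n) != i            # hold out sample i
        Xtr, ytr = X[train], y[train]
        m0 = Xtr[ytr == 0].mean(axis=0)
        m1 = Xtr[ytr == 1].mean(axis=0)
        # Pooled within-group covariance (equal-covariance assumption)
        R0 = Xtr[ytr == 0] - m0
        R1 = Xtr[ytr == 1] - m1
        Sw = (R0.T @ R0 + R1.T @ R1) / (len(ytr) - 2)
        w = np.linalg.pinv(Sw) @ (m1 - m0)   # discriminant direction
        threshold = w @ (m0 + m1) / 2        # midpoint between projected means
        pred = int(X[i] @ w > threshold)
        errors += int(pred != y[i])
    return errors / n

# Synthetic illustration (assumed data): only the first 2 of 40 variables
# carry group information.
rng = np.random.default_rng(0)
n, p = 80, 40
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, p))
X[y == 1, :2] += 6.0

loor_all = loo_error_rate(X, y)              # all original variables
loor_subset = loo_error_rate(X[:, :2], y)    # small candidate subset
print(f"LOOR, all variables: {loor_all:.3f}; subset: {loor_subset:.3f}")
```

A subset search would call such a function once per candidate subset and keep the subset with the lowest estimate.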
Real BCI data results, in figure 2, show three
good discrimination cases (LOOR lower than 16%)
for three different subjects. The presence of just a few
relevant variables in these BCI datasets seems likely,
since only 4 (subject JF) and 7 (subject JI) variables out of
72 were selected for the optimal subset in figure 3.
As suggested in the literature (Babiloni, 1999), for
all cases but one in figure 2, the best time window
for classification appears to be the first second after
the cue.
Our findings show a novel means to down-select
variables in BCI that achieves both
discriminative power and dimensionality reduction.
Such a strategy is valuable in decreasing the
computational complexity of neural prosthetic
applications.
ACKNOWLEDGEMENTS
N. S. Dias is supported by the Portuguese
Foundation for Science and Technology under Grant
SFRH/BD/21529/2005 and Center Algoritmi. S. J.
Schiff and M. Kamrunnahar were supported by a
Keystone Innovation Zone Grant from the
Commonwealth of Pennsylvania, and S. J. Schiff by
NIH grant K02MH01493.
REFERENCES
Babiloni, C., Carducci, F., Cincotti, F., Rossini, P.M.,
Neuper, C., Pfurtscheller G., Babiloni, F., 1999.
Human movement-related potentials vs
desynchronization of EEG alpha rhythm: A
high-resolution EEG study. In NeuroImage, vol. 10, pp.
658-665.
Dias, N.S., Kamrunnahar, M., Mendes, P.M., Schiff, S.J.,
Correia, J.H. Customized Linear Discriminant
Analysis for Brain-Computer Interfaces. In Proc.
CNE '07 IEEE/EMBS 2-5 May 2007, pp. 430-433.
Dias, N.S., Kamrunnahar, M., Mendes, P.M., Schiff, S.J.,
Correia, J.H. Comparison of EEG Pattern
Classification Methods for Brain-Computer Interfaces.
In Proc.29th EMBC 22-26 Aug 2007, pp.2540-2543.
Dillon, W.R., Mulani, N., Frederick, D.G., 1989. On the
Use of Component Scores in the Presence of Group
Structure. Journal of Consumer Research,
vol. 16, pp. 106-112.
Duda, R.O., Hart, P.E., Stork, D.G., 2000. Pattern
Classification. Wiley.
John, G.H., Kohavi, R., Pfleger, K., 1994. Irrelevant
Features and the Subset Selection Problem. In Proc.
11th Int. Conf. Machine Learning, 121-129.
Jolliffe, I.T., 2002. Principal Component Analysis,
Springer, 2nd edition.
Kamrunnahar, M., Dias, N.S., Schiff, S.J. Model-based
Responses and Features in Brain Computer Interfaces.
In Proc. 30th IEEE EMBC 20-25 Aug 2008, pp. 4482-4485.
Lai, C., Reinders, M.J.T., Wessels, L., 2006. Random
subspace method for multivariate feature selection.
Pattern Recognition Letters, vol. 27, pp. 1067-1076.
Schiff, S.J., Sauer, T., Kumar, R., Weinstein, S.L., 2005.
Neuronal spatiotemporal pattern discrimination: The
dynamical evolution of seizures. NeuroImage, vol. 28,
pp. 1043-1055.
Wolpaw, J.R., McFarland, D.J., Vaughan, T.M., 2000.
Brain–Computer Interface Research at the Wadsworth
Center. IEEE Transactions on Rehabilitation
Engineering, vol. 8, no. 2, pp. 222-226.
Yu, L., Liu, H., 2004. Efficient Feature Selection via
Analysis of Relevance and Redundancy. Journal of
Machine Learning Research, vol. 5, pp. 1205-1224.
APPENDIX
This section details the dimensionality reduction
implemented in the proposed variable selection
algorithm. The algorithm presented at the end of
this section lists every step of this
procedure.
On line 1 of the algorithm, Y is decomposed
through SVD into 3 matrices: $U_{n \times p}$ (component
orthogonal matrix), $S_{p \times p}$ (singular value diagonal
matrix) and $V_{p \times p}$ (eigenvector orthogonal matrix).
The eigenvalue vector $\lambda$ is calculated on line 2 as
the diagonal of $S^2$. The AGV score is calculated for
every PC through lines 4 to 6.
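The decomposition on lines 1 and 2 can be sketched with NumPy as follows. This is an illustrative sketch, not the authors' code: it assumes n ≥ p so that the economy-size SVD yields the stated matrix shapes, and it assumes the columns of Y are mean-centred beforehand (the centring step, the random data and all variable names are assumptions). The AGV score is not computed here.

```python
import numpy as np

def svd_decompose(Y):
    """Decompose Y (n x p, n >= p) as Y = U S V^T and return eigenvalues.

    Returns U (n x p), S (p x p, diagonal), V (p x p) and lam, the
    diagonal of S^2.
    """
    # Economy-size SVD: U is n x p, s holds p singular values, Vt is p x p
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    S = np.diag(s)        # singular value diagonal matrix
    V = Vt.T              # columns are the eigenvectors (PC directions)
    lam = s ** 2          # eigenvalue vector: diagonal of S^2
    return U, S, V, lam

# Illustration on random data with n=80 samples and p=40 variables
rng = np.random.default_rng(0)
Y = rng.normal(size=(80, 40))
Yc = Y - Y.mean(axis=0)   # assumption: columns are centred before the SVD

U, S, V, lam = svd_decompose(Yc)
print(U.shape, S.shape, V.shape, lam.shape)
```

NumPy returns the singular values in descending order, so `lam` is already sorted from the largest to the smallest eigenvalue.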
Since both groups to discriminate are assumed to
have the same covariance matrix, the
pooled covariance matrix should be calculated as the
within-group covariance matrix $\Psi_{Within}$:
VARIABLE SUBSET SELECTION FOR BRAIN-COMPUTER INTERFACE - PCA-based Dimensionality Reduction and
Feature Selection