(if any) was kept constant. Driving stages with a secondary task were labeled as high workload conditions, while driving stages without a secondary task were labeled as low workload conditions. There are three different types of secondary tasks: a visual search task at two difficulty levels (Visual1 and Visual2), a math task at two difficulty levels (Divide1 and Divide2), and a game of Tic Tac Toe against the computer (TTT). All secondary tasks were presented
on a monitor to the right of the subject and operated
by keyboard within easy reach. Each condition (driv-
ing only and driving with each of the secondary tasks)
was recorded twice for six minutes each. The order of
tasks was randomized between sessions to eliminate
order effects. During each task, EEG was recorded
using an Emotiv EPOC device. This wireless device
offers a fixed layout of 14 saline electrodes sampled at 128 Hz. It can be fully set up in less than two minutes
by the user without help, which constitutes a benefit
for our aim of preparation-free workload recognition
compared to classic EEG caps. The user was told to concentrate on the task but received no further instructions (e.g., on artifact avoidance), so that data was recorded under realistic conditions. In total, we collected 10 sessions
with 60 minutes of EEG data each, resulting in a total
corpus of 600 minutes of usable data.
The baseline system for session-dependent work-
load recognition is described in (Heger et al., 2010).
From each window of 2 seconds, it extracts 28 spec-
tral features in the range from 4 to 45 Hz for each
electrode. The window is shifted over the data stream with an overlap of 1.5 s, yielding one data point every 0.5 s. Before the spectral feature extraction,
we perform an automatic removal of eyeblink arti-
facts based on Independent Component Analysis as
described in (Jarvis et al., 2011) and a Canonical Cor-
relation Analysis (Clercq et al., 2006) to remove EMG
artifacts. Two classes of low and high workload are
discriminated by a binary classifier based on Linear
Discriminant Analysis (LDA). Results are smoothed
over 3 consecutive data-points to get a more reliable
workload estimate.
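The windowing and smoothing described above can be sketched as follows; the window length, shift, band range, feature counts, and three-point smoothing come from the text, while the Hann windowing, log band powers, and all concrete demo values are our own assumptions rather than details of (Heger et al., 2010):

```python
import numpy as np

FS = 128          # Emotiv EPOC sampling rate (Hz)
WIN = 2 * FS      # 2 s analysis window
STEP = FS // 2    # 0.5 s shift, i.e. 1.5 s overlap

def spectral_features(channel, fs=FS, fmin=4, fmax=45, n_bands=28):
    """28 log band-power features in the 4-45 Hz range for one channel window."""
    spec = np.abs(np.fft.rfft(channel * np.hanning(len(channel)))) ** 2
    freqs = np.fft.rfftfreq(len(channel), d=1.0 / fs)
    edges = np.linspace(fmin, fmax, n_bands + 1)
    return np.array([np.log(spec[(freqs >= lo) & (freqs < hi)].mean() + 1e-12)
                     for lo, hi in zip(edges[:-1], edges[1:])])

def extract(stream):
    """Slide a 2 s window over a (channels, samples) stream in 0.5 s steps."""
    return np.array([
        np.concatenate([spectral_features(ch)
                        for ch in stream[:, start:start + WIN]])
        for start in range(0, stream.shape[1] - WIN + 1, STEP)])

def smooth(predictions, k=3):
    """Majority vote over k consecutive binary classifier outputs."""
    return np.array([np.round(predictions[max(0, i - k + 1):i + 1].mean())
                     for i in range(len(predictions))])

rng = np.random.default_rng(0)
eeg = rng.standard_normal((14, 10 * FS))   # 10 s of synthetic 14-channel EEG
X = extract(eeg)                           # one 392-dim vector per 0.5 s
```

On 10 s of input this yields 17 feature vectors of dimension 14 × 28 = 392; in the actual system each vector would be passed to the LDA classifier, and `smooth` would then be applied to the classifier outputs.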
To achieve session independence for this baseline
system, we follow two main approaches:
Session Adaptation: One way to handle differ-
ences between trained models and testing data is to
actively adapt the classification model to the condi-
tions of the current session. (Vidaurre et al., 2008)
propose an unsupervised adaptation of joint statistics
for both classes. The update of the selected method
modifies the joint class mean µ(t) for a newly calcu-
lated feature vector x(t) as follows:
µ(t) = (1 − UC) · µ(t − 1) + UC · x(t)        (1)
The joint mean is used to correct the bias in the
feature distribution of the testing session. In for-
mula 1, UC is the update coefficient that determines
the strength of the update. Choosing a suitable update coefficient is crucial for this method. The approach in (Vidaurre et al., 2008) was
designed to account for non-stationarities within one
session and therefore uses a continuous update for the
whole data stream. This seems suboptimal for adaptation between training sessions and testing sessions (which we assume to be internally stable due to their length of only a few minutes) for several reasons: First, a user expects a working system after a calibration phase of minimal duration, and an update coefficient optimized to track slow changes in the signal characteristics may produce overly timid updates for inter-session adaptation. Second, when the optimal UC is estimated and evaluated on sessions of a fixed length, it may be a suboptimal choice for sessions of very different duration. Therefore, we perform adaptation only on the first feature vectors of a session and keep the model constant afterwards. We call the number of feature vectors used for adaptation the adaptation count (AC).
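A minimal sketch of this capped adaptation, applying the update of Equation 1 only to the first AC feature vectors: the initialisation of the mean from the training data is assumed, and the UC, AC, and demo values are illustrative, not the tuned settings.

```python
import numpy as np

def adapt_session(features, mu_train, uc=0.05, ac=50):
    """Joint-mean adaptation (Equation 1) on the first AC feature
    vectors of a session; the model is kept constant afterwards.

    mu_train: joint class mean estimated on the training sessions (assumed).
    uc, ac:   illustrative values, not the paper's tuned settings.
    """
    mu = mu_train.copy()
    corrected = []
    for t, x in enumerate(features):
        if t < ac:                       # adapt only at the session start
            mu = (1 - uc) * mu + uc * x  # Equation 1
        corrected.append(x - mu)         # remove the estimated session bias
    return np.array(corrected)

# Toy demo: a test session whose features carry a constant bias of 3
# relative to the training mean; adaptation should absorb most of it.
rng = np.random.default_rng(0)
train_mean = np.zeros(4)
session = rng.standard_normal((200, 4)) + 3.0
out = adapt_session(session, train_mean, uc=0.1, ac=100)
```

After the first 100 vectors the estimated mean has converged near the session bias, so the remaining corrected features are approximately zero-mean, which is what the bias correction of the feature distribution aims for.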
Robust Feature Accumulation: The quality of
session-independent recognition highly depends on
the quality and variety of the available training data.
A large training set can cover a wide range of possi-
ble feature distributions and account for variability in
the test set. Therefore, we can expect a more reliable
recognition with multiple training sessions than with a
limited training set. Of course, acquiring such a training set for each user conflicts with the goal of minimizing the effort of data collection, i.e., we have to weigh the benefit of additional training sessions against their cost and also have to find ways to extract reliable models from smaller training sets. Each recorded stage in a session is 6 minutes long, yielding 1,440 training samples per session for estimating a square covariance matrix of 392 dimensions (14 channels with 28 features each), i.e., more than 150,000 coefficients. This mismatch may result
in overfitted models which are tuned towards the spe-
cific conditions of the training data but which do not
generalize to other sessions. To mitigate this prob-
lem, we employ feature selection which tries to iden-
tify the most relevant features for a classification task.
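Feature selection of this kind can be sketched as scoring each feature by its individual relevance to the class label and keeping only the top candidates. A minimal mutual-information-based example follows; the histogram-based MI estimator and all parameter values are our assumptions, not the exact procedure used in the system.

```python
import numpy as np

def mutual_information(feature, labels, n_bins=10):
    """Plug-in estimate of I(feature; label) via histogram discretisation.

    The binning scheme is an illustrative assumption.
    """
    bins = np.histogram_bin_edges(feature, bins=n_bins)
    f = np.digitize(feature, bins[1:-1])
    mi = 0.0
    for fv in np.unique(f):
        for lv in np.unique(labels):
            p_joint = np.mean((f == fv) & (labels == lv))
            if p_joint > 0:
                p_f = np.mean(f == fv)
                p_l = np.mean(labels == lv)
                mi += p_joint * np.log2(p_joint / (p_f * p_l))
    return mi

def select_best_individual(X, y, k=4):
    """Rank each feature by its individual MI with the class, keep the top k."""
    scores = np.array([mutual_information(X[:, j], y) for j in range(X.shape[1])])
    return np.argsort(scores)[::-1][:k]

# Toy demo: feature 0 carries the class label, the rest are noise.
rng = np.random.default_rng(1)
y = rng.integers(0, 2, 400)
X = rng.standard_normal((400, 6))
X[:, 0] = y + 0.1 * rng.standard_normal(400)
selected = select_best_individual(X, y, k=2)
```

In the demo, feature 0 has close to one bit of mutual information with the class and is reliably ranked first, while the noise features score near zero.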
We employ a wrapper approach based on Mutual In-
formation (MI) as described by (Ang et al., 2008).
They describe the Mutual Information based Best In-
dividual Feature (MIBIF) algorithm, a feature selec-
tion approach based on a high relevance criterion to
reduce the feature space dimensionality. It selects the