Prerequisites for Affective Signal Processing (ASP) – Part IV

Egon L. van den Broek, Marjolein D. van der Zwaag, Jennifer A. Healey, Joris H. Janssen (2,4), and Joyce H. D. M. Westerink

(1) Human-Centered Computing Consultancy, Austria

(2) User Experience Group, Philips Research Europe, High Tech Campus 34, 5656 AE Eindhoven, The Netherlands

(3) Future Technology Research, Intel Labs Santa Clara, Juliette Lane SC12-319, Santa Clara CA 95054, U.S.A.

(4) Department of Human Technology Interaction, Eindhoven University of Technology, P.O. Box 513, 6500 MB Eindhoven, The Netherlands

Abstract. In [1–3], a series of prerequisites for affective signal processing (ASP) was defined: validation (e.g., mapping of constructs on signals), triangulation, a physiology-driven approach, contributions of the signal processing community, identification of users, theoretical specification, integration of biosignals, and physical characteristics. This paper defines three additional prerequisites: historical perspective, temporal construction, and real-world baselines.

1 Introduction

In his book The emotion machine: Commonsense thinking, artificial intelligence, and the future of the human mind, Minsky [4] stated: “... emotion is one of those suitcase-like words that we use to conceal the complexity of very large ranges of different things whose relations we don’t yet comprehend.” Five pages later, he suggests replacing “... old questions like, ‘What sorts of things are emotions and thoughts?’ by more constructive ones like, ‘What processes does each emotion involve?’ and ‘How could machines perform such processes?’” Affective computing (AC) aims to answer these questions by processing signals that correlate with emotions: affective signal processing (ASP).

ASP can be based on (a combination of) biosignals, movement analysis, computer vision, and speech processing. However, the techniques other than biosignals have major disadvantages [1–3]. In contrast, such issues have been resolved for biosignals in recent years: it is currently easy to obtain high-fidelity, cheap, and unobtrusive biosignal recordings; e.g., see [5]. Moreover, the recording devices can be easily integrated in various products [6]. Therefore, this paper focuses on biosignals. For an overview of the most commonly used biosignals and their features, we refer to [1].

L. van den Broek E., D. van der Zwaag M., A. Healey J., H. Janssen J. and H. D. M. Westerink J. (2010).

Prerequisites for Affective Signal Processing (ASP) – Part IV.

In Proceedings of the 1st International Workshop on Bio-inspired Human-Machine Interfaces and Healthcare Applications, pages 51-58

DOI: 10.5220/0002813200510058

 SciTePress

This prerequisites paper is designed to discuss unsolved issues related to ASP and to introduce a framework for future research. It is not designed as a paper on novel methods in signal processing, but rather on the specific issues of applying those methods to the problem of ASP. A particular focus is on the problem of ASP in the real world, with long-latency signals (e.g., electrodermal activity; EDA) and affective responses that are ambiguously defined in time, that often depend on previous events, and that are, therefore, neither linear nor time invariant. Much of (traditional) signal processing relies on the assumption of linear time invariance. Real affective responses do not fit this description. Consequently, ASP requires its own set of prerequisites, as denoted in this paper and in the other prerequisites papers [1–3].

For AC, a plethora of classifiers is used as part of ASP. The classification performances are hard to compare, since the emotion classes used are typically defined in different ways. Additionally, the number of emotion classes to be discriminated is small; it ranges from 2 to 6. Nevertheless, the results lag behind those of other classification problems. With AC, recognition rates well below 90% are common, whereas in most other pattern recognition problems, recognition rates above 90%, and often above 95%, are reported. This illustrates AC’s complex nature and the need for a comprehensive review of the prerequisites involved.

To force a breakthrough in results on AC we propose a set of prerequisites for ASP,

before starting with AC in practice. The ﬁrst three parts of these prerequisites were

introduced in [1–3]. In the next section, the fourth part is introduced. Together, these

prerequisites should form the foundation for more successful ASP and AC. We end this

paper with a brief conclusion.

2 Prerequisites – Part IV

In [1–3], the following prerequisites for ASP were introduced: validity, triangulation, a

physiology-driven approach, contributions from signal processing, user identiﬁcation,

theoretical speciﬁcation, integration of biosignals, and physical characteristics. While

each of these is still of the utmost importance for ASP, we will now denote three addi-

tional ones: historical perspective, temporal construction, and real-world baselines.

2.1 History: Lessons to be Learned and Experiences to Remember

Centuries ago, poets and ancient philosophers already mentioned the relation between emotions and physiological reactions, as expressed through biosignals. This resulted in a plethora of definitions, almost impossible to list, which illustrates the complexity of the concept emotion; cf. [4].

Although much knowledge on emotions has been gained over the last centuries, researchers tend to ignore it to a large extent and stick to some relatively recent theories; e.g., the valence and arousal model or the approach avoidance model. This holds in particular for ASP and AC, where an engineering approach is dominant and a theoretical framework is considered of lesser importance [2]. Consequently, most engineering approaches apply the valence-arousal model as a default option, without considering other possibilities.

It is far beyond the scope of this paper to provide a complete overview of all literature relevant for ASP and AC. For such an overview, we refer to the various handbooks and review papers on emotions, affective sciences, and affective neuroscience; e.g., [7–9]. In this section, we will touch on some of the major works on emotion research, which originate from medicine, biology, physiology, and psychology.

Let us start with one of the earliest works on biosignals: De l’Électricité du corps humain by M. l’Abbé Bertholon (1780) [10], who already described human biosignals. One century later, Darwin (1872) published his book The Expression of Emotions in Man and Animals [8]. Subsequently, independently of each other, William James and C. G. Lange presented their theories on emotions, which were remarkably similar [8]. Consequently, their theories have been merged and were baptized the James-Lange theory.

In a nutshell, the James-Lange theory argues that the perception of our own biosignals is the emotion. Consequently, no emotions can be experienced without these biosignals. Two decades after the publication of James’ theory, it was already seriously challenged by Cannon [11,12] and Bard [13,14]. They emphasized the role of subcortical structures (e.g., the thalamus, the hypothalamus, and the amygdala) in experiencing emotions. Their rebuttal of the James-Lange theory was founded on five notions:

1. Compared to a normal situation, experienced emotions are similar when biosignals are omitted; e.g., as with the transection of the spinal cord and vagus nerve.

2. Similar biosignals emerge with all emotions. So, these signals cannot cause distinct emotions.

3. The body’s internal organs have fewer sensory nerves than other structures. Hence, people are unaware of their possible biosignals.

4. Generally, biosignals have a long latency period, compared to the time in which emotional responses are expressed.

5. Drugs that trigger biosignals to emerge do not necessarily trigger emotions in parallel.

We will now address each of Cannon’s notions from the perspective of ASP. As will become apparent, these notions are still of importance for current ASP. To the authors’ knowledge, the first case that illustrated the weaknesses of both theories was that of a patient with a lesion, as denoted in Cannon’s first notion. This patient reported: “Sometimes I act angry when I see some injustice. I yell and cuss and raise hell, because if you don’t do it sometimes, I learned people will take advantage of you, but it just doesn’t have the heat to it that it used to. It’s a mental kind of anger” (p. 151) [15]. Moreover, this case clearly illustrated the use of such special cases, as is denoted in [2].

The second notion of the Cannon-Bard theory strikes at the essence of ASP. It would imply that the quest of affective computing is doomed to fail. According to Cannon-Bard, ASP is of no use, since no unique sets of biosignals exist that map to distinct emotions. Luckily, nowadays, this statement is judged to be too coarse [8]. However, it is generally acknowledged that it is very hard to apply ASP successfully [7]. So, (at least) to a large extent, Cannon was right.

It was confirmed that the number of sensory nerves differs between distinct structures in the human body (Cannon’s notion 3). So, people’s physiological structures indeed determine variations in their emotional sensitivity. To make ASP even more challenging, cross-cultural and ethnic differences exist in people’s patterns of biosignals, as was already shown by [16].

The fourth notion concerns the latency period of biosignals, which Cannon denoted

as being ‘long’. In the next section we address this problem.

The ﬁfth and last notion of Cannon is one that is not addressed so far. It goes be-

yond biosignals since it concerns the neurochemical aspects of emotions. Although this

component of human physiology can indeed have a signiﬁcant inﬂuence on experienced

emotions, this falls far beyond the scope of this paper.

It should be noted that the current general opinion among neuroscientists is that the truth lies somewhere between the theories of James-Lange and Cannon-Bard [8]. However, the various relations between Cannon’s notions and the set of prerequisites illustrate that these notions, although a century old, are still of interest for current AC and ASP.

2.2 Temporal Construction

There are many temporal aspects in biosignals that should be taken into account in

ASP. These aspects can be categorized in three classes: psychological, physiological,

and signal processing aspects.

The psychological aspect has to do with habituation; in general, every time a stim-

ulus is perceived one’s reaction to it will get smaller. With large delays between the

stimuli, one recovers from the habituation effect. There are several ways of dealing with

this in ASP. One way is to keep track of the moments in which stimuli were present.

This information can then be used to predict how strong the effect of a similar stimulus

will be. Alternatively, in applications where stimuli presentation can be controlled, the

variety of the stimuli can be directed such that habituation effects are canceled.
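The first option, keeping track of stimulus moments to predict the strength of the next response, could be sketched as follows. This is only an illustration: the exponential model, the function name expected_strengths, and the parameters step and recovery_s are assumptions introduced here and are not taken from the literature cited in this paper.

```python
import math


def expected_strengths(stimulus_times_s, step=0.3, recovery_s=600.0):
    """Predict the relative response strength (1.0 = unhabituated) per stimulus.

    Hypothetical model: each presentation raises the habituation level by a
    fraction `step` of the remaining headroom, and habituation decays
    exponentially with time constant `recovery_s` between presentations.
    """
    habituation = 0.0
    last_time = None
    strengths = []
    for t in stimulus_times_s:
        if last_time is not None:
            # Recovery: habituation fades during the delay since the last stimulus.
            habituation *= math.exp(-(t - last_time) / recovery_s)
        strengths.append(1.0 - habituation)
        # Habituation: the reaction to the next similar stimulus gets smaller.
        habituation += (1.0 - habituation) * step
        last_time = t
    return strengths
```

In practice, both parameters would have to be estimated per person and per signal; the defaults above are placeholders.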

The first physiological aspect is that affective signals can be processed in different time windows. For instance, we can look at periods of 30 minutes but also at 10, 30, or 60 seconds. There are many challenges in modeling the temporal aspects of emotion. One is the annotation challenge of determining when the emotion begins and when it ends. Another is the sensor fusion problem of determining how to window individual signals within the emotional event, since different signals have different latencies. In response to a high-arousal event, an instant gasp in respiration and a tensing of muscles may occur; heart rate will then increase in the next few seconds, and EDA should start to rise and may continue to rise for several minutes. Using the same window and offset for all signals would not capture the most salient discriminating features of the experience. So, in general, biosignal features calculated over time windows of different lengths cannot be compared with each other.

That being said, time window selection is often done empirically; i.e., many different time windows are tried and those leading to the best results are used in the final models [6, 17]. Other automatic options include finding the nearest significant local minima or making assumptions about the start time and extent of the emotion; e.g., an average over previous emotions. Another empirical solution is to ask the user to define the window of interest. This can be done through sliders, which for real-world research can be presented on a PDA. However, this approach also has its downside: in general, people’s introspection is not good and you do not want to bother users with these tasks. The physiological response to an emotion may have started well before the person realized that they were in this state, so if a single annotation is used, it will definitely come after the start of the experience. Moreover, in the real world the temporal nature of the reaction to the stimulus is undetermined. A uniform window may not be appropriate.

Author’s note on [16]. Nowadays, this paper would run into resistance, both because it addresses ethnic issues and because of its subjects. Perhaps that is why so little work is done on this topic.

There are also valuable theoretical considerations. Different psychological processes develop over different time scales. On the one hand, emotions lead to very short and fast phasic changes and, thus, require short time windows. On the other hand, changes in mood are more gradual (tonic) and, so, require broader time windows. In general, the time window used should depend on the psychological construct studied. Furthermore, there is always a lag between the psychological change and the physiological change. These lags differ per signal: heart rate changes almost immediately, while skin temperature can take more than a minute to change. Skin conductance is somewhere in between. This shows the need for different time windows for different signals.
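A minimal sketch of such signal-specific windowing is given below. The lag and length values are hypothetical and only mirror the ordering described above (heart rate fast, skin conductance in between, skin temperature slow); the names Window, WINDOWS, extract_window, and mean_feature are likewise introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    lag_s: float     # assumed delay between the event and the signal's response
    length_s: float  # assumed window length used for feature extraction


# Hypothetical settings; real values should be determined per application.
WINDOWS = {
    "heart_rate": Window(lag_s=0.0, length_s=10.0),
    "skin_conductance": Window(lag_s=2.0, length_s=30.0),
    "skin_temperature": Window(lag_s=60.0, length_s=120.0),
}


def extract_window(samples, sample_rate_hz, event_time_s, window):
    """Return the samples in [event + lag, event + lag + length)."""
    start_s = event_time_s + window.lag_s
    start = max(int(round(start_s * sample_rate_hz)), 0)
    stop = max(int(round((start_s + window.length_s) * sample_rate_hz)), 0)
    return samples[start:stop]


def mean_feature(samples):
    """Mean of a window, or None when the window contains no samples."""
    return sum(samples) / len(samples) if samples else None
```

Each signal is thus cut with its own offset and length relative to the same event, instead of with one uniform window.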

A second physiological aspect stems from the idea that physiological activity tends to move to a stable neutral state; i.e., when the physiological level is high, it tends to decrease, whereas, when the physiological level is low, it tends to increase. Hence, the effect of a stimulus on physiology depends on the physiological level before stimulus onset; i.e., the principle of initial values [9]. When you perceive a scary stimulus and your heart rate is at 80, it might increase by 15 beats; however, when your heart rate is at 160, it is unlikely to increase at all. As this is found to be a linear relationship, it can be modeled by linear regression. The first step is to assess the regression line, which is different per feature and person. Next, this regression line can be used to correct each feature by computing its residualized value.
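The two steps can be sketched as follows, assuming that for one person and one feature a set of pre-stimulus levels and the corresponding responses is available. The function names fit_line and residualize are introduced here for illustration, and ordinary least squares is assumed as the regression method.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("pre-stimulus levels must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residualize(pre_levels, responses):
    """Correct each response for the level before stimulus onset.

    The residual is the observed response minus the response predicted
    from the pre-stimulus level by the person- and feature-specific line.
    """
    slope, intercept = fit_line(pre_levels, responses)
    return [y - (slope * x + intercept) for x, y in zip(pre_levels, responses)]
```

A positive residual then indicates a response that is stronger than expected given the initial level, and a negative residual a weaker one.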

A consideration specific to ASP is that emotional responses are likely to comprise a layered response involving components that have different time periods, including: disposition (long term; years), circumstance (days), mood (hours), and emotion (seconds). An accurate model of an individual’s affective response to these varying time influences is difficult to determine; even the totality of influences is difficult to catalog in the real world. Therefore, also for this reason, a major consideration in ASP is choosing a window length appropriate to the type of affective response you are considering.

2.3 Real-world Baselines

Baselining is the process of correcting the biosignal to a standard level that is comparable over users and/or sessions (also called normalization/standardization). Finding an appropriate baseline is both important and difficult for sensors whose readings depend on factors that can easily change on a daily basis; e.g., sensor placement, humidity, temperature, and the use of contact gel [3]. Still, baselining over multiple people or multiple days is required for ASP in order to compare and combine data from these different sources in a meaningful way. Many different approaches to baselining are known in the literature. However, as will be shown, we require affect-specific approaches.

Table 1. Seven methods of using baseline information to normalize the signal. $x$ denotes the original signal and $\tilde{x}$ the corrected signal. $\mu$, $\min$, $\max$, and $\sigma$ are respectively the mean, minimum, maximum, and standard deviation of the baseline. Sources of information: 1: [18]; 3, 5, 6: [19]; and 7: [20].

No.  Correction                                  Remarks
1    $\tilde{x} = x - \mu$                       Standard correction, often used in psychological experiments.
2    $\tilde{x} = x - \min$                      Useful alternative to the first method when there is no relaxation period and a lot of variance in the signal.
3    $\tilde{x} = (x - \mu)/\sigma$              Strong baselining method; works best for continuous signals.
4    $\tilde{x} = (x - \mu)/\mu$
5    $\tilde{x} = (x - \min)/(\max - \min)$      Sensitive to outliers.
6    $\tilde{x} = x/\max$                        Used for skin conductance response features.
7    $\tilde{x} = ((x \times 100)/\mu) - 100$    Used for facial EMG measurements.

With ASP, we have to handle long-term continuous (re-)baselining of biosignals. There exists no guideline on how to apply these methods to continuous physiological data in the real world. An exception to this is [21], which discusses ECG recording in ambulatory settings; however, it does not focus on ASP. In this section, we discuss how to apply known methods from the laboratory to a new situation: continuous ambulatory monitoring in the real world. We also discuss how to apply these methods to affective reactions of varying length and intensity in the presence of noise. Some of the methods commonly used in other types of signal processing, such as “zeroing the mean” and “dividing by the variance”, do not work for long-term physiological records, which is the problem we are trying to bring to light. In the following paragraphs, we will try to give an overview of the different baselining approaches and explain when they are appropriate. We also call for empirical comparisons of different baseline methodologies specific to ASP in the real world, as these are still lacking.

The two main issues with baselining are (1) the selection of a suitable correction method and (2) the selection of a period over which to calculate the parameters of the correction method (the baseline period). The correction methods are summarized in Table 1. Once the baseline is removed, it becomes the new base (or zero) and the original value is lost. Each baseline has different merits. Taking the minimum as baseline comes closest to taking the resting EDA that would normally be used in a laboratory experiment. This is the best method if a consistent minimum seems apparent in all data being combined. The problem is that a minimum must be apparent for each data segment. It is straightforward to eliminate point outliers such as those at 3.7 hours and 3.9 hours and find a more robust minimum for the baseline. Another often used method for continuous signals like EDA and skin temperature is called standardization (method 3 in Table 1) [19]. This is probably the most powerful correction method and is applied very often. It corrects not only for the baseline level but also for the variation in the signal, making it more robust. Other correction methods are tailored to specific features; e.g., the amplitude of skin conductance responses is often corrected by dividing by the maximum amplitude. Taken together, different signals and situations require different correction methods, which should be chosen carefully.
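For reference, the seven corrections of Table 1 can be written out as a small sketch. The names BaselineStats, baseline_stats, and correct are introduced here for illustration, and the use of the population standard deviation is an assumption, since the table does not specify which estimator is meant.

```python
from statistics import mean, pstdev
from typing import NamedTuple


class BaselineStats(NamedTuple):
    mu: float     # mean of the baseline period
    lo: float     # minimum of the baseline period
    hi: float     # maximum of the baseline period
    sigma: float  # standard deviation (population estimator assumed)


def baseline_stats(baseline):
    """Compute the baseline parameters used by the corrections in Table 1."""
    return BaselineStats(mean(baseline), min(baseline), max(baseline), pstdev(baseline))


def correct(x, stats, method):
    """Apply correction `method` (1-7, numbered as in Table 1) to value x."""
    if method == 1:
        return x - stats.mu
    if method == 2:
        return x - stats.lo
    if method == 3:
        return (x - stats.mu) / stats.sigma
    if method == 4:
        return (x - stats.mu) / stats.mu
    if method == 5:
        return (x - stats.lo) / (stats.hi - stats.lo)
    if method == 6:
        return x / stats.hi
    if method == 7:
        return (x * 100) / stats.mu - 100
    raise ValueError("method must be an integer from 1 to 7")
```

Note that methods 3 to 7 divide by a baseline statistic and are therefore undefined for a flat or zero-valued baseline; a real pipeline would need to guard against that case.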

The second issue in baselining is the selection of an appropriate time window over which the parameters for the correction are calculated. For short-term experiments, a single baseline period is usually sufficient. However, when monitoring continuously, the baseline may have to be re-evaluated with greater frequency. The challenge here is to find a good strategy for dividing the signal into segments over which the baseline should be re-calculated. A simple solution is to use a sliding window; e.g., where the last 30 minutes are taken into account. In this case, it seems obvious that the segments should be considered independently, since the time period between the two is long and the electrodes may have fallen off and been re-applied. To be able to conduct proper normalization when signal loss is shorter than the baselining window, the period right before the signal loss can, for example, be used to complement the baseline window of the current signal. However, in general, data segmentation has not had much attention in ASP and long-term continuous (re-)baselining of biosignals is still an open problem.
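One possible reading of this strategy is sketched below, using the mean correction (method 1 in Table 1) over a sliding window. The window is expressed as a number of samples, so that after a short signal loss the samples recorded before the loss still complement the baseline, whereas a long loss starts a new, independent segment. The function name rebaseline and the parameters window_n and reset_gap_s are assumptions introduced here.

```python
from collections import deque


def rebaseline(samples, window_n, reset_gap_s):
    """Subtract a sliding-window mean from a time-ordered signal.

    samples:     iterable of (time_s, value) pairs in increasing time order
    window_n:    number of most recent samples forming the baseline window
    reset_gap_s: a gap longer than this starts an independent segment
    """
    history = deque(maxlen=window_n)  # the oldest sample drops out automatically
    previous_time = None
    corrected = []
    for time_s, value in samples:
        if previous_time is not None and time_s - previous_time > reset_gap_s:
            # Long signal loss (e.g., electrodes re-applied): discard the old baseline.
            history.clear()
        history.append(value)
        baseline = sum(history) / len(history)
        corrected.append((time_s, value - baseline))
        previous_time = time_s
    return corrected
```

For a signal sampled at 1 Hz, a 30-minute window would correspond to window_n=1800; choosing reset_gap_s equal to the window duration is one option, but this choice has not been validated empirically.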

As with all pattern recognition pipelines, baselining (or normalization) is of utmost

importance. Most efforts towards AC and ASP have not paid much attention to this.

We hope that this prerequisite is a start for the development of more sophisticated algo-

rithms that can deal with the difﬁcult problems of data segmentation, as they have been

dealt with in other research ﬁelds like computer vision.

3 Conclusions

This paper explains the importance of prerequisites specifically for ASP and introduces the fourth part of a series of such prerequisites. Three prerequisites are introduced: a foundation in historical perspective, adequate temporal construction, and well-chosen real-world baselines. These are complementary to those presented in [1–3]: validity, triangulation, the physiology-driven approach, contributions of signal processing, identification of users, theoretical specification, integration of biosignals, and physical characteristics.

The review and the prerequisites both illustrate and explain the complexity of ASP and its limited progress. Therefore, we advise incorporating these prerequisites for successful ASP, instead of running forward and ignoring the problems encountered in previous studies. We hope that the prerequisites can contribute to or even guide future research in ASP.

References

1. van den Broek, E.L., Janssen, J.H., Westerink, J.H.D.M., Healey, J.A.: Prerequisites for Affective Signal Processing (ASP). In Encarnação, P., Veloso, A., eds.: Biosignals 2009: Proceedings of the International Conference on Bio-Inspired Systems and Signal Processing, Porto – Portugal (2009) 426–433

2. van den Broek, E.L., Janssen, J.H., Healey, J.A., van der Zwaag, M.D.: Prerequisites for Affective Signal Processing (ASP) – Part II. In: Biosignals 2010: Proceedings of the International Conference on Bio-Inspired Systems and Signal Processing, Valencia – Spain (2010) [in press]

3. van den Broek, E.L., Janssen, J.H., van der Zwaag, M.D., Healey, J.A.: Prerequisites for Affective Signal Processing (ASP) – Part III. In: Biosignals 2010: Proceedings of the International Conference on Bio-Inspired Systems and Signal Processing, Valencia – Spain (2010) [in press]

4. Minsky, M.: The Emotion Machine: Commonsense Thinking, Artiﬁcial Intelligence, and the

Future of the Human Mind. New York, NY, USA: Simon & Schuster (2006)

5. Pantelopoulos, A., Bourbakis, N.G.: A survey on wearable sensor-based systems for health

monitoring and prognosis. IEEE Transactions on Systems, Man, and Cybernetics, Part C:

Applications and Reviews 40 (2010) 1–12

6. van den Broek, E.L., Westerink, J.H.D.M.: Considerations for emotion-aware consumer

products. Applied Ergonomics 40 (2009) 1055–1064

7. Boehner, K., DePaula, R., Dourish, P., Sengers, P.: How emotion is made and measured.

International Journal of Human-Computer Studies 65 (2007) 275–291

8. Dalgleish, T., Dunn, B.D., Mobbs, D.: Affective neuroscience: Past, present, and future. Emotion Review 1 (2009) 355–368

9. Davidson, R.J., Scherer, K.R., Hill Goldsmith, H.: Handbook of affective sciences. New

York, NY, USA: Oxford University Press (2003)

10. l’Abbé Bertholon, M.: De l’Électricité du corps humain. Lyon, France: Tome Premiere (1780)

11. Cannon, W.B.: Bodily changes in pain, hunger, fear and rage: An account of recent re-

searches into the function of emotional excitement. New York, NY, USA: D. Appleton and

Company (1915)

12. Cannon, W.B.: The James-Lange theory of emotion: A critical examination and an alternative theory. American Journal of Psychology 39 (1927) 106–124

13. Bard, P.: On emotional expression after decortication with some remarks on certain theoret-

ical views, Part I. Psychological Review 41 (1934) 309–329

14. Bard, P.: On emotional expression after decortication with some remarks on certain theoret-

ical views, Part II. Psychological Review 41 (1934) 424–449

15. Hohmann, G.W.: Some effects of spinal cord lesions on experienced emotional feelings. Psychophysiology 3 (1966) 143–156

16. Sternbach, R.A., Tursky, B.: Ethnic differences among housewives in psychophysical and

skin potential responses to electric shock. Psychophysiology 1 (1965) 241–246

17. Kim, J., André, E.: Emotion recognition based on physiological changes in music listening. IEEE Transactions on Pattern Analysis and Machine Intelligence 30 (2008) 2067–2083

18. Llabre, M.M., Spitzer, S.B., Saab, P.G., Ironson, G.H., Schneiderman, N.: The reliability

and speciﬁcity of delta versus residualized change as a measure of cardiovascular reactivity

to behavioral challenges. Psychophysiology 28 (1991) 701–711

19. Boucsein, W.: Electrodermal activity. New York, NY, USA: Plenum Press (1992)

20. Fridlund, A.J., Cacioppo, J.T.: Guidelines for human electromyographic research. Psy-

chophysiology 23 (1986) 567–589

21. Chaudhuri, S., Pawar, T.D., Duttagupta, S.: Ambulation analysis in wearable ECG. New

York, NY, USA: Springer Science+Business Media (2009)