shows more spontaneity, whereas the second set of
statements is produced with noticeable hesitation,
and the speaker introduces more pauses and a larger
number of fillers, as this second opinion has to be
somehow "fabricated". The fillers consist of
emissions of long vowels, mainly /ah/, /uh/ and /eh/,
which have been found very useful for the analysis
of the stress manifested in the stiffness of the vocal
fold body and cover. In Spanish, fillers such as /eh/
occur more frequently than /ah/.
Therefore, the analysis concentrates on /eh/ rather
than /ah/. The database has been validated with /ah/,
but as the articulation patterns are removed during
vocal tract inversion, both types of emissions may be
considered compatible as far as stiffness estimates
are concerned. The four study cases presented
therefore show comparisons of emissions of /eh/, as
given in Figures 6 to 9. They show the
analysis of a filler /eh/ from a male and a female
speaker expressing spontaneous (MSS/FSS) and
opposite-to-spontaneous (MSO/FSO) opinions. The
evolution of vocal fold body stiffness in a 0.2 s
segment (red) and the same trace low-pass filtered
and unbiased (blue) are given in the top left
panel. The statistical distribution (box plot) of the
unbiased vocal fold body stiffness is given in the
upper right. The evolution of the three cyclicality
estimates and their statistical distributions (medians
given as figures) are shown in the bottom left and
bottom right panels, respectively.
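The processing behind the blue trace and the box plot can be sketched as follows. This is only an illustrative sketch, not the method used in the study: it assumes that "unbiased" means removal of the mean, it replaces the unspecified low-pass filter by a simple moving average, and the stiffness trace itself is synthetic. The function names (moving_average, unbias, box_plot_summary) are hypothetical.

```python
import math
import statistics


def moving_average(trace, width):
    """Crude low-pass filter: centred moving average, shrinking at the edges."""
    half = width // 2
    smoothed = []
    for i in range(len(trace)):
        lo = max(0, i - half)
        hi = min(len(trace), i + half + 1)
        smoothed.append(sum(trace[lo:hi]) / (hi - lo))
    return smoothed


def unbias(trace):
    """Remove the mean so that the trace fluctuates around zero (assumed meaning of 'unbiased')."""
    mean = statistics.fmean(trace)
    return [x - mean for x in trace]


def box_plot_summary(trace):
    """Quartiles and median, i.e. the values a box plot displays."""
    q1, median, q3 = statistics.quantiles(trace, n=4)
    return {"q1": q1, "median": median, "q3": q3}


# Synthetic stand-in for a body stiffness trace (arbitrary units):
# a slow oscillation plus a fast alternating component.
n_samples = 40
raw = [
    20000.0
    + 500.0 * math.sin(2 * math.pi * k / 20)
    + (100.0 if k % 2 else -100.0)
    for k in range(n_samples)
]

filtered = unbias(moving_average(raw, width=5))
summary = box_plot_summary(filtered)
print(summary)
```

The moving average suppresses the fast alternating component, and the mean removal centres the trace on zero, so that the box plot describes only the slow fluctuation of the stiffness.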
The results given in the above four figures are
also summarized in Table 1. Several facts should be
pointed out in the figures corresponding to MSS
and MSO. The first one is that the body stiffness
seems to be less stable and shows a wider decay in
the spontaneous utterance in the male case. This is
also confirmed by the standard deviation of this
parameter (σKb) in Table 1. This could be
associated with a less stressed phonation condition
when the speaker is spontaneous than when he has to
"fabricate" a fictitious opinion, although the mean
tension of the vocal fold (μKb) remains almost the
same. In the female case studied the situation is
different as far as the body stiffness is concerned,
but if the cover stiffness is examined, the larger
dispersion (σKc) for the spontaneous MSS and FSS
compared with the non-spontaneous MSO and FSO
is evident for both genders. Comparing the
cyclicality parameters in the spontaneous and
non-spontaneous cases, the first one (c1) shows a decay
towards -1 that is almost twice as large in the female
(-0.8 to -0.92) as in the male case (-0.8 to -0.86),
whereas c2 moves down as well in both cases (-0.01
to -0.18 for the male speaker, and 0.1 to -0.2 for the
female speaker). The third cyclicality parameter
does not show such a clear orientation, although it is
supposed to be larger in both cases for the
non-spontaneous behaviour. It is interesting to note
that the tendency of the first cyclicality coefficient
towards the lower limit is also present in speakers
affected by certain neurological diseases when tremor
is present (spasmodic dysphonia, Parkinson's disease;
see Gomez et al. 2011). The fact that cover stiffness
dispersion is larger in spontaneous
phonation could be interpreted as the speaker
letting the vocal folds go looser under less stressed
conditions (spontaneous phonation) than under a
self-controlled and more stressed situation
(non-spontaneous phonation). A second observation is
that the average stiffness is not very much altered
from one situation to the other, but its dispersion is
clearly different (lower under non-spontaneous
conditions), and that the first two cyclicality
parameters also show a clear difference between
the two conditions. The reasons why these variations
are larger in the female case need not be
related to gender, but possibly to the specific
idiosyncratic behaviour of the speaker, although this
issue is worth assessing with a larger database
of spontaneous vs. non-spontaneous utterances
produced by speakers of both genders.
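The contrast described above (similar mean stiffness, different dispersion) can be illustrated with a minimal sketch. The sample values below are synthetic and in arbitrary units; they are not taken from Table 1, and the helper names (stiffness_summary, dispersion_ratio) are hypothetical.

```python
import statistics


def stiffness_summary(samples):
    """Return (mean, standard deviation) of a set of stiffness estimates."""
    return statistics.fmean(samples), statistics.stdev(samples)


def dispersion_ratio(spontaneous, non_spontaneous):
    """Ratio of standard deviations; values above 1 mean larger dispersion when spontaneous."""
    _, sd_spontaneous = stiffness_summary(spontaneous)
    _, sd_non_spontaneous = stiffness_summary(non_spontaneous)
    return sd_spontaneous / sd_non_spontaneous


# Synthetic cover stiffness estimates (arbitrary units), for illustration only.
spontaneous = [9.0, 11.5, 8.2, 12.1, 9.2]
non_spontaneous = [9.8, 10.3, 9.9, 10.2, 9.8]

mean_s, sd_s = stiffness_summary(spontaneous)
mean_n, sd_n = stiffness_summary(non_spontaneous)
ratio = dispersion_ratio(spontaneous, non_spontaneous)
print(mean_s, sd_s, mean_n, sd_n, ratio)
```

In this synthetic example both conditions share the same mean, so only the dispersion ratio separates them, which mirrors the kind of differential mark discussed in the text.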
5 CONCLUSIONS
A first observation is that the chain model from
voice to vocal fold tension estimation and to the
cyclicality parameters of vocal fold stiffness for
normophonic and organic pathology-affected
speakers under neutral emotional conditions seems to
behave in accordance with the main assumptions
formulated. The statistical distributions for male and
female speakers not affected by neurological
diseases show a certain coherence, and are well enough
defined to allow contrastive studies to be carried out.
From this observation it may also be concluded
that the estimation methodology for the glottal
source, its biomechanical correlates, and the
cyclicality parameters seems to be robust enough to
extend the study to larger databases of speakers
showing emotionally distorted phonation. Regarding the
detection of emotions in voice, it seems that the
contrastive study of spontaneous vs. non-spontaneous
speech may offer differential marks in
the dispersion of vocal fold body stiffness and,
especially, cover stiffness, and in the cyclicality
parameters derived
from body stiffness. Obviously the study is far from
BIOSIGNALS 2013 - International Conference on Bio-inspired Systems and Signal Processing