a potentially powerful means for understanding users’
needs, problems, and desires.
The tool, written using the Praat scripting lan-
guage, relies on two sets of prosodic features and
two LDA-based classifiers. The experiments, per-
formed on a custom corpus of tagged audio record-
ings, showed encouraging results: for classification
of emotions, we obtained a value of about 71% for
average Pr, average Re, average F1, and Ac, with
K=0.64; for classification of communication styles,
we obtained a value of about 86% for average Pr,
average Re, average F1, and Ac, with K=0.78.
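The reported figures are macro-averaged precision, recall, and F1, plus accuracy and Cohen's kappa. As a minimal illustration of how these five metrics relate (using scikit-learn on toy labels, not the paper's LDA classifiers, prosodic features, or corpus), one might compute:

```python
# Illustrative sketch only: toy labels stand in for the real corpus.
from sklearn.metrics import (accuracy_score, cohen_kappa_score,
                             precision_recall_fscore_support)

y_true = ["joy", "anger", "sadness", "joy", "anger", "sadness"]
y_pred = ["joy", "anger", "joy", "joy", "sadness", "sadness"]

# Macro averaging: compute Pr/Re/F1 per class, then take the unweighted mean.
pr, re, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0)
ac = accuracy_score(y_true, y_pred)    # overall accuracy (Ac)
k = cohen_kappa_score(y_true, y_pred)  # chance-corrected agreement (K)
print(f"Pr={pr:.2f} Re={re:.2f} F1={f1:.2f} Ac={ac:.2f} K={k:.2f}")
```

Unlike raw accuracy, Cohen's kappa discounts the agreement expected by chance, which is why it is reported alongside the averaged scores.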
As future work, we plan to test other classification
approaches, such as HMMs and CRFs, experimenting
with them on a larger corpus. Moreover, we plan to
investigate text-based features provided by NLP tools,
like POS taggers and parsers. Finally, the analysis
will be enhanced according to the “musical behavior”
methodology (Sbattella, 2006; Sbattella, 2013).
REFERENCES
Anolli, L. (2002). Le emozioni. Ed. Unicopoli.
Anolli, L. and Ciceri, R. (1997). The voice of emotions.
Milano, Angeli.
Asawa, K., Verma, V., and Agrawal, A. (2012). Recognition
of vocal emotions from acoustic profile. In Proceed-
ings of the International Conference on Advances in
Computing, Communications and Informatics.
Avesani, C., Cosi, P., Fauri, E., Gretter, R., Mana, N., Roc-
chi, S., Rossi, F., and Tesser, F. (2003). Definizione ed
annotazione prosodica di un database di parlato-letto
usando il formalismo ToBI. In Proc. of Il Parlato Ital-
iano, Napoli, Italy.
Balconi, M. and Carrera, A. (2005). Il lessico emotivo nel
decoding delle espressioni facciali. ESE - Psychofenia
- Salento University Publishing.
Banse, R. and Scherer, K. R. (1996). Acoustic profiles in
vocal emotion expression. Journal of Personality and
Social Psychology.
Boersma, P. (1993). Accurate Short-Term Analysis of the
Fundamental Frequency and the Harmonics-to-Noise
Ratio of a Sampled Sound. Institute of Phonetic Sci-
ences, University of Amsterdam, Proceedings, 17:97–
110.
Boersma, P. (2001). Praat, a system for doing phonetics by
computer. Glot International, 5(9/10):341–345.
Boersma, P. and Weenink, D. (2013). Manual of Praat: do-
ing phonetics by computer [computer program].
Bonvino, E. (2000). Le strutture del linguaggio: un’intro-
duzione alla fonologia. Milano: La Nuova Italia.
Borchert, M. and Düsterhöft, A. (2005). Emotions in
speech - experiments with prosody and quality fea-
tures in speech for use in categorical and dimensional
emotion recognition environments. Natural Language
Processing and Knowledge Engineering, IEEE.
Caldognetto, E. M. and Poggi, I. (2004). Il parlato emotivo.
aspetti cognitivi, linguistici e fonetici. In Il parlato
italiano. Atti del Convegno Nazionale, Napoli 13-15
febbraio 2003.
Canepari, L. (1985). L’Intonazione Linguistica e paralin-
guistica. Liguori Editore.
Cowie, R., Douglas-Cowie, E., Tsapatsoulis, N., Votsis, G.,
Kollias, S., and Fellenz, W. (2001). Emotion recogni-
tion in human-computer interaction. Signal Process-
ing Magazine, IEEE.
D’Anna, L. and Petrillo, M. (2001). Apa: un prototipo di
sistema automatico per l’analisi prosodica. In Atti delle
11e giornate di studio del Gruppo di Fonetica Speri-
mentale.
Delmonte, R. (2000). Speech communication. In Speech
Communication.
Ekman, P. and Davidson, R. J. (1994). The Nature of
Emotion: Fundamental Questions. New York Oxford,
Oxford University Press.
Gobl, C. and Chasaide, A. N. (2000). Testing affective cor-
relates of voice quality through analysis and resynthe-
sis. In ISCA Workshop on Emotion and Speech.
Hammarberg, B., Fritzell, B., Gauffin, J., Sundberg, J., and
Wedin, L. (1980). Perceptual and acoustic correlates
of voice qualities. Acta Oto-laryngologica, 90(1–
6):441–451.
Hastie, H. W., Poesio, M., and Isard, S. (2001). Automat-
ically predicting dialog structure using prosodic fea-
tures. In Speech Communication.
Hirschberg, J. and Avesani, C. (2000). Prosodic disambigua-
tion in English and Italian. In Botinis, editor, Intonation.
Kluwer.
Hirst, D. (2001). Automatic analysis of prosody for mul-
tilingual speech corpora. In Improvements in Speech
Synthesis.
Izard, C. E. (1971). The face of emotion. Ed. Appleton
Century Crofts.
Juslin, P. (1998). A functionalist perspective on emotional
communication in music performance. Acta Universi-
tatis Upsaliensis, 1st edition.
Juslin, P. N. (1997). Emotional communication in music
performance: A functionalist perspective and some
data. In Music Perception.
Koolagudi, S. G., Kumar, N., and Rao, K. S. (2011). Speech
emotion recognition using segmental level prosodic
analysis. Devices and Communications (ICDeCom),
IEEE.
Lee, C. M. and Narayanan, S. (2005). Toward detecting
emotions in spoken dialogs. Transaction on Speech
and Audio Processing, IEEE.
Leung, C., Lee, T., Ma, B., and Li, H. (2010). Prosodic
attribute model for spoken language identification. In
Acoustics, speech and signal processing. IEEE inter-
national conference (ICASSP 2010).
López-de Ipiña, K., Alonso, J.-B., Travieso, C. M., Solé-
Casals, J., Egiraun, H., Faundez-Zanuy, M., Ezeiza,
A., Barroso, N., Ecay-Torres, M., Martinez-Lage, P.,
and Lizardui, U. M. d. (2013). On the selection of
PhyCS 2014 - International Conference on Physiological Computing Systems