EXPRESSIVE SPEECH IDENTIFICATIONS BASED ON HIDDEN MARKOV MODEL

Syaheerah L. Lutfi, J. M. Montero, R. Barra Chicote, J. M. Lucas-Cuesta, A. Gallardo-Antolín

2009

Abstract

This paper concerns a sub-area of the larger research field of Affective Computing, focusing on the use of affect-recognition systems based on the speech modality. It proposes that speech-based affect identification systems could play an important role as next-generation biometric identification systems aimed at determining a person’s ‘state of mind’, or psycho-physiological state. Possible areas for deploying voice-affect recognition technology are discussed. Experiments and results for emotion identification in speech using a Hidden Markov Model (HMM) classifier are also presented. The experimental results suggest that certain speech features identify certain emotional states more precisely than others, and that happiness is the most difficult emotion to detect.
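
As a rough illustration of how an HMM classifier can identify an emotion, the sketch below trains nothing and simply scores an utterance against one hand-written HMM per emotion, then picks the emotion whose model gives the highest likelihood. Everything in it is an assumption made for illustration and is not taken from the paper: the two emotion labels, the two-state models, their probabilities, and the use of discrete symbols (0 = low pitch frame, 1 = high pitch frame) in place of real acoustic features.

```python
import math


def safe_log(p):
    """log(p), with log(0) mapped to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(obs, start, trans, emit):
    """Forward algorithm in log space: log P(obs | model) for a discrete HMM.

    start[s]    -- probability of starting in state s
    trans[r][s] -- probability of moving from state r to state s
    emit[s][o]  -- probability that state s emits symbol o
    """
    n = len(start)
    alpha = [safe_log(start[s]) + safe_log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[r] + safe_log(trans[r][s]) for r in range(n)])
            + safe_log(emit[s][o])
            for s in range(n)
        ]
    return log_sum_exp(alpha)


def classify(obs, models):
    """Return the label of the model that assigns obs the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


# Hypothetical toy models (start, trans, emit); the numbers are made up.
MODELS = {
    "neutral": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.8, 0.2], [0.6, 0.4]]),
    "anger": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.2, 0.8], [0.4, 0.6]]),
}

if __name__ == "__main__":
    print(classify([1, 1, 1, 1], MODELS))
    print(classify([0, 0, 0, 0], MODELS))
```

A practical system would more likely model continuous feature vectors with Gaussian mixture emissions and estimate the parameters from labelled speech; the decision rule of comparing per-emotion likelihoods stays the same.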



Paper Citation


in Harvard Style

Lutfi S. L., Montero J. M., Barra Chicote R., Lucas-Cuesta J. M. and Gallardo-Antolín A. (2009). EXPRESSIVE SPEECH IDENTIFICATIONS BASED ON HIDDEN MARKOV MODEL. In Proceedings of the International Conference on Health Informatics - Volume 1: HEALTHINF, (BIOSTEC 2009) ISBN 978-989-8111-63-0, pages 488-494. DOI: 10.5220/0001556704880494


in Bibtex Style

@conference{healthinf09,
author={Syaheerah L. Lutfi and J. M. Montero and R. Barra Chicote and J. M. Lucas-Cuesta and A. Gallardo-Antolín},
title={EXPRESSIVE SPEECH IDENTIFICATIONS BASED ON HIDDEN MARKOV MODEL},
booktitle={Proceedings of the International Conference on Health Informatics - Volume 1: HEALTHINF, (BIOSTEC 2009)},
year={2009},
pages={488-494},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001556704880494},
isbn={978-989-8111-63-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Health Informatics - Volume 1: HEALTHINF, (BIOSTEC 2009)
TI - EXPRESSIVE SPEECH IDENTIFICATIONS BASED ON HIDDEN MARKOV MODEL
SN - 978-989-8111-63-0
AU - Lutfi S. L.
AU - Montero J. M.
AU - Barra Chicote R.
AU - Lucas-Cuesta J. M.
AU - Gallardo-Antolín A.
PY - 2009
SP - 488
EP - 494
DO - 10.5220/0001556704880494