SPEECH EMOTIONAL FEATURES MEASURED BY POWER-LAW DISTRIBUTION BASED ON ELECTROGLOTTOGRAPHY

Lijiang Chen, Xia Mao, Yuli Xue, Mitsuru Ishizuka

2012

Abstract

This study introduces a set of novel speech emotion features extracted from the electroglottography (EGG) signal. The features are obtained from the power-law distribution coefficient (PLDC) of fundamental frequency (F0) and duration parameters. First, the silence, voiced and unvoiced (SUV) segments were distinguished by combining the EGG and speech signals. Second, the F0 contour of each voiced segment and its first-order difference were obtained by a cepstrum method. Third, the PLDC of the voiced-segment F0 values, as well as of the pitch-rise and pitch-fall durations, was calculated. Simulation results show that the proposed features are closely connected with emotions. Recognition experiments based on a Support Vector Machine (SVM) classifier show that the proposed features outperform commonly used features in the case of speaker-independent emotion recognition.
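To make two of the signal-processing steps named in the abstract concrete, the sketch below shows, in Python, a minimal cepstrum-based F0 estimator for a single voiced frame and a log-log histogram slope used as a stand-in for a power-law distribution coefficient. The function names, window choice, pitch-search range and binning are illustrative assumptions and are not taken from the paper; the SUV segmentation and SVM classification steps are omitted. The demo at the bottom runs on a synthetic 150 Hz frame and synthetic heavy-tailed durations, so the printed values only illustrate the output format.

import numpy as np


def cepstral_f0(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate the F0 of a single voiced frame from its cepstrum peak."""
    windowed = frame * np.hamming(len(frame))
    log_spectrum = np.log(np.abs(np.fft.rfft(windowed)) + 1e-12)
    cepstrum = np.fft.irfft(log_spectrum)
    # Look for the dominant quefrency inside the plausible pitch-period range;
    # the frame must be longer than fs / fmin samples for this to work.
    qmin, qmax = int(fs / fmax), int(fs / fmin)
    peak = qmin + np.argmax(cepstrum[qmin:qmax])
    return fs / peak


def power_law_coefficient(values, n_bins=30):
    """Fit p(x) ~ x**(-alpha) to a histogram of `values` and return alpha.

    A simple log-log least-squares fit; the paper's exact PLDC definition
    may differ, so treat this as an illustrative stand-in.
    """
    counts, edges = np.histogram(values, bins=n_bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    mask = counts > 0
    slope, _ = np.polyfit(np.log(centers[mask]), np.log(counts[mask]), 1)
    return -slope


if __name__ == "__main__":
    fs = 16000
    t = np.arange(int(0.04 * fs)) / fs            # one 40 ms frame
    frame = np.sign(np.sin(2 * np.pi * 150 * t))  # crude 150 Hz "voiced" signal
    print("F0 estimate (Hz):", cepstral_f0(frame, fs))
    # PLDC-style slope of synthetic heavy-tailed durations
    durations = np.random.pareto(2.0, 5000) + 1.0
    print("power-law coefficient:", power_law_coefficient(durations))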



Paper Citation


in Harvard Style

Chen L., Mao X., Xue Y. and Ishizuka M. (2012). SPEECH EMOTIONAL FEATURES MEASURED BY POWER-LAW DISTRIBUTION BASED ON ELECTROGLOTTOGRAPHY. In Proceedings of the International Conference on Bio-inspired Systems and Signal Processing - Volume 1: BIOSIGNALS, (BIOSTEC 2012) ISBN 978-989-8425-89-8, pages 131-136. DOI: 10.5220/0003886301310136


in Bibtex Style

@conference{biosignals12,
author={Lijiang Chen and Xia Mao and Yuli Xue and Mitsuru Ishizuka},
title={SPEECH EMOTIONAL FEATURES MEASURED BY POWER-LAW DISTRIBUTION BASED ON ELECTROGLOTTOGRAPHY},
booktitle={Proceedings of the International Conference on Bio-inspired Systems and Signal Processing - Volume 1: BIOSIGNALS, (BIOSTEC 2012)},
year={2012},
pages={131-136},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003886301310136},
isbn={978-989-8425-89-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Bio-inspired Systems and Signal Processing - Volume 1: BIOSIGNALS, (BIOSTEC 2012)
TI - SPEECH EMOTIONAL FEATURES MEASURED BY POWER-LAW DISTRIBUTION BASED ON ELECTROGLOTTOGRAPHY
SN - 978-989-8425-89-8
AU - Chen L.
AU - Mao X.
AU - Xue Y.
AU - Ishizuka M.
PY - 2012
SP - 131
EP - 136
DO - 10.5220/0003886301310136