PATHOLOGICAL VOICE DETECTION USING TURBULENT SPEECH SEGMENTS

Fernando Perdigão, Cláudio Neves, Luís Sá

2012

Abstract

Identification of voice pathologies using only the voice signal has a great advantage over conventional methods such as laryngoscopy, since it enables a non-invasive diagnosis. The first studies in this area were based on the analysis of sustained vowel sounds. More recent studies extend the analysis to continuous speech, achieving similar or better results. All these studies use a pitch detection algorithm to select only the voiced parts of the acoustic signal. However, a pathology affecting the speaker’s vocal folds produces a more irregular vibration pattern and, consequently, a degradation of voice quality with fewer voiced segments. Thus, by selecting only clearly voiced segments for the classifier, useful pathological information may be disregarded. In this study we propose a new approach that enables the classification of voice pathology by also analyzing the unvoiced information of continuous speech. The signal frames are divided into turbulent/non-turbulent, instead of voiced/non-voiced. The results, including a comparison with systems that use the entire signal or only the non-turbulent frames, show that the unvoiced or highly turbulent speech segments contain useful pathological information.
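
The frame division described in the abstract can be illustrated with a minimal sketch. The abstract does not state how turbulence is measured, so the code below makes an assumption: it uses the peak of the normalized autocorrelation as a stand-in periodicity score and labels a frame as turbulent when that score falls below a threshold. The function names, frame sizes, lag range, and threshold are all hypothetical and are not taken from the paper.

```python
import numpy as np

def split_frames(signal, frame_len, hop):
    """Cut a 1-D signal into overlapping frames, one frame per row."""
    n = 1 + (len(signal) - frame_len) // hop
    return np.stack([signal[i * hop : i * hop + frame_len] for i in range(n)])

def periodicity(frame, min_lag, max_lag):
    """Peak of the normalized autocorrelation over a lag range.

    Stand-in measure (an assumption, not the paper's turbulence estimate):
    values near 1 suggest a periodic frame, values near 0 a noise-like one.
    """
    frame = frame - frame.mean()
    energy = np.dot(frame, frame)
    if energy == 0.0:
        return 0.0  # silent frame: no periodicity to measure
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1 :]
    return float(ac[min_lag : max_lag + 1].max() / energy)

def label_turbulent(signal, sr, frame_len=0.04, hop=0.01, threshold=0.5):
    """Return (frames, mask), where mask[i] is True for turbulent frames."""
    n_len, n_hop = int(frame_len * sr), int(hop * sr)
    frames = split_frames(signal, n_len, n_hop)
    # Hypothetical pitch search range of 60 to 400 Hz, expressed as lags.
    min_lag, max_lag = sr // 400, sr // 60
    scores = np.array([periodicity(f, min_lag, max_lag) for f in frames])
    return frames, scores < threshold

if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr // 2) / sr
    rng = np.random.default_rng(0)
    _, tone_mask = label_turbulent(np.sin(2 * np.pi * 150 * t), sr)
    _, noise_mask = label_turbulent(rng.standard_normal(sr // 2), sr)
    print("turbulent frames (tone): ", int(tone_mask.sum()), "of", len(tone_mask))
    print("turbulent frames (noise):", int(noise_mask.sum()), "of", len(noise_mask))
```

With a split of this kind, a classifier can be fed the turbulent frames, the non-turbulent frames, or all frames, which mirrors the comparison mentioned in the abstract. Note that this sketch assigns silent frames to the turbulent class; a real system would likely remove silence first.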


Paper Citation


in Harvard Style

Perdigão F., Neves C. and Sá L. (2012). PATHOLOGICAL VOICE DETECTION USING TURBULENT SPEECH SEGMENTS. In Proceedings of the International Conference on Bio-inspired Systems and Signal Processing - Volume 1: BIOSIGNALS, (BIOSTEC 2012) ISBN 978-989-8425-89-8, pages 238-243. DOI: 10.5220/0003775902380243


in Bibtex Style

@conference{biosignals12,
author={Fernando Perdigão and Cláudio Neves and Luís Sá},
title={PATHOLOGICAL VOICE DETECTION USING TURBULENT SPEECH SEGMENTS},
booktitle={Proceedings of the International Conference on Bio-inspired Systems and Signal Processing - Volume 1: BIOSIGNALS, (BIOSTEC 2012)},
year={2012},
pages={238-243},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003775902380243},
isbn={978-989-8425-89-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Bio-inspired Systems and Signal Processing - Volume 1: BIOSIGNALS, (BIOSTEC 2012)
TI - PATHOLOGICAL VOICE DETECTION USING TURBULENT SPEECH SEGMENTS
SN - 978-989-8425-89-8
AU - Perdigão F.
AU - Neves C.
AU - Sá L.
PY - 2012
SP - 238
EP - 243
DO - 10.5220/0003775902380243