Objective Assessment of Asthenia using Energy and Low-to-High Spectral Ratio

Farideh Jalalinajafabadi, Chaitaniya Gadepalli, Mohsen Ghasempour, Frances Ascott, Mikel Luján, Jarrod Homer, Barry Cheetham

2015

Abstract

Vocal cord vibration is the source of voiced phonemes, and voice quality depends on the nature of this vibration. Vocal cords can be damaged by infection, neck or chest injury, tumours and more serious diseases such as laryngeal cancer. This kind of physical harm can cause loss of voice quality. Voice quality is assessed by Speech and Language Therapists (SLTs), who use a well-known subjective assessment approach called GRBAS. GRBAS is an acronym for a five-dimensional scale of voice properties which were originally recommended by the Japanese Society of Logopaedics and Phoniatrics and the European Research for clinical and research use. The properties are ‘Grade’, ‘Roughness’, ‘Breathiness’, ‘Asthenia’ and ‘Strain’. The objective assessment of the G, R, B and S properties has been well researched and can be carried out by commercial measurement equipment. However, the assessment of Asthenia has been less extensively researched. This paper concerns the objective assessment of ‘Asthenia’ using features extracted from 20 ms frames of the sustained vowel /a/. We develop two regression models that estimate Asthenia objectively and evaluate them against SLT scores: ‘K nearest neighbor regression’ (KNNR) and ‘Multiple linear regression’ (MLR). These approaches differ from previous approaches in the literature in the subsets of features, the sets of data and the prediction models they use. The performance of the system has been evaluated using Normalised Root Mean Square Error (NRMSE) for each of 20 trials, taking as a reference the average score for each subject selected. The subsets of features that generate the lowest NRMSE are determined and used to evaluate the two regression models.
The objective system was compared with the scoring of each individual SLT and was found to have an NRMSE, averaged over 20 trials, lower than that of two SLTs and only slightly higher than that of the third.
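
The pipeline outlined in the abstract (20 ms frames, energy and low-to-high spectral ratio features, nearest-neighbour regression, NRMSE) might be sketched as follows. This is an illustrative sketch only, not the authors' implementation. The abstract does not give the details, so the 4 kHz low/high cutoff, the Hann window, the value k=3, the normalisation of the RMSE by a score range of 3, and all function names are assumptions.

```python
import numpy as np


def frame_features(signal, fs, frame_ms=20.0, cutoff_hz=4000.0):
    """Return per-frame energy and low-to-high spectral ratio (dB).

    cutoff_hz is an assumed split between the "low" and "high" bands.
    """
    signal = np.asarray(signal, dtype=float)
    n = int(round(fs * frame_ms / 1000.0))        # samples per frame
    n_frames = len(signal) // n                   # drop the trailing partial frame
    frames = signal[: n_frames * n].reshape(n_frames, n)

    # Frame energy: sum of squared samples.
    energy = np.sum(frames ** 2, axis=1)

    # Power spectrum of each Hann-windowed frame (window choice is an assumption).
    power = np.abs(np.fft.rfft(frames * np.hanning(n), axis=1)) ** 2
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)

    low = power[:, freqs < cutoff_hz].sum(axis=1)
    high = power[:, freqs >= cutoff_hz].sum(axis=1)
    eps = 1e-12                                   # avoids log of zero
    lhr_db = 10.0 * np.log10((low + eps) / (high + eps))
    return energy, lhr_db


def knn_regress(train_x, train_y, query_x, k=3):
    """Predict each query score as the mean score of its k nearest training points."""
    train_x = np.asarray(train_x, dtype=float)    # shape (n_train, n_features)
    train_y = np.asarray(train_y, dtype=float)    # shape (n_train,)
    query_x = np.asarray(query_x, dtype=float)    # shape (n_query, n_features)
    dists = np.linalg.norm(query_x[:, None, :] - train_x[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return train_y[nearest].mean(axis=1)


def nrmse(predicted, reference, score_range=3.0):
    """RMSE divided by an assumed score range (3 for scores running from 0 to 3)."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((predicted - reference) ** 2)) / score_range)
```

In such a sketch, the per-frame features would typically be summarised per recording (for example by their mean) before being passed to the regressor, and the NRMSE would be computed against the averaged SLT scores on held-out subjects in each trial.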



Paper Citation


in Harvard Style

Jalalinajafabadi F., Gadepalli C., Ghasempour M., Ascott F., Luján M., Homer J. and Cheetham B. (2015). Objective Assessment of Asthenia using Energy and Low-to-High Spectral Ratio. In Proceedings of the 12th International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2015) ISBN 978-989-758-118-2, pages 76-83. DOI: 10.5220/0005545000760083


in Bibtex Style

@conference{sigmap15,
author={Farideh Jalalinajafabadi and Chaitaniya Gadepalli and Mohsen Ghasempour and Frances Ascott and Mikel Luján and Jarrod Homer and Barry Cheetham},
title={Objective Assessment of Asthenia using Energy and Low-to-High Spectral Ratio},
booktitle={Proceedings of the 12th International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2015)},
year={2015},
pages={76-83},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005545000760083},
isbn={978-989-758-118-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 12th International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2015)
TI - Objective Assessment of Asthenia using Energy and Low-to-High Spectral Ratio
SN - 978-989-758-118-2
AU - Jalalinajafabadi F.
AU - Gadepalli C.
AU - Ghasempour M.
AU - Ascott F.
AU - Luján M.
AU - Homer J.
AU - Cheetham B.
PY - 2015
SP - 76
EP - 83
DO - 10.5220/0005545000760083