Authors:
Agustín Álvarez 1; Andrés Gómez 2; Daniel Palacios 3; Jiri Mekyska 4; Athanasios Tsanas 2; Pedro Gómez 1 and Rafael Martínez 1
Affiliations:
1 Neuromorphic Speech Processing Lab, Center for Biomedical Technology, Universidad Politécnica de Madrid, Campus de Montegancedo, 28223 Pozuelo de Alarcón, Madrid, Spain
2 Usher Institute, The University of Edinburgh, Edinburgh Bioquarter, 9 Little France Road, Edinburgh, EH16 4UX, U.K.
3 Escuela Técnica Superior de Ingeniería Informática, Universidad Rey Juan Carlos, Calle Tulipán, s/n, 28933 Móstoles, Madrid, Spain
4 Department of Telecommunications, Brno University of Technology, Technicka 10, 61600 Brno, Czech Republic
Keyword(s):
Neuromotor Disease Phonation, Glottal Signature, Parkinson’s Disease, Aging Voice.
Abstract:
The study of speech and voice in people diagnosed with a neurodegenerative disorder, for the purposes of detection and monitoring, has advanced considerably in recent years, but it is far from complete. One of the main current concerns is that, once machine learning guided by clinical expertise has reported a deterioration of speech and phonation quality, there is insufficient evidence to determine whether that deterioration stems from organic causes, from neuromotor degeneration, or simply from aging. The present work is part of a broader plan to shed light on this problem through theoretical modelling of glottal signals under the main known causes affecting phonation quality, namely closure deficits during the phonation cycle. These deficits may have anatomical, organic pathological, or neuromotor origins. Simulation examples illustrating their effects on the glottal excitation signals are given and contrasted with real examples. Finally, relevant scores are presented from an experimental separation of Parkinson’s Disease phonation samples from 24 male and 24 female subjects against 24 male and 24 female aging controls of the same age, taken from a male-female balanced dataset and compared against a normative subset of 24 male and 24 female speakers, to exemplify an analysis that goes deeper into this problem. Although classification accuracy scores as high as 99.69 and 99.59 were attained in 10-fold cross-validation using an SVM classifier, co-morbidity and aging effects still do not appear to be well accounted for, which calls for a further semantic study of the features behind the discrimination scores obtained.