Authors: Farideh Jalalinajafabadi 1; Chaitaniya Gadepalli 2; Mohsen Ghasempour 1; Frances Ascott 2; Mikel Luján 1; Jarrod Homer 2 and Barry Cheetham 1
Affiliations: 1 University of Manchester, United Kingdom; 2 Central Manchester University Hospitals Foundation Trust, United Kingdom
Keyword(s): GRBAS, Asthenia, MLR, KNNR.
Related Ontology Subjects/Areas/Topics: Audio and Video Quality Assessment; Biomedical Applications; Multimedia; Multimedia Systems and Applications; Telecommunications
Abstract:
Vocal cord vibration is the source of voiced phonemes. Voice quality depends on the nature of this vibration.
Vocal cords can be damaged by infection, neck or chest injury, tumours and more serious diseases such as
laryngeal cancer. This kind of physical harm can cause loss of voice quality. Voice quality assessment is
required from Speech and Language Therapists (SLTs). SLTs use a well-known subjective assessment approach
called GRBAS. GRBAS is an acronym for a five-dimensional scale of measurements of voice
properties, originally recommended by the Japanese Society of Logopaedics and Phoniatrics and
the European Research Group on the Larynx for clinical and research use. The properties are ‘Grade’, ‘Roughness’, ‘Breathiness’,
‘Asthenia’ and ‘Strain’. The objective assessment of the G, R, B and S properties has been well researched and
can be carried out by commercial measurement equipment. However, the assessment of Asthenia has been less
extensively researched. This paper concerns the objective assessment of ‘Asthenia’ using features extracted
from 20 ms frames of the sustained vowel /a/. We develop two regression prediction models to estimate
Asthenia objectively, using the scores given by SLTs as the reference. These regression models are ‘K nearest
neighbor regression’ (KNNR) and ‘Multiple linear regression’ (MLR). These new approaches for prediction
of Asthenia are based on different subsets of features, different sets of data and different prediction models
in comparison with previous approaches in the literature. The performance of the system has been evaluated
using Normalised Root Mean Square Error (NRMSE) for each of 20 trials, taking as a reference the average
score for each subject selected. The subsets of features that generate the lowest NRMSE are determined and
used to evaluate the two regression models. The objective system was compared with the scoring of each
individual SLT and was found to have an NRMSE, averaged over 20 trials, lower than two of them and only
slightly higher than the third.
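
As a rough illustration of the evaluation described above, the following minimal sketch shows a K-nearest-neighbour regressor and an NRMSE calculation. It is not the authors' implementation: the function names, the toy feature vectors, the choice of Euclidean distance, and the normalisation of the RMSE by the score range (taken here as 3, since GRBAS scores run from 0 to 3) are all assumptions made for the example.

```python
import math


def nrmse(predicted, reference, score_range=3.0):
    """RMSE divided by the score range.

    Assumption: normalisation by the range of the scale (0..3).
    The abstract does not state which normalisation the paper uses.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(predicted)
    return math.sqrt(mse) / score_range


def knn_regress(train_x, train_y, query, k=3):
    """Predict a score as the mean target of the k nearest training vectors.

    Assumption: plain Euclidean distance with unweighted averaging.
    """
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Hypothetical two-dimensional feature vectors with reference scores
# (standing in for the average SLT score of each subject).
train_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
train_y = [0.0, 1.0, 2.0, 3.0]

prediction = knn_regress(train_x, train_y, (0.1, 0.1), k=3)
error = nrmse([prediction], [1.0])
```

The same `nrmse` function could be applied to the output of a multiple linear regression model, so that both predictors and each individual rater are compared against the same reference scores.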