Authors:
HariKrishna Maganti (1) and Marco Matassoni (2)
Affiliations:
(1) Fondazione Bruno Kessler, Italy; (2) IRST, Italy
Keyword(s):
Bio-inspired auditory processing, Gammatone filtering, Modulation spectrum, Reverberation, Automatic speech recognition.
Related Ontology Subjects/Areas/Topics:
Acoustic Signal Processing; Biomedical Engineering; Biomedical Signal Processing; Speech Recognition
Abstract:
Mel-frequency cepstrum based features have traditionally been used for speech recognition in a number of applications, as they naturally provide high recognition accuracy. However, these features are not very robust in noisy acoustic conditions. In this article, we investigate the use of bio-inspired auditory features that emulate the processing performed by the cochlea to improve robustness, particularly against environmental reverberation. Our methodology first extracts robust, noise-resistant features by gammatone filtering, which emulates cochlear frequency resolution; long-term modulation spectral processing is then applied, which preserves speech intelligibility in the signal. We compare and discuss the features based on their performance on the Aurora5 meeting recorder digit task, recorded with four different microphones in hands-free mode in a real meeting room. The experimental results show that the proposed features provide considerable improvements with respect to state-of-the-art feature extraction techniques.
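
The two stages named in the abstract, gammatone filtering followed by modulation spectral analysis, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the fourth-order filter, the Glasberg and Moore ERB formula, the 25 ms frames with a 10 ms hop, the log-energy envelope, and all function names are assumptions chosen for illustration only.

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation; assumed)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def gammatone_ir(fc, fs, duration=0.025, order=4):
    """Impulse response of one gammatone filter centred at fc Hz (order 4 is an assumption)."""
    t = np.arange(int(duration * fs)) / fs
    b = 1.019 * erb(fc)  # bandwidth parameter of the gammatone envelope
    ir = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return ir / np.sqrt(np.sum(ir ** 2))  # normalise to unit energy

def gammatone_envelopes(x, fs, centre_freqs, frame_len=0.025, hop=0.010):
    """Frame-level log energy per gammatone channel, shape (channels, frames)."""
    n, h = int(frame_len * fs), int(hop * fs)
    n_frames = 1 + (len(x) - n) // h
    out = np.empty((len(centre_freqs), n_frames))
    for c, fc in enumerate(centre_freqs):
        # Filter the signal with this channel's impulse response.
        y = np.convolve(x, gammatone_ir(fc, fs), mode="same")
        for i in range(n_frames):
            seg = y[i * h : i * h + n]
            out[c, i] = np.log(np.sum(seg ** 2) + 1e-10)
    return out

def modulation_spectrum(env, frame_rate):
    """Magnitude spectrum of each channel's envelope trajectory over time."""
    env = env - env.mean(axis=1, keepdims=True)  # remove the per-channel mean
    window = np.hanning(env.shape[1])
    spec = np.abs(np.fft.rfft(env * window, axis=1))
    freqs = np.fft.rfftfreq(env.shape[1], d=1.0 / frame_rate)
    return freqs, spec
```

With a 10 ms hop the envelope is sampled at 100 Hz, so `modulation_spectrum(env, 100.0)` covers modulation frequencies up to 50 Hz. A long-term analysis in the sense of the abstract would apply this over windows of many frames; the window length used by the authors is not stated here.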