Authors:
Leandro D. Vignolo (1); Hugo L. Rufiner (2); Diego H. Milone (2) and John C. Goddard (3)
Affiliations:
(1) Grupo de Investigación en Señales e Inteligencia Computacional, Departamento de Informática, Facultad de Ingeniería y Ciencias Hídricas, Argentina
(2) Universidad Nacional del Litoral, Argentina
(3) Universidad Autónoma Metropolitana, Mexico
Keyword(s):
Automatic speech recognition, Evolutionary computation, Phoneme classification, Cepstral coefficients.
Related Ontology Subjects/Areas/Topics:
Acoustic Signal Processing; Artificial Intelligence; Biomedical Engineering; Biomedical Signal Processing; Computational Intelligence; Data Manipulation; Evolutionary Systems; Health Engineering and Technology Applications; Human-Computer Interaction; Methodologies and Methods; Neurocomputing; Neurotechnology, Electronics and Informatics; Pattern Recognition; Physiological Computing Systems; Sensor Networks; Soft Computing; Speech Recognition
Abstract:
Some of the most commonly used speech representations, such as mel-frequency cepstral coefficients, incorporate biologically inspired characteristics into artificial systems. Recent advances have modified the shape and distribution of the traditional perceptually scaled filterbank commonly used for feature extraction. Some alternatives to the classic mel-scaled filterbank have been proposed, improving phoneme recognition performance in adverse conditions. In this work we propose an evolutionary strategy as a way to find an optimal filterbank. Filter parameters such as the central and side frequencies are optimized. A hidden Markov model classifier is used to evaluate the fitness of each candidate solution. Experiments were conducted using a set of phonemes taken from the TIMIT database with different additive noise levels. Classification results show that the method accomplishes the task of finding an optimized filterbank for phoneme recognition.
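Illustrative sketch (not the authors' implementation): the Python fragment below shows one way the idea in the abstract could be set up, under several assumptions. Each triangular filter is encoded by its (left, center, right) frequencies, the search starts from a mel-spaced filterbank, and a simple elitist loop of Gaussian mutation and selection stands in for the evolutionary strategy. The abstract does not give the operators, population size or parameter ranges, so all function names, default values and the `toy_fitness` function are hypothetical. In the work itself the fitness comes from a hidden Markov model classifier, which is replaced here by a placeholder score so the example stays self-contained.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel-scale formula (one of several variants in use).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filter_params(n_filters, f_min, f_max):
    """Return an (n_filters, 3) array of (left, center, right) edges in Hz."""
    edges = mel_to_hz(np.linspace(hz_to_mel(f_min), hz_to_mel(f_max), n_filters + 2))
    return np.stack([edges[:-2], edges[1:-1], edges[2:]], axis=1)

def triangular_filterbank(params, n_bins, sample_rate):
    """Build an (n_filters, n_bins) matrix of triangular filter weights."""
    freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    bank = np.zeros((len(params), n_bins))
    for i, (left, center, right) in enumerate(params):
        rising = (freqs - left) / max(center - left, 1e-9)
        falling = (right - freqs) / max(right - center, 1e-9)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def mutate(params, sigma, f_min, f_max, rng):
    """Perturb all frequencies with Gaussian noise, then repair the ordering."""
    child = np.clip(params + rng.normal(0.0, sigma, size=params.shape), f_min, f_max)
    return np.sort(child, axis=1)  # keeps left <= center <= right per filter

def evolve(fitness, n_filters=8, f_min=0.0, f_max=8000.0,
           offspring=10, generations=20, sigma=50.0, seed=0):
    """Elitist loop: keep the parent unless some offspring scores higher."""
    rng = np.random.default_rng(seed)
    parent = mel_filter_params(n_filters, f_min, f_max)
    best = fitness(parent)
    for _ in range(generations):
        children = [mutate(parent, sigma, f_min, f_max, rng) for _ in range(offspring)]
        scores = [fitness(c) for c in children]
        k = int(np.argmax(scores))
        if scores[k] > best:
            parent, best = children[k], scores[k]
    return parent, best

def toy_fitness(params):
    # Hypothetical placeholder: rewards centers close to a linear spacing.
    # A real fitness would train and score a classifier on extracted features.
    target = np.linspace(500.0, 7500.0, len(params))
    return -float(np.mean((params[:, 1] - target) ** 2))

best_params, best_score = evolve(toy_fitness)
bank = triangular_filterbank(best_params, n_bins=129, sample_rate=16000)
```

Replacing `toy_fitness` with a function that extracts cepstral features through the candidate filterbank and returns a classifier's recognition rate would bring the sketch closer to the setup the abstract describes. The cost of each fitness evaluation would then dominate the run time.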