Authors:
Osman Büyük 1 and Levent M. Arslan 2
Affiliations:
1 Kocaeli University, Turkey; 2 Bogazici University and Sestek Inc., Turkey
Keyword(s):
Speaker Verification, Text-Dependent, Single Utterance, Sentence HMM, Prosodic Features.
Related Ontology Subjects/Areas/Topics:
Acoustic Signal Processing; Applications; Artificial Intelligence; Biomedical Engineering; Biomedical Signal Processing; Biometrics; Biometrics and Pattern Recognition; Computational Intelligence; Data Manipulation; Health Engineering and Technology Applications; Human-Computer Interaction; Methodologies and Methods; Multimedia; Multimedia Signal Processing; Neural Networks; Neurocomputing; Neurotechnology, Electronics and Informatics; Pattern Recognition; Physiological Computing Systems; Sensor Networks; Signal Processing; Soft Computing; Speech Recognition; Telecommunications; Theory and Methods
Abstract:
In this paper, we combine spectral and prosodic features to improve verification performance on a text-dependent, single-utterance speaker verification task. The baseline spectral system uses a whole-phrase sentence HMM topology for the fixed utterance. We extract prosodic features using the time-alignment information obtained from the HMM states. Our experiments show that, although the prosodic features alone do not yield high performance, they provide information complementary to the spectral features. Combining the two information sources with a multi-layer neural network yields an approximately 10% relative reduction in equal error rate (EER).
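
For readers unfamiliar with the metric, the following is a minimal sketch of how an equal error rate and a relative reduction between two systems might be computed from verification scores. It is not the authors' evaluation code. The function names `equal_error_rate` and `relative_reduction` are hypothetical, the threshold sweep is a simple approximation, and any scores passed in would be illustrative placeholders, not results from the paper.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping the threshold over all observed scores.

    genuine:  scores of true-speaker trials (higher means more likely accepted)
    impostor: scores of impostor trials
    Assumption: a trial is accepted when its score is >= the threshold.
    """
    thresholds = sorted(set(genuine) | set(impostor))
    best_gap, eer = None, None
    for t in thresholds:
        # False rejection rate: genuine trials scoring below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False acceptance rate: impostor trials scoring at or above it.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        # Keep the operating point where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


def relative_reduction(baseline_eer, fused_eer):
    """Relative EER reduction of a fused system over a baseline system."""
    return (baseline_eer - fused_eer) / baseline_eer
```

Under this definition, a hypothetical drop from a 5.0% baseline EER to a 4.5% fused EER is a 10% relative reduction, which is the kind of quantity the abstract reports.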