ACKNOWLEDGEMENTS
This work was supported in part by the Turkish State
Planning Organization (DPT) under the TAM
Project, number 2007K120610, and in part by the
TUBITAK TEYDEB 1509 program under the Electronic
Doctor’s Round Project, number 9090036.
Combining Spectral and Prosodic Features in HMM-based Single Utterance Speaker Verification