PITCH-ASYNCHRONOUS GLOTTAL INVERSE FILTERING OF NORMAL AND PATHOLOGICAL VOICES BASED ON HOMOMORPHIC PREDICTION
Rubén Fraile, Malte Kob, Juana M. Gutierrez, Nicolás Sáenz-Lechón, Juan Ignacio Godino-Llorente, Víctor Osma-Ruiz
2010
Abstract
Inverse filtering of speech signals for the separation of vocal tract and glottal source effects has a wide variety of potential applications, including the assessment of glottis-related aspects of voice function. Among the existing approaches to inverse filtering, this paper focuses on homomorphic prediction. Although it has received little attention in recent literature, this approach offers two advantages over others: it does not require prior estimation of the fundamental frequency, and it does not rely on any assumptions about the spectral envelope of the glottal signal. The performance of homomorphic prediction is assessed here and compared with that of an adaptive inverse filtering method, using synthetic voices produced with a biomechanical voice production model. The reported results indicate that the performance of inverse filtering based on homomorphic prediction is within the range of that of adaptive inverse filtering and that, in addition, it behaves better when the spectral envelope of the glottal signal does not fit an all-pole model of predefined order.
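To make the idea of homomorphic prediction more concrete, the following is a minimal, dependency-free sketch of the textbook scheme: smooth the log-magnitude spectrum by low-quefrency cepstral liftering, then fit an all-pole model to the smoothed spectrum and use it as an inverse filter. It is not the authors' implementation. The function names and the parameters `order` and `cutoff` are assumptions chosen for illustration, and the naive DFT is used only to keep the example self-contained.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) DFT; slow, but avoids external dependencies."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def liftered_power_spectrum(frame, cutoff):
    """Smooth the spectrum by keeping only low-quefrency cepstral terms."""
    log_mag = [math.log(abs(v) + 1e-12) for v in dft(frame)]
    cep = [v.real for v in dft(log_mag, inverse=True)]  # real cepstrum
    n = len(cep)
    # Symmetric low-time lifter: keep quefrencies below `cutoff` and their
    # mirror images, so that the smoothed log spectrum stays real.
    lift = [c if (q < cutoff or q > n - cutoff) else 0.0
            for q, c in enumerate(cep)]
    smooth_log = [v.real for v in dft(lift)]
    return [math.exp(2.0 * v) for v in smooth_log]


def levinson(r, order):
    """Levinson-Durbin recursion: autocorrelation -> coefficients of A(z)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def homomorphic_prediction(frame, order, cutoff):
    """All-pole fit to the cepstrally smoothed spectrum (illustrative)."""
    power = liftered_power_spectrum(frame, cutoff)
    # Autocorrelation is the inverse DFT of the power spectrum.
    r = [v.real for v in dft(power, inverse=True)]
    return levinson(r, order)


def inverse_filter(signal, a):
    """Apply the FIR inverse filter A(z) to estimate the excitation."""
    return [sum(a[j] * signal[t - j] for j in range(len(a)) if t - j >= 0)
            for t in range(len(signal))]
```

A call such as `a = homomorphic_prediction(frame, order=2, cutoff=20)` followed by `inverse_filter(frame, a)` would return a rough excitation estimate for one frame. In this sketch the frame is analysed as a whole and no pitch marks are passed in, which is the sense in which such a scheme can be pitch-asynchronous; the choice of `cutoff` relative to the pitch period is left to the user here.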
Paper Citation
in Harvard Style
Fraile R., Kob M., Gutierrez J. M., Sáenz-Lechón N., Godino-Llorente J. I. and Osma-Ruiz V. (2010). PITCH-ASYNCHRONOUS GLOTTAL INVERSE FILTERING OF NORMAL AND PATHOLOGICAL VOICES BASED ON HOMOMORPHIC PREDICTION. In Proceedings of the Third International Conference on Bio-inspired Systems and Signal Processing - Volume 1: BIOSIGNALS, (BIOSTEC 2010), ISBN 978-989-674-018-4, pages 45-52. DOI: 10.5220/0002699300450052
in Bibtex Style
@conference{biosignals10,
author={Rubén Fraile and Malte Kob and Juana M. Gutierrez and Nicolás Sáenz-Lechón and Juan Ignacio Godino-Llorente and Víctor Osma-Ruiz},
title={PITCH-ASYNCHRONOUS GLOTTAL INVERSE FILTERING OF NORMAL AND PATHOLOGICAL VOICES BASED ON HOMOMORPHIC PREDICTION},
booktitle={Proceedings of the Third International Conference on Bio-inspired Systems and Signal Processing - Volume 1: BIOSIGNALS, (BIOSTEC 2010)},
year={2010},
pages={45-52},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002699300450052},
isbn={978-989-674-018-4},
}
in EndNote Style
TY - CONF
JO - Proceedings of the Third International Conference on Bio-inspired Systems and Signal Processing - Volume 1: BIOSIGNALS, (BIOSTEC 2010)
TI - PITCH-ASYNCHRONOUS GLOTTAL INVERSE FILTERING OF NORMAL AND PATHOLOGICAL VOICES BASED ON HOMOMORPHIC PREDICTION
SN - 978-989-674-018-4
AU - Fraile R.
AU - Kob M.
AU - Gutierrez J. M.
AU - Sáenz-Lechón N.
AU - Godino-Llorente J. I.
AU - Osma-Ruiz V.
PY - 2010
SP - 45
EP - 52
DO - 10.5220/0002699300450052