ON SPEECH RECOGNITION PERFORMANCE UNDER NON-STATIONARY ECHO CANCELLATION

Mahdi Triki

2011

Abstract

Over the last decades, the performance of speech recognizers has increased significantly for large-vocabulary tasks and adverse environments. To reduce interference, acoustic echo cancellation has been proposed and extensively investigated, with particular attention paid to convergence properties and the capability to handle double talk. In a time-varying environment, however, the echo canceller has the additional task of tracking variations of the propagation channel. In this respect, it has been established that algorithms that exhibit fast convergence do not necessarily provide good tracking performance. In such an environment, performance assessment is also challenging, and the experiment design is crucial for obtaining consistent and interpretable results. In the present paper, we reproduce time-varying artifacts by altering the surrounding acoustic environment (using a moving person or robot). The movement characteristics (discrete or continuous) and location (line-of-sight or background) emphasize different room and algorithm characteristics and provide deeper insight into the system behavior.
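
To make the tracking problem concrete, the following is a minimal sketch of an adaptive echo canceller facing a change in the echo path. It is not the paper's method or experimental setup. It assumes a normalized LMS (NLMS) filter, a white-noise far-end signal, no near-end speech or noise, and two made-up four-tap echo paths (`path_a`, `path_b`) with an abrupt switch standing in for a person moving in the room. All function names and parameters are hypothetical.

```python
import random


def nlms_echo_canceller(far_end, mic, num_taps, mu=0.5, eps=1e-6):
    """Run an NLMS echo canceller; return (residual error, final weights)."""
    w = [0.0] * num_taps      # current estimate of the echo path
    buf = [0.0] * num_taps    # most recent far-end samples, newest first
    err = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # estimated echo
        e = d - y                                    # residual after cancellation
        norm = sum(b * b for b in buf) + eps         # input energy (regularized)
        step = mu * e / norm
        w = [wi + step * bi for wi, bi in zip(w, buf)]
        err.append(e)
    return err, w


def simulate(num_samples=4000, change_at=2000, seed=0):
    """Simulate an echo path that changes abruptly at sample `change_at`."""
    rng = random.Random(seed)
    path_a = [0.8, -0.4, 0.2, 0.1]    # assumed echo path before the change
    path_b = [0.5, 0.3, -0.3, 0.05]   # assumed echo path after the change
    far_end = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    mic = []
    for n in range(num_samples):
        h = path_a if n < change_at else path_b
        mic.append(sum(h[k] * far_end[n - k] for k in range(len(h)) if n - k >= 0))
    err, w = nlms_echo_canceller(far_end, mic, num_taps=len(path_a))
    return err, w, path_b


def mean_power(x):
    """Average squared value of a sequence."""
    return sum(v * v for v in x) / len(x)


if __name__ == "__main__":
    err, w, path_b = simulate()
    print("residual power before change:", mean_power(err[1500:2000]))
    print("residual power just after   :", mean_power(err[2000:2050]))
    print("residual power re-converged :", mean_power(err[3500:4000]))
```

In this idealized setting the residual power rises when the path changes and falls again as the filter re-adapts. A continuously moving source would keep the filter in this re-adaptation regime, which is why tracking behavior, and not only initial convergence speed, matters.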


Paper Citation


in Harvard Style

Triki M. (2011). ON SPEECH RECOGNITION PERFORMANCE UNDER NON-STATIONARY ECHO CANCELLATION. In Proceedings of the International Conference on Bio-inspired Systems and Signal Processing - Volume 1: BIOSIGNALS, (BIOSTEC 2011) ISBN 978-989-8425-35-5, pages 316-321. DOI: 10.5220/0003182903160321


in Bibtex Style

@conference{biosignals11,
author={Mahdi Triki},
title={ON SPEECH RECOGNITION PERFORMANCE UNDER NON-STATIONARY ECHO CANCELLATION},
booktitle={Proceedings of the International Conference on Bio-inspired Systems and Signal Processing - Volume 1: BIOSIGNALS, (BIOSTEC 2011)},
year={2011},
pages={316-321},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003182903160321},
isbn={978-989-8425-35-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Bio-inspired Systems and Signal Processing - Volume 1: BIOSIGNALS, (BIOSTEC 2011)
TI - ON SPEECH RECOGNITION PERFORMANCE UNDER NON-STATIONARY ECHO CANCELLATION
SN - 978-989-8425-35-5
AU - Triki M.
PY - 2011
SP - 316
EP - 321
DO - 10.5220/0003182903160321