Experimental Evaluation of Probabilistic Similarity for Spoken Term Detection

Shi-wook Lee, Hiroaki Kojima, Kazuyo Tanaka, Yoshiaki Itoh

2013

Abstract

In this paper, the use of probabilistic similarity and the likelihood ratio for spoken term detection is investigated. The objective of spoken term detection is to rank retrieved spoken terms according to their distance from a query. First, we evaluate several probabilistic similarity functions for use as a sophisticated distance. In particular, we investigate probabilistic similarity for Gaussian mixture models using the closed-form solutions and pseudo-sampling approximation of Kullback–Leibler divergence. We then propose additive scoring factors based on the likelihood ratio of each individual subword. An experimental evaluation demonstrates that detection performance improves when probabilistic similarity functions are used and the likelihood ratio is applied.
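
As a rough illustration of the kind of divergence the abstract refers to, the sketch below computes the closed-form Kullback–Leibler divergence between two univariate Gaussians, its symmetrized (Jeffreys) form, and a plain Monte Carlo estimate for one-dimensional Gaussian mixtures. It is not the authors' implementation: the function names, the restriction to one dimension, and the use of ordinary random sampling in place of the paper's pseudo-sampling approximation are all assumptions made here for illustration.

```python
import math
import random


def kl_gauss(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(p || q) for two univariate Gaussians."""
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mu_p - mu_q) ** 2) / var_q
                  - 1.0)


def symmetric_kl_gauss(mu_p, var_p, mu_q, var_q):
    """Symmetrized (Jeffreys) divergence: KL(p || q) + KL(q || p)."""
    return (kl_gauss(mu_p, var_p, mu_q, var_q)
            + kl_gauss(mu_q, var_q, mu_p, var_p))


def gmm_logpdf(x, gmm):
    """Log density of a 1-D GMM given as a list of (weight, mean, variance)."""
    terms = [math.log(w) - 0.5 * math.log(2.0 * math.pi * v)
             - 0.5 * (x - m) ** 2 / v
             for w, m, v in gmm]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


def gmm_sample(gmm, rng):
    """Draw one sample: pick a component by weight, then sample from it."""
    _, m, v = rng.choices(gmm, weights=[c[0] for c in gmm])[0]
    return rng.gauss(m, math.sqrt(v))


def kl_gmm_monte_carlo(p, q, n=20000, seed=0):
    """Monte Carlo estimate of KL(p || q): the mean of log p(x) - log q(x)
    over n samples x drawn from p. (Assumed stand-in for the paper's
    pseudo-sampling approximation.)"""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = gmm_sample(p, rng)
        total += gmm_logpdf(x, p) - gmm_logpdf(x, q)
    return total / n


if __name__ == "__main__":
    p = [(0.5, -1.0, 1.0), (0.5, 1.0, 0.5)]  # hypothetical mixtures
    q = [(0.3, 0.0, 1.0), (0.7, 2.0, 1.5)]
    print("KL(p || q) estimate:", kl_gmm_monte_carlo(p, q))
```

For single-component mixtures the Monte Carlo estimate should approach the closed-form value as the number of samples grows, which gives a simple sanity check on the approximation.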


Figure 2: Retrieval performance (average of maximum F-measure)


Paper Citation


in Harvard Style

Lee S., Kojima H., Tanaka K. and Itoh Y. (2013). Experimental Evaluation of Probabilistic Similarity for Spoken Term Detection. In Proceedings of the 2nd International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-8565-41-9, pages 441-446. DOI: 10.5220/0004264304410446


in Bibtex Style

@conference{icpram13,
author={Shi-wook Lee and Hiroaki Kojima and Kazuyo Tanaka and Yoshiaki Itoh},
title={Experimental Evaluation of Probabilistic Similarity for Spoken Term Detection},
booktitle={Proceedings of the 2nd International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2013},
pages={441-446},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004264304410446},
isbn={978-989-8565-41-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 2nd International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Experimental Evaluation of Probabilistic Similarity for Spoken Term Detection
SN - 978-989-8565-41-9
AU - Lee S.
AU - Kojima H.
AU - Tanaka K.
AU - Itoh Y.
PY - 2013
SP - 441
EP - 446
DO - 10.5220/0004264304410446