NN and Hybrid Strategies for Speech Recognition in Romanian Language

Corneliu-Octavian Dumitru, Inge Gavat

Abstract

In this paper we present results obtained with learning structures that are more “human-like” than the very effective and widely used hidden Markov model (HMM). Good results were obtained with simple artificial neural networks such as the multilayer perceptron (MLP) and Kohonen maps. Hybrid structures have also proven efficient: the neuro-statistical hybrid we applied improved the digit recognition rate of the initial HMM. Fuzzy variants of the MLP and HMM also gave good results in the vowel recognition tasks tested.



Paper Citation


in Harvard Style

Dumitru C. and Gavat I. (2008). NN and Hybrid Strategies for Speech Recognition in Romanian Language. In Proceedings of the 4th International Workshop on Artificial Neural Networks and Intelligent Information Processing - Volume 1: ANNIIP, (ICINCO 2008) ISBN 978-989-8111-33-3, pages 51-60. DOI: 10.5220/0001508300510060


in Bibtex Style

@conference{anniip08,
author={Corneliu-Octavian Dumitru and Inge Gavat},
title={NN and Hybrid Strategies for Speech Recognition in Romanian Language},
booktitle={Proceedings of the 4th International Workshop on Artificial Neural Networks and Intelligent Information Processing - Volume 1: ANNIIP, (ICINCO 2008)},
year={2008},
pages={51-60},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001508300510060},
isbn={978-989-8111-33-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 4th International Workshop on Artificial Neural Networks and Intelligent Information Processing - Volume 1: ANNIIP, (ICINCO 2008)
TI - NN and Hybrid Strategies for Speech Recognition in Romanian Language
SN - 978-989-8111-33-3
AU - Dumitru C.
AU - Gavat I.
PY - 2008
SP - 51
EP - 60
DO - 10.5220/0001508300510060