VISUAL SPEECH RECOGNITION USING WAVELET TRANSFORM AND MOMENT BASED FEATURES
Sanjay Kumar
2006
Abstract
This paper presents a novel vision-based approach to identifying utterances consisting of consonants. A view-based method is adopted to represent the 3-D image sequence of the mouth movement in a 2-D space using grayscale images called motion history images (MHI). An MHI is produced by applying an accumulative image differencing technique to the sequence of images, which implicitly captures the temporal information of the mouth movement. The proposed technique combines the discrete stationary wavelet transform (SWT) and image moments to classify the MHI. A 2-D SWT at level 1 is applied to decompose the MHI into one approximation sub-image and three detail sub-images. The paper reports the classification accuracy of three different moment-based features, namely Zernike moments, geometric moments and Hu moments, computed from the approximation sub-image of the MHI. A supervised feed-forward multilayer perceptron (MLP) artificial neural network (ANN) trained with the back-propagation learning algorithm is used to classify the moment-based features. The performance and image representation ability of the three types of moment features are compared. The preliminary results show that all three can achieve a high recognition rate in classifying 3 consonants.
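The front end of the pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the author's implementation: the function names, the difference threshold of 25, the choice of the Haar wavelet and the periodic boundary handling are all assumptions made for illustration, since the abstract does not specify them. Only the first two Hu invariants are computed; Zernike moments and the MLP classifier are left out.

```python
import numpy as np


def motion_history_image(frames, threshold=25):
    """Collapse a frame sequence into one grayscale image by accumulative
    image differencing. Pixels that changed more recently get brighter
    values. The threshold value is an assumption."""
    frames = np.asarray(frames, dtype=np.float64)
    n = frames.shape[0]
    mhi = np.zeros(frames.shape[1:], dtype=np.float64)
    for t in range(1, n):
        moved = np.abs(frames[t] - frames[t - 1]) >= threshold
        mhi[moved] = t  # overwrite with the most recent time step
    return mhi * (255.0 / max(n - 1, 1))  # scale time steps to 0..255


def haar_swt_approx(image):
    """Level-1 undecimated (stationary) Haar approximation sub-image.
    Low-pass filtering along both axes without downsampling, using
    periodic boundaries (an assumption)."""
    x = np.asarray(image, dtype=np.float64)
    rows = (x + np.roll(x, -1, axis=0)) / np.sqrt(2.0)
    return (rows + np.roll(rows, -1, axis=1)) / np.sqrt(2.0)


def hu_first_two(image):
    """First two Hu invariants from normalized central moments."""
    img = np.asarray(image, dtype=np.float64)
    m00 = img.sum()
    if m00 == 0:
        raise ValueError("image has zero mass; moments are undefined")
    y, x = np.mgrid[: img.shape[0], : img.shape[1]]
    xc = (x * img).sum() / m00  # centroid, x coordinate
    yc = (y * img).sum() / m00  # centroid, y coordinate

    def eta(p, q):
        mu = ((x - xc) ** p * (y - yc) ** q * img).sum()
        return mu / m00 ** (1 + (p + q) / 2.0)

    e20, e02, e11 = eta(2, 0), eta(0, 2), eta(1, 1)
    return e20 + e02, (e20 - e02) ** 2 + 4 * e11 ** 2


def mhi_features(frames):
    """MHI -> SWT approximation -> moment features, in that order."""
    return hu_first_two(haar_swt_approx(motion_history_image(frames)))
```

Calling `mhi_features(frames)` on an array of shape `(num_frames, height, width)` returns a two-element feature tuple that could be fed to a classifier. Because the Hu invariants are built from central moments, they do not change when the mouth region is translated within the frame.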
Paper Citation
in Harvard Style
Kumar S. (2006). VISUAL SPEECH RECOGNITION USING WAVELET TRANSFORM AND MOMENT BASED FEATURES. In Proceedings of the Third International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO, ISBN 978-972-8865-60-3, pages 366-371. DOI: 10.5220/0001210203660371
in Bibtex Style
@conference{icinco06,
author={Sanjay Kumar},
title={VISUAL SPEECH RECOGNITION USING WAVELET TRANSFORM AND MOMENT BASED FEATURES},
booktitle={Proceedings of the Third International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO},
year={2006},
pages={366-371},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001210203660371},
isbn={978-972-8865-60-3},
}
in EndNote Style
TY - CONF
JO - Proceedings of the Third International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO
TI - VISUAL SPEECH RECOGNITION USING WAVELET TRANSFORM AND MOMENT BASED FEATURES
SN - 978-972-8865-60-3
AU - Kumar S.
PY - 2006
SP - 366
EP - 371
DO - 10.5220/0001210203660371