Inductive String Template-Based Learning of Spoken Language
Alexander Gutkin, Simon King
2005
Abstract
This paper formulates an alternative structural approach to the speech recognition problem. In this approach, we require both the representation and the learning algorithms defined on it to be linguistically meaningful, which allows the speech recognition system to discover the nature of the linguistic classes of speech patterns corresponding to the speech waveforms. We briefly discuss the current formalisms and propose an alternative: a phonologically inspired, string-based inductive speech representation, defined within an analytical framework specifically designed to address the issues of class and object representation. We also present the results of phoneme classification experiments conducted on the TIMIT corpus of continuous speech.
Paper Citation
in Harvard Style
Gutkin A. and King S. (2005). Inductive String Template-Based Learning of Spoken Language. In Proceedings of the 5th International Workshop on Pattern Recognition in Information Systems - Volume 1: PRIS, (ICEIS 2005) ISBN 972-8865-28-7, pages 43-51. DOI: 10.5220/0002568800430051
in Bibtex Style
@conference{pris05,
author={Alexander Gutkin and Simon King},
title={Inductive String Template-Based Learning of Spoken Language},
booktitle={Proceedings of the 5th International Workshop on Pattern Recognition in Information Systems - Volume 1: PRIS, (ICEIS 2005)},
year={2005},
pages={43-51},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002568800430051},
isbn={972-8865-28-7},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 5th International Workshop on Pattern Recognition in Information Systems - Volume 1: PRIS, (ICEIS 2005)
TI - Inductive String Template-Based Learning of Spoken Language
SN - 972-8865-28-7
AU - Gutkin A.
AU - King S.
PY - 2005
SP - 43
EP - 51
DO - 10.5220/0002568800430051