Vowel-consonant Speech Segmentation by Neuromorphic Units

Pedro Gómez-Vilda, Roberto Fernández-Baíllo, Victoria Rodellar-Biarge, José Manuel Ferrández-Vicente

2011

Abstract

Speech remains a highly complex process that is far from fully understood. To gain insight into specific open problems in its automatic treatment (recognition, synthesis, diarization, segmentation, etc.), neuromorphic approaches and knowledge of how the Auditory System operates may be of crucial importance. The present paper is one of a series of preliminary works that attempt to translate some of this understanding into solutions for specific tasks, such as speech segmentation and labelling, in a way that parallels the neural resources found in the Auditory Pathways and Cortex. The bio-inspired (neuromorphic) design of some elementary units covering simple tasks, such as formant tracking or formant dynamics, is presented. In a further step, it is shown how simple neural circuits employing these units may achieve successful vowel-consonant separation independently of the speaker. The paper concludes with a discussion of how this processing may be used to develop specific applications, such as Speech Segmentation and Diarization and Speaker Characterization.



Paper Citation


in Harvard Style

Gómez-Vilda P., Fernández-Baíllo R., Rodellar-Biarge V. and Ferrández-Vicente J. (2011). Vowel-consonant Speech Segmentation by Neuromorphic Units. In Proceedings of the 1st International Workshop on AI Methods for Interdisciplinary Research in Language and Biology - Volume 1: BILC, (ICAART 2011) ISBN 978-989-8425-42-3, pages 14-27. DOI: 10.5220/0003306000140027


in Bibtex Style

@conference{bilc11,
author={Pedro Gómez-Vilda and Roberto Fernández-Baíllo and Victoria Rodellar-Biarge and José Manuel Ferrández-Vicente},
title={Vowel-consonant Speech Segmentation by Neuromorphic Units},
booktitle={Proceedings of the 1st International Workshop on AI Methods for Interdisciplinary Research in Language and Biology - Volume 1: BILC, (ICAART 2011)},
year={2011},
pages={14-27},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003306000140027},
isbn={978-989-8425-42-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 1st International Workshop on AI Methods for Interdisciplinary Research in Language and Biology - Volume 1: BILC, (ICAART 2011)
TI - Vowel-consonant Speech Segmentation by Neuromorphic Units
SN - 978-989-8425-42-3
AU - Gómez-Vilda P.
AU - Fernández-Baíllo R.
AU - Rodellar-Biarge V.
AU - Ferrández-Vicente J.
PY - 2011
SP - 14
EP - 27
DO - 10.5220/0003306000140027