Authors: Van Zyl van Vuuren (1); Louis ten Bosch (2) and Thomas Niesler (1)

Affiliations: (1) University of Stellenbosch, South Africa; (2) Radboud University, Netherlands

Keyword(s): Unconstrained Automatic Speech Segmentation, Deep Neural Networks, Generative Pre-training.

Related Ontology Subjects/Areas/Topics: Applications ; Artificial Intelligence ; Audio and Speech Processing ; Biomedical Engineering ; Biomedical Signal Processing ; Computational Intelligence ; Digital Signal Processing ; Health Engineering and Technology Applications ; Human-Computer Interaction ; Methodologies and Methods ; Multimedia ; Multimedia Signal Processing ; Neural Networks ; Neurocomputing ; Neurotechnology, Electronics and Informatics ; Pattern Recognition ; Physiological Computing Systems ; Sensor Networks ; Signal Processing ; Soft Computing ; Software Engineering ; Telecommunications ; Theory and Methods

Abstract: We propose a method for improving the unconstrained segmentation of speech into phoneme-like units using deep neural networks. The proposed approach is not dependent on acoustic models or forced alignment, but operates using the acoustic features directly. Previous solutions of this type were plagued by the tendency to hypothesise additional incorrect phoneme boundaries near the phoneme transitions. We show that the application of deep neural networks is able to reduce this over-segmentation substantially, and achieve improved segmentation accuracies. Furthermore, we find that generative pre-training offers an additional benefit.

CC BY-NC-ND 4.0

Paper citation in several formats:
Zyl van Vuuren, V.; ten Bosch, L. and Niesler, T. (2015). Unconstrained Speech Segmentation using Deep Neural Networks. In Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM; ISBN 978-989-758-076-5; ISSN 2184-4313, SciTePress, pages 248-254. DOI: 10.5220/0005201802480254

@conference{icpram15,
author={Van {Zyl van Vuuren} and Louis {ten Bosch} and Thomas Niesler},
title={Unconstrained Speech Segmentation using Deep Neural Networks},
booktitle={Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2015},
pages={248--254},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005201802480254},
isbn={978-989-758-076-5},
issn={2184-4313},
}

TY - CONF

JO - Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Unconstrained Speech Segmentation using Deep Neural Networks
SN - 978-989-758-076-5
IS - 2184-4313
AU - Zyl van Vuuren, V.
AU - ten Bosch, L.
AU - Niesler, T.
PY - 2015
SP - 248
EP - 254
DO - 10.5220/0005201802480254
PB - SciTePress