segmentation. In Proceedings of the 44th IEEE Mid-
west Symposium on Circuits and Systems.
Bengio, Y. (2009). Learning deep architectures for AI.
Foundations and Trends in Machine Learning, 2(1):1–
127.
Bengio, Y., Lamblin, P., Popovici, D., and Larochelle, H.
(2007). Greedy layer-wise training of deep networks.
Advances in neural information processing systems,
19:153.
Bishop, C. M. (2006). Pattern recognition and machine
learning. Springer.
Cho, K. (2011). Improved learning algorithms for restricted
Boltzmann machines. Master’s thesis, School
of Science, Aalto University.
Erhan, D., Bengio, Y., Courville, A., Manzagol, P.-A., Vin-
cent, P., and Bengio, S. (2010). Why does unsuper-
vised pre-training help deep learning? The Journal of
Machine Learning Research, 11:625–660.
Estevan, Y. P., Wan, V., and Scharenborg, O. (2007). Find-
ing maximum margin segments in speech. In Proceed-
ings of the IEEE International Conference on Acous-
tics, Speech, and Signal Processing (ICASSP).
Finster, H. (1992). Automatic speech segmentation us-
ing neural network and phonetic transcription. In
Proceedings of the International Joint Conference on
Neural Networks (IJCNN).
Fischer, A. and Igel, C. (2012). An introduction to restricted
Boltzmann machines. In Progress in Pattern Recogni-
tion, Image Analysis, Computer Vision, and Applica-
tions, pages 14–36. Springer.
Fisher, W. M., Doddington, G. R., and Goudie-Marshall,
K. M. (1986). The DARPA speech recognition re-
search database: specifications and status. In Proceed-
ings of the DARPA Workshop on Speech Recognition.
Halberstadt, A. K. (1998). Heterogeneous Acoustic Mea-
surements and Multiple Classifiers for Speech Recog-
nition. PhD thesis, Massachusetts Institute of Tech-
nology, MIT.
Hinton, G. E., Osindero, S., and Teh, Y.-W. (2006). A fast
learning algorithm for deep belief nets. Neural com-
putation, 18(7):1527–1554.
Hoffmann, S. and Pfister, B. (2010). Fully automatic seg-
mentation for prosodic speech corpora. In Proceed-
ings of Interspeech.
Keri, V. and Prahallad, K. (2010). A comparative study
of constrained and unconstrained approaches for seg-
mentation of speech signal. In Proceedings of the In-
ternational Conference on Spoken Language Process-
ing (ICSLP).
Krizhevsky, A. and Hinton, G. (2009). Learning multiple
layers of features from tiny images. Master’s the-
sis, Department of Computer Science, University of
Toronto.
Lee, K.-S. (2006). MLP-based phone boundary refining for
a TTS database. IEEE Transactions on Audio, Speech
and Language Processing, 14(3):981–989.
Malfrere, F., Deroo, O., and Dutoit, T. (1998). Phonetic
alignment: Speech synthesis based vs. hybrid
HMM/ANN. In Proceedings of the International Con-
ference on Spoken Language Processing (ICSLP).
Mohamed, A.-r., Dahl, G. E., and Hinton, G. (2012).
Acoustic modeling using deep belief networks. IEEE
Transactions on Audio, Speech, and Language Pro-
cessing, 20(1):14–22.
Räsänen, O., Laine, U., and Altosaar, T. (2011). Blind
segmentation of speech using non-linear filtering methods.
Speech Technologies, pages 105–124.
Räsänen, O. J., Laine, U. K., and Altosaar, T. (2009). An
improved speech segmentation quality measure: the
R-value. In Proceedings of Interspeech.
Sarkar, A. and Sreenivas, T. (2005). Automatic speech seg-
mentation using average level crossing rate informa-
tion. In Proceedings of the IEEE International Con-
ference on Acoustics, Speech, and Signal Processing
(ICASSP).
Sharma, M. and Mammone, R. (1996). ‘Blind’ speech seg-
mentation: automatic segmentation of speech without
linguistic knowledge. In Proceedings of the Fourth In-
ternational Conference on Spoken Language Process-
ing (ICSLP).
Suh, Y. and Lee, Y. (1996). Phoneme segmentation of con-
tinuous speech using multi-layer perceptron. In Pro-
ceedings of the Fourth International Conference on
Spoken Language Processing (ICSLP).
ten Bosch, L. and Cranen, B. (2007). A computational
model for unsupervised word discovery. In Proceed-
ings of Interspeech.
Toledano, D. (2000). Neural network boundary refining for
automatic speech segmentation. In Proceedings of the
IEEE International Conference on Acoustics, Speech,
and Signal Processing (ICASSP).
van Vuuren, V. Z., ten Bosch, L., and Niesler, T. (2013). A
dynamic programming framework for neural network-
based automatic speech segmentation. In Proceedings
of Interspeech.
Wang, D., Lu, L., and Zhang, H.-J. (2003). Speech segmen-
tation without speech recognition. In Proceedings of
the International Conference on Multimedia and Expo
(ICME).
ICPRAM 2015 - International Conference on Pattern Recognition Applications and Methods