Authors: Petr Pollak and Martin Behunek

Affiliation: Czech Technical University in Prague, Czech Republic

Keyword(s): Speech recognition, MPEG compression, MP3, Noise robustness, Channel distortion.

Related Ontology Subjects/Areas/Topics: Audio and Video Quality Assessment ; MPEG Standards and Related Issues ; Multimedia ; Multimedia and Communications ; Multimedia Databases, Indexing, Recognition and Retrieval ; Multimedia Systems and Applications ; Telecommunications

Abstract: This paper studies speech recognition accuracy with respect to different levels of MP3 compression. Special attention is paid to the processing of speech signals of different quality, i.e. with different levels of background noise and channel distortion. The work was motivated by the possible use of ASR for off-line automatic transcription of audio recordings collected by standard, widespread MP3 devices. The experiments showed that although the MP3 format is not optimal for speech compression, it does not distort speech significantly, especially at high or moderate bit rates and with high-quality source data. Consequently, the accuracy of connected-digit ASR decreased very slowly down to a bit rate of 24 kbps. In the best case, PLP parameterization in the close-talk channel, only a 3% decrease in recognition accuracy was observed while the compressed file was approximately 10% of the original size. All results were slightly worse in the presence of additive background noise and channel distortion in the signal, but the achieved accuracy was still acceptable in this case, especially for PLP features.
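
The relation between MP3 bit rate and file size mentioned in the abstract can be illustrated with a short calculation. This is only a sketch: the abstract does not state the format of the uncompressed recordings, so the 16 kHz, 16-bit, mono PCM source assumed below is a guess, and the function name `compressed_size_ratio` is hypothetical.

```python
def compressed_size_ratio(bitrate_kbps, sample_rate_hz=16000,
                          bits_per_sample=16, channels=1):
    """Return the compressed size as a fraction of the uncompressed PCM size.

    The default PCM parameters (16 kHz, 16-bit, mono) are an assumption
    for illustration; they are not taken from the paper.
    """
    # Bit rate of the uncompressed PCM stream in bits per second.
    pcm_bitrate_bps = sample_rate_hz * bits_per_sample * channels
    # For a constant-bit-rate stream, the size ratio equals the bit-rate ratio.
    return (bitrate_kbps * 1000) / pcm_bitrate_bps


if __name__ == "__main__":
    for kbps in (64, 32, 24, 16):
        ratio = compressed_size_ratio(kbps)
        print(f"{kbps:>3} kbps -> {ratio:.1%} of the assumed PCM size")
```

Under these assumed source parameters, 24 kbps corresponds to roughly 9.4% of the uncompressed size, which is in line with the approximately 10% quoted in the abstract. A different sampling rate or bit depth would change the ratio accordingly.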

CC BY-NC-ND 4.0

Paper citation in several formats:
Pollak, P. and Behunek, M. (2011). ACCURACY OF MP3 SPEECH RECOGNITION UNDER REAL-WORD CONDITIONS - Experimental Study. In Proceedings of the International Conference on Signal Processing and Multimedia Applications (ICETE 2011) - SIGMAP; ISBN 978-989-8425-72-0, SciTePress, pages 5-10. DOI: 10.5220/0003512600050010

@conference{sigmap11,
author={Petr Pollak and Martin Behunek},
title={ACCURACY OF MP3 SPEECH RECOGNITION UNDER REAL-WORD CONDITIONS - Experimental Study},
booktitle={Proceedings of the International Conference on Signal Processing and Multimedia Applications (ICETE 2011) - SIGMAP},
year={2011},
pages={5-10},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003512600050010},
isbn={978-989-8425-72-0},
}

TY - CONF

JO - Proceedings of the International Conference on Signal Processing and Multimedia Applications (ICETE 2011) - SIGMAP
TI - ACCURACY OF MP3 SPEECH RECOGNITION UNDER REAL-WORD CONDITIONS - Experimental Study
SN - 978-989-8425-72-0
AU - Pollak, P.
AU - Behunek, M.
PY - 2011
SP - 5
EP - 10
DO - 10.5220/0003512600050010
PB - SciTePress