LOW COMPLEXITY, LOW DELAY AND SCALABLE AUDIO CODING SCHEME BASED ON A NOVEL STATISTICAL PERCEPTUAL QUANTIZATION PROCEDURE

César Alonso Abad, Miguel Ángel Martín Fernández, Carlos Alberola López

2007

Abstract

In this paper we present Fast Perceptual Quantization (FPQ), a novel procedure to quantize and code audio signals. It employs the same psychoacoustic principles used in the popular MPEG/Audio coders, but substantially reduces the complexity and computational cost of the encoding process. FPQ defines a hierarchy of privileged quantization values and uses the masking threshold calculated by a psychoacoustic model to quantize the real values to the privileged ones whenever possible. The computational cost of this process is very low compared to the quantization/coding loops of MP3 or AAC. Experimental results show that nearly transparent coding can be achieved with as few as approximately 100 quantization values. This leads to very efficient bit compaction using Huffman or arithmetic coding, so that nearly state-of-the-art performance can be achieved in terms of the quality/bit-rate trade-off. Since the quantization and coding (bit compaction) procedures are completely independent here, efficient scalable decoding can be achieved either by parsing and entropy re-encoding the original quantized values or by coding the bit-planes independently and sorting them in order of perceptual significance. Very low delay is also achievable, which makes the proposed coding scheme suitable for real-time applications.
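The idea of snapping values to privileged levels under a masking threshold can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify the hierarchy, so the `steps` grid (multiples of 16, 8, 4, 2, 1), the function names `quantize_privileged` and `quantize_frame`, and the per-coefficient `thresholds` input are all hypothetical choices made for illustration. The sketch only shows the general principle that a larger masking threshold allows a coarser, more privileged value to be chosen.

```python
from typing import List, Sequence


def quantize_privileged(x: float, threshold: float,
                        steps: Sequence[int] = (16, 8, 4, 2, 1)) -> int:
    """Snap x to the most privileged value whose error is masked.

    `steps` is an assumed hierarchy: multiples of 16 are the most
    privileged values, then multiples of 8, and so on down to integers.
    """
    for step in steps:  # try the coarsest (most privileged) grid first
        candidate = step * round(x / step)
        if abs(candidate - x) <= threshold:
            return candidate
    # No privileged value lies within the threshold: use plain rounding.
    return round(x)


def quantize_frame(coeffs: Sequence[float],
                   thresholds: Sequence[float]) -> List[int]:
    """Quantize each coefficient against its own masking threshold."""
    return [quantize_privileged(c, t) for c, t in zip(coeffs, thresholds)]


if __name__ == "__main__":
    # Hypothetical coefficients and masking thresholds, for illustration only.
    coeffs = [37.3, -12.9, 5.2, 100.4]
    thresholds = [6.0, 2.0, 0.1, 10.0]
    print(quantize_frame(coeffs, thresholds))
```

Because each coefficient needs only a handful of comparisons, a scheme of this kind avoids iterative rate/distortion loops, and concentrating the output on a small set of values is what makes a subsequent entropy coder effective.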



Paper Citation


in Harvard Style

Alonso Abad C., Martín Fernández M. Á. and Alberola López C. (2007). LOW COMPLEXITY, LOW DELAY AND SCALABLE AUDIO CODING SCHEME BASED ON A NOVEL STATISTICAL PERCEPTUAL QUANTIZATION PROCEDURE. In Proceedings of the Second International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2007) ISBN 978-989-8111-13-5, pages 31-34. DOI: 10.5220/0002140100310034


in Bibtex Style

@conference{sigmap07,
author={César Alonso Abad and Miguel Ángel Martín Fernández and Carlos Alberola López},
title={LOW COMPLEXITY, LOW DELAY AND SCALABLE AUDIO CODING SCHEME BASED ON A NOVEL STATISTICAL PERCEPTUAL QUANTIZATION PROCEDURE},
booktitle={Proceedings of the Second International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2007)},
year={2007},
pages={31-34},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002140100310034},
isbn={978-989-8111-13-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Second International Conference on Signal Processing and Multimedia Applications - Volume 1: SIGMAP, (ICETE 2007)
TI - LOW COMPLEXITY, LOW DELAY AND SCALABLE AUDIO CODING SCHEME BASED ON A NOVEL STATISTICAL PERCEPTUAL QUANTIZATION PROCEDURE
SN - 978-989-8111-13-5
AU - Alonso Abad C.
AU - Martín Fernández M. Á.
AU - Alberola López C.
PY - 2007
SP - 31
EP - 34
DO - 10.5220/0002140100310034