ON THE CONTRIBUTION OF COMPRESSION TO VISUAL PATTERN RECOGNITION
Gunther Heidemann, Helge Ritter
2008
Abstract
Most pattern recognition problems are solved by highly task-specific algorithms. However, all recognition and classification architectures are related in at least one aspect: they rely on compressed representations of the input. It is therefore an interesting question how much compression itself contributes to the pattern recognition process. The question has been answered by Benedetto et al. (2002) for the domain of text, where a common compression program (gzip) is capable of language recognition and authorship attribution. The underlying principle is estimating the mutual information from the obtained compression factor. Here we show that compression achieves astonishingly high recognition rates even for far more complex tasks: visual object recognition, texture classification, and image retrieval. Although specialized recognition algorithms naturally still outperform compressors, our results are remarkable, since none of the applied compression programs (gzip, bzip2) was ever designed to solve this type of task. Compression is the only known method that solves such a wide variety of tasks without any modification, data preprocessing, feature extraction, or even parametrization. We conclude that compression can be seen as the “core” of a yet-to-be-developed theory of unified pattern recognition.
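The principle of estimating mutual information from compression can be illustrated with a minimal sketch. The abstract does not give the exact estimator, so the formula used here, C(a) + C(b) - C(ab) with C the compressed length in bytes, is an assumption and not necessarily the authors' procedure. The function names `compressed_size`, `mutual_information_estimate`, and `classify` are hypothetical, and Python's `zlib` (the DEFLATE method that gzip uses) and `bz2` modules stand in for the gzip and bzip2 programs.

```python
import bz2
import zlib
from typing import Mapping


def compressed_size(data: bytes, method: str = "zlib") -> int:
    """Return the length in bytes of `data` after lossless compression."""
    if method == "zlib":
        return len(zlib.compress(data, 9))
    if method == "bz2":
        return len(bz2.compress(data, 9))
    raise ValueError(f"unknown method: {method}")


def mutual_information_estimate(a: bytes, b: bytes, method: str = "zlib") -> int:
    """Assumed estimator: bytes saved by compressing a and b jointly.

    If b repeats structure already seen in a, the compressor can encode
    the concatenation a + b more cheaply than the two parts separately.
    """
    separate = compressed_size(a, method) + compressed_size(b, method)
    joint = compressed_size(a + b, method)
    return separate - joint


def classify(query: bytes, references: Mapping[str, bytes], method: str = "zlib") -> str:
    """Return the label of the reference sharing the most information with `query`."""
    return max(
        references,
        key=lambda label: mutual_information_estimate(query, references[label], method),
    )
```

For images, `query` and the reference entries would be raw pixel buffers, passed in without preprocessing or feature extraction. Note that DEFLATE only looks back over a 32 KB window, so with `zlib` the shared structure is only found when the inputs are small enough to fall within that window.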
References
- Ball, P. (2002). Algorithm makes tongue tree. Nature Science Update.
- Benedetto, D., Caglioti, E., and Loreto, V. (2002). Language Trees and Zipping. Physical Review Letters, 88(4).
- Benedetto, D., Caglioti, E., and Loreto, V. (2003). Zipping out relevant information. Computing in Science and Engineering, 5:80-85.
- Burrows, M. and Wheeler, D. J. (1994). A Block-sorting Lossless Data Compression Algorithm. Research Report 124, Digital Systems Research Center.
- Cho, A. (2002). Reading the Bits of Shakespeare. ScienceNOW.
- Corel (1997). Corel GALLERY™ Magic 65000. Corel Corp., 1600 Carling Ave., Ottawa, Ontario, Canada K1Z 8R7.
- Cover, T. M. and Thomas, J. A. (1991). Elements of Information Theory. Wiley, New York.
- Erdogmus, D., Hild, K. E., Rao, Y. N., and Príncipe, J. C. (2004). Minimax Mutual Information Approach for Independent Component Analysis. Neural Computation, 16(6):1235-1252.
- Hirschberg, D. S. and Lelewer, D. A. (1990). Efficient Decoding of Prefix Codes. Communications of the ACM, 33(4):449-459.
- Hulle, M. M. V. (2002). Joint Entropy Maximization in Kernel-Based Topographic Maps. Neural Computation, 14(8):1887-1906.
- Imaoka, H. and Okajima, K. (2004). An Algorithm for the Detection of Faces on the Basis of Gabor Features and Information Maximization. Neural Computation, 16(6):1163-1191.
- Kanungo, T., Dom, B., Niblack, W., and Steele, D. (1994). A fast algorithm for MDL-based multi-band image segmentation. In Proc. Conf. Computer Vision and Pattern Recognition CVPR.
- Keeler, A. (1990). Minimal length encoding of planar subdivision topologies with application to image segmentation. In AAAI 1990 Spring Symposium of the Theory and Application of Minimal Length Encoding.
- Khmelev, D. V. and Teahan, W. J. (2003). Comment on “Language Trees and Zipping”. Physical Review Letters, 90(8):089803-1.
- Laaksonen, J. T., Koskela, J. M., Laakso, S. P., and Oja, E. (2000). PicSOM - Content-Based Image Retrieval with Self-Organizing Maps. Pattern Recognition Letters, 21(13-14):1199-1207.
- Leclerc, Y. G. (1989). Constructing simple stable descriptions for image partitioning. Int'l J. of Computer Vision, 3:73-102.
- Lempel, A. and Ziv, J. (1977). A Universal Algorithm for Sequential Data Compression. IEEE Trans. Inf. Th., 23(3):337-343.
- Murase, H. and Nayar, S. K. (1995). Visual Learning and Recognition of 3-D Objects from Appearance. Int'l J. of Computer Vision, 14:5-24.
- Nene, S. A., Nayar, S. K., and Murase, H. (1996). Columbia Object Image Library: COIL-100. Technical Report CUCS-006-96, Dept. Computer Science, Columbia Univ.
- Paulus, D., Ahlrichs, U., Heigl, B., Denzler, J., Hornegger, J., Zobel, M., and Niemann, H. (2000). Active Knowledge-Based Scene Analysis. Videre, 1(4).
- Picard, R., Graczyk, C., Mann, S., Wachman, J., Picard, L., and Campbell, L. (1995). Vision Texture Database (VisTex). Copyright 1995 by the Massachusetts Institute of Technology.
- Rissanen, J. (1978). Modeling by Shortest Data Description. Automatica, 14:465-471.
- Rui, Y., Huang, T. S., and Chang, S.-F. (1999). Image Retrieval: Current Techniques, Promising Directions and Open Issues. J. of Visual Communications and Image Representation, 10:1-23.
- Sinkkonen, J. and Kaski, S. (2002). Clustering Based on Conditional Distributions in an Auxiliary Space. Neural Computation, 14(1):217-239.
- Smeulders, A. W. M., Worring, M., Santini, S., Gupta, A., and Jain, R. (2000). Content-Based Image Retrieval at the End of the Early Years. IEEE Trans. on Pattern Analysis and Machine Intelligence, 22(12):1349-1380.
- Tarr, M. J. and Bülthoff, H. H. (1998). Image-Based Object Recognition in Man, Monkey and Machine. Cognition, 67:1-20.
- Vitanyi, P. M. B. and Li, M. (1996). Ideal MDL and its Relation to Bayesianism. In Proc. ISIS: Information, Statistics and Induction in Science, pages 282-291, Singapore. World Scientific.
- Wyner, A. D. (1994). 1994 Shannon Lecture. Typical Sequences and All That: Entropy, Pattern Matching, and Data Compression. AT&T Bell Laboratories, Murray Hill, New Jersey, USA.
Paper Citation
in Harvard Style
Heidemann G. and Ritter H. (2008). ON THE CONTRIBUTION OF COMPRESSION TO VISUAL PATTERN RECOGNITION. In Proceedings of the Third International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2008) ISBN 978-989-8111-21-0, pages 83-89. DOI: 10.5220/0001078000830089
in Bibtex Style
@conference{visapp08,
author={Gunther Heidemann and Helge Ritter},
title={ON THE CONTRIBUTION OF COMPRESSION TO VISUAL PATTERN RECOGNITION},
booktitle={Proceedings of the Third International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2008)},
year={2008},
pages={83-89},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001078000830089},
isbn={978-989-8111-21-0},
}
in EndNote Style
TY - CONF
JO - Proceedings of the Third International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2008)
TI - ON THE CONTRIBUTION OF COMPRESSION TO VISUAL PATTERN RECOGNITION
SN - 978-989-8111-21-0
AU - Heidemann G.
AU - Ritter H.
PY - 2008
SP - 83
EP - 89
DO - 10.5220/0001078000830089