CHARACTER RECOGNITION IN NATURAL IMAGES
Teófilo E. de Campos, Bodla Rakesh Babu, Manik Varma
2009
Abstract
This paper tackles the problem of recognizing characters in images of natural scenes. In particular, we focus on recognizing characters in situations that would traditionally not be handled well by OCR techniques. We present an annotated database of images containing English and Kannada characters. The database comprises images of street scenes taken in Bangalore, India, using a standard camera. The problem is addressed in an object categorization framework based on a bag-of-visual-words representation. We assess the performance of various features based on nearest neighbour and SVM classification. We demonstrate that the proposed method, using as few as 15 training images, can far outperform commercial OCR systems. Furthermore, the method can benefit from synthetically generated training data, obviating the need for expensive data collection and annotation.
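To make the bag-of-visual-words idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes each character image has already been reduced to a set of local feature descriptors (random 2-D vectors stand in for real descriptors in the usage note). The function names, the plain k-means codebook, and the chi-squared histogram distance are all illustrative assumptions; the abstract itself does not specify them.

```python
import numpy as np

def kmeans(descriptors, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns k cluster centres (the visual words)."""
    rng = np.random.default_rng(seed)
    # Initialise centres from k distinct descriptors (fancy indexing copies).
    centres = descriptors[rng.choice(len(descriptors), size=k, replace=False)]
    for _ in range(iters):
        # Assign every descriptor to its closest centre.
        dists = np.linalg.norm(descriptors[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):  # keep the old centre if a cluster empties
                centres[j] = members.mean(axis=0)
    return centres

def bow_histogram(descriptors, centres):
    """Quantise descriptors against the codebook; return an L1-normalised histogram."""
    dists = np.linalg.norm(descriptors[:, None, :] - centres[None, :, :], axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(centres)).astype(float)
    return hist / hist.sum()

def chi2_distance(h1, h2, eps=1e-10):
    """Chi-squared distance between two histograms (an assumed choice of metric)."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def nearest_neighbour(query_hist, train_hists, train_labels):
    """Return the label of the training histogram closest to the query."""
    d = [chi2_distance(query_hist, h) for h in train_hists]
    return train_labels[int(np.argmin(d))]
```

In use, one would pool the descriptors of all training images, call `kmeans` to build the codebook, convert every image to a histogram with `bow_histogram`, and classify a new image with `nearest_neighbour`. An SVM could replace the last step; that variant is not shown here.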
Paper Citation
in Harvard Style
E. de Campos T., Rakesh Babu B. and Varma M. (2009). CHARACTER RECOGNITION IN NATURAL IMAGES. In Proceedings of the Fourth International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2009) ISBN 978-989-8111-69-2, pages 273-280. DOI: 10.5220/0001770102730280
in Bibtex Style
@conference{visapp09,
author={Teófilo E. de Campos and Bodla Rakesh Babu and Manik Varma},
title={CHARACTER RECOGNITION IN NATURAL IMAGES},
booktitle={Proceedings of the Fourth International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2009)},
year={2009},
pages={273-280},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001770102730280},
isbn={978-989-8111-69-2},
}
in EndNote Style
TY - CONF
JO - Proceedings of the Fourth International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2009)
TI - CHARACTER RECOGNITION IN NATURAL IMAGES
SN - 978-989-8111-69-2
AU - E. de Campos T.
AU - Rakesh Babu B.
AU - Varma M.
PY - 2009
SP - 273
EP - 280
DO - 10.5220/0001770102730280