A Holistic Method to Recognize Characters in Natural Scenes

Muhammad Ali, Hassan Foroosh

Abstract

Local features such as Histograms of Oriented Gradients (HoG) and Shape Contexts (SC) are commonly used by the research community for text recognition in natural scene images. The main issue with this approach is the ad hoc rasterization of the feature vector, which can disturb global structural and spatial correlations while the feature vector is constructed. Moreover, such approaches generally do not take rotational invariance into account, which often leads to failed recognition when characters appear rotated in scene images. To address the local feature dependency and rotation problems, we propose a novel holistic feature based on the active contour model, also known as snakes. Our feature vector is based on two variables, direction and distance, cumulatively traversed by each point as the initial circular contour evolves under the force field induced by the image. The initial contour design, in conjunction with a cross-correlation-based similarity metric, enables us to account for rotational variance in the character image. We use various datasets, including synthetic and natural scene character datasets such as Chars74K-Font, Chars74K-Image, and ICDAR2003, to compare the results of our approach with several baseline methods, and we show better performance than methods based on local features (e.g., HoG). Our leave-random-one-out cross-validation yields even better recognition performance, justifying our holistic approach to character recognition.
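
The role of the cross-correlation-based similarity metric can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each character is described by a per-point feature array of shape (N, C), sampled in order around the initial circular contour (for example C = 2 for direction and distance), and that an in-plane rotation of the character shows up approximately as a circular shift of that array. The function name `circular_xcorr_similarity` and the normalization are hypothetical choices made for illustration.

```python
import numpy as np


def circular_xcorr_similarity(f, g):
    """Peak of the normalized circular cross-correlation of two contour features.

    f, g: arrays of shape (N,) or (N, C), one row per contour point
    (hypothetical layout, assumed here for illustration).
    Returns (score, shift), where shift is the circular offset of g
    relative to f at which the two features agree best.
    """
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    if f.ndim == 1:
        f = f[:, None]
    if g.ndim == 1:
        g = g[:, None]
    if f.shape != g.shape:
        raise ValueError("features must have the same shape")

    # Remove the per-channel mean so constant offsets do not dominate.
    f = f - f.mean(axis=0)
    g = g - g.mean(axis=0)
    denom = np.linalg.norm(f) * np.linalg.norm(g)
    if denom == 0.0:
        return 0.0, 0

    # corr[k] = sum over n and channels of f[n] * g[(n + k) % N],
    # computed for all k at once through the FFT.
    spectrum = np.conj(np.fft.fft(f, axis=0)) * np.fft.fft(g, axis=0)
    corr = np.fft.ifft(spectrum, axis=0).real.sum(axis=1) / denom

    shift = int(np.argmax(corr))
    return float(corr[shift]), shift


# Usage: a feature and a circularly shifted copy of it (a stand-in for a
# rotated character) are compared.
rng = np.random.default_rng(0)
feature = rng.normal(size=(64, 2))
rotated = np.roll(feature, 10, axis=0)
score, shift = circular_xcorr_similarity(feature, rotated)
```

Taking the maximum over all circular shifts is what makes the comparison insensitive to where on the contour the sampling starts, which is one plausible reading of how a circular initial contour and a cross-correlation metric together account for rotation.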



Paper Citation


in Harvard Style

Ali M. and Foroosh H. (2016). A Holistic Method to Recognize Characters in Natural Scenes. In Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016) ISBN 978-989-758-175-5, pages 449-457. DOI: 10.5220/0005787904490457


in Bibtex Style

@conference{visapp16,
author={Muhammad Ali and Hassan Foroosh},
title={A Holistic Method to Recognize Characters in Natural Scenes},
booktitle={Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016)},
year={2016},
pages={449-457},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005787904490457},
isbn={978-989-758-175-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2016)
TI - A Holistic Method to Recognize Characters in Natural Scenes
SN - 978-989-758-175-5
AU - Ali M.
AU - Foroosh H.
PY - 2016
SP - 449
EP - 457
DO - 10.5220/0005787904490457