TEXT DETECTION WITH CONVOLUTIONAL NEURAL NETWORKS

Manolis Delakis, Christophe Garcia

2008

Abstract

Text detection is an important preliminary step before text can be recognized in unconstrained image environments. We present an approach based on convolutional neural networks to detect and localize horizontal text lines from raw color pixels. Rather than relying on hand-crafted features, the network learns to extract and combine its own set of features. Learning is also used to localize the text lines precisely: the network is simply trained to reject badly-cut text, without any tedious knowledge-based post-processing. Although the network was trained with synthetic examples, experimental results demonstrate that it can outperform other methods on the real-world test set of ICDAR’03.


Paper Citation


in Harvard Style

Delakis M. and Garcia C. (2008). TEXT DETECTION WITH CONVOLUTIONAL NEURAL NETWORKS. In Proceedings of the Third International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2008), ISBN 978-989-8111-21-0, pages 290-294. DOI: 10.5220/0001079902900294


in Bibtex Style

@conference{visapp08,
author={Manolis Delakis and Christophe Garcia},
title={TEXT DETECTION WITH CONVOLUTIONAL NEURAL NETWORKS},
booktitle={Proceedings of the Third International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2008)},
year={2008},
pages={290-294},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001079902900294},
isbn={978-989-8111-21-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Third International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2008)
TI - TEXT DETECTION WITH CONVOLUTIONAL NEURAL NETWORKS
SN - 978-989-8111-21-0
AU - Delakis M.
AU - Garcia C.
PY - 2008
SP - 290
EP - 294
DO - 10.5220/0001079902900294