Automatic Text Localisation in Scanned Comic Books

Christophe Rigaud, Dimosthenis Karatzas, Joost van de Weijer, Jean-Christophe Burie, Jean-Marc Ogier


Comic books constitute an important cultural heritage asset in many countries. Digitization combined with subsequent document understanding enables direct content-based search, as opposed to metadata-only search (e.g. by album title or author name). Few studies have been done in this direction. In this work we detail a novel approach for automatic text localization in scanned comic book pages, an essential step towards fully automatic comic book understanding. We focus on speech text, as it is semantically important and represents the majority of the text present in comics. The approach is compared with existing text localization methods found in the literature, and results are presented.



Paper Citation

in Harvard Style

Rigaud C., Karatzas D., Weijer J., Burie J. and Ogier J. (2013). Automatic Text Localisation in Scanned Comic Books. In Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2013) ISBN 978-989-8565-47-1, pages 814-819. DOI: 10.5220/0004301308140819

in Bibtex Style

author={Christophe Rigaud and Dimosthenis Karatzas and Joost van de Weijer and Jean-Christophe Burie and Jean-Marc Ogier},
title={Automatic Text Localisation in Scanned Comic Books},
booktitle={Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2013)},

in EndNote Style

JO - Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2013)
TI - Automatic Text Localisation in Scanned Comic Books
SN - 978-989-8565-47-1
AU - Rigaud C.
AU - Karatzas D.
AU - Weijer J.
AU - Burie J.
AU - Ogier J.
PY - 2013
SP - 814
EP - 819
DO - 10.5220/0004301308140819