image block will be integrated to improve the overall
detection performance of the system.
VISAPP 2017 - International Conference on Computer Vision Theory and Applications