algorithms. This shows that, in general, our method
can be applied to other databases with similar
features.
Figure 5: (left) Zoom into another sample document
from a different database, (centre) the bi-level image
generated by the Silva-Lins-Rocha algorithm and (right)
the one produced by our method.
5 CONCLUSIONS
This paper presented a method for the binarization of
images of historical documents. The method uses two
Tsallis entropy-based binarization algorithms, both
presented herein. The first algorithm finds an initial
cut-off value. Whenever this value is greater than the
most frequent colour of the image (which is assumed to
belong to the paper, not to the ink), the second
algorithm is executed to generate a new threshold
value, which yields a better quality bi-level image.
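The control flow summarized above can be sketched as follows. This is only an illustrative outline, not the implementation used in this work: the two Tsallis entropy-based algorithms are passed in as hypothetical callables (`first_algorithm`, `second_algorithm`), the image is assumed to be an 8-bit grayscale NumPy array, and the convention that pixels at or below the threshold are ink is an assumption.

```python
import numpy as np

def most_frequent_gray(image):
    """Return the gray level that occurs most often in an 8-bit image."""
    hist = np.bincount(image.ravel(), minlength=256)
    return int(np.argmax(hist))

def two_stage_threshold(image, first_algorithm, second_algorithm):
    """Run the first algorithm; refine with the second one only when the
    initial cut-off is above the most frequent gray level (assumed paper)."""
    # first_algorithm and second_algorithm are hypothetical stand-ins for
    # the two Tsallis entropy-based algorithms; their signatures are assumed.
    threshold = first_algorithm(image)
    if threshold > most_frequent_gray(image):
        threshold = second_algorithm(image, threshold)
    return threshold

def binarize(image, threshold):
    """Assumed convention: pixels <= threshold become ink (0), others paper (255)."""
    return np.where(image <= threshold, 0, 255).astype(np.uint8)
```

Passing the algorithms as arguments keeps the sketch independent of how each threshold is actually computed.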
Our method was evaluated in terms of precision,
recall, accuracy, specificity, PSNR and MSE, measured
against a manually generated gold standard image, and
compared with the results of several other
entropy-based binarization algorithms. It reached the
best results for every one of these measures. It was
also applied to images from different databases, again
producing high quality images.
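For reference, the six measures listed above can be computed from a binarized image and its gold standard roughly as follows. This is a minimal sketch using the standard definitions, assuming both images are 8-bit arrays with ink encoded as 0 and paper as 255, and treating ink as the positive class; the function name `evaluate` and these conventions are illustrative assumptions, not taken from the experiments reported here.

```python
import numpy as np

def _ratio(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")

def evaluate(result, gold, ink=0):
    """Compare a bi-level image with a gold standard (ink is the positive class)."""
    result = np.asarray(result)
    gold = np.asarray(gold)
    pred_ink = result == ink
    true_ink = gold == ink
    tp = int(np.sum(pred_ink & true_ink))    # ink correctly detected
    fp = int(np.sum(pred_ink & ~true_ink))   # paper marked as ink
    fn = int(np.sum(~pred_ink & true_ink))   # ink marked as paper
    tn = int(np.sum(~pred_ink & ~true_ink))  # paper correctly detected
    diff = result.astype(np.float64) - gold.astype(np.float64)
    mse = float(np.mean(diff ** 2))
    # PSNR for 8-bit images; infinite when the two images are identical.
    psnr = float("inf") if mse == 0 else float(10.0 * np.log10(255.0 ** 2 / mse))
    return {
        "precision": _ratio(tp, tp + fp),
        "recall": _ratio(tp, tp + fn),
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
        "specificity": _ratio(tn, tn + fp),
        "mse": mse,
        "psnr": psnr,
    }
```

Higher values are better for every measure except MSE, for which lower is better.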
ACKNOWLEDGMENTS
This research has been partially supported by CNPq,
University of Pernambuco, Universidad Rey Juan
Carlos and Agencia Española de Cooperación
Internacional (AECI) under contract no. A/2948/05.
REFERENCES
PROHIST: http://recpad.dsc.upe.br/prohist
Bottou, L. et al., 1998. High Quality Document Image
Compression with DjVu. Journal of Electronic
Imaging, 410–425, SPIE (also: http://www.djvu.org).
Kapur, J.N., 1994. Measures of Information and their
Applications, J.Wiley & Sons.
Kapur, J.N. et al., 1985. A New Method for Gray-Level
Picture Thresholding using the Entropy of the
Histogram, Comp Vision, Graphics and Image Proc.,
Vol 29, no 3.
Li, C.H. and Lee, C.K., 1993. Minimum Cross Entropy
Thresholding, Pattern Recognition, vol. 26, no 4.
Macmillan, N.A. and Creelman, C.D., 2005. Detection
Theory. LEA Publishing.
Oliveira, A.L.I. et al., 2006. Optical Digit Recognition for
Images of Handwritten Historical Documents,
Brazilian Symposium of Neural Networks, p.29,
Brazil.
Mello, C.A.B. et al., 2006. Image Thresholding of
Historical Documents: Application to the Joaquim
Nabuco's File, Eva Vienna, p. 115-122, Austria.
Mello, C.A.B. and Lins, R.D., 2000. Image Segmentation
of Historical Documents, Visual 2000, Mexico.
Parker, J.R., 1997. Algorithms for Image Processing and
Computer Vision. John Wiley & Sons.
Pun, T., 1981. Entropic Thresholding, A New Approach,
Computer Graphics and Image Processing, vol. 16.
Sahoo, P. et al., 1997. Threshold Selection using Renyi’s
Entropy, Pattern Recognition, vol. 30, no 1.
Sezgin, M. and Sankur, B., 2004. Survey over image
thresholding techniques and quantitative performance
evaluation, J. of Electronic Imaging, vol. 13, no. 1,
pp. 146-165.
Shannon, C., 1948. A Mathematical Theory of
Communication, Bell System Technical Journal, vol.
27, pp. 379-423, 623-656.
Silva, J.M. et al., 2006. Binarizing and filtering historical
documents with back-to-front interference,
Proceedings of the ACM SAC, France.
Tsallis, C., 1988. Possible Generalization of Boltzmann-
Gibbs statistics, J. of Statistical Physics, vol. 52, nos.
1-2, pp. 479-487.
Wu, L. et al., 1998. An Effective Entropic thresholding for
Ultrasonic Images, International Conference on
Pattern Recognition, pp 1552-1554, Australia.
Yan, L. et al., 2006. An Application of Tsallis Entropy
Minimum Difference on Image Segmentation, World
Congress on Intelligent Control and Automation, pp.
9557-9561, China.