THRESHOLD CORRECTION OF DOCUMENT IMAGE
BINARIZATION FOR TEXT EXTRACTION
Hiroshi Tanaka, Yusaku Fujii and Yoshinobu Hotta
Fujitsu, 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki, Kanagawa 211-8588, Japan
Keywords: Adaptive binarization, Text extraction, Thresholding, Otsu binarization, Threshold correction, Background
noise, Niblack, Image resolution.
Abstract: This paper presents a simple threshold correction method for document image binarization aimed at
text extraction. The method improves the binary images of characters, which are often adversely affected by
neighboring strong pixels or background noise. It is based on a similar threshold correction method
previously applied to ruled-line extraction, and it is shown to be effective for text extraction as well.
The relationship between the effectiveness of the method and the image resolution is also examined.
1 INTRODUCTION
One of the most important objectives of document
image binarization is to extract text images from the
document background. In a simple document model,
each object is considered to be placed on the flat
surface of the document background. According to
this model, binarization can be considered as a two-
class discrimination problem for determining a
global threshold (Otsu, 1979). However,
complicated document images require adaptive
binarization, in which the local threshold is
calculated for each pixel. Such images have complex
designs that cannot be expressed using two
classes; moreover, they may be severely degraded.
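As an illustration of the two-class view, the following minimal Python sketch computes a global threshold by maximizing the between-class variance of the gray-level histogram, which is the criterion of Otsu (1979). The function name `otsu_threshold`, the 256-level assumption, and the toy pixel values are illustrative assumptions and are not taken from the method described in this paper.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Pixels <= t form the dark class; pixels > t form the bright class.
    `pixels` is a flat list of integer gray levels in [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    w0 = 0        # number of pixels in the dark class
    sum0 = 0      # sum of gray levels in the dark class
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so no split exists at this t
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2  # proportional to between-class variance
        if between > best_var:
            best_t, best_var = t, between
    return best_t


# Toy example (assumed values): dark text pixels on a bright background.
pixels = [20] * 50 + [30] * 50 + [200] * 20 + [210] * 20
t = otsu_threshold(pixels)
# t separates the dark group (20, 30) from the bright group (200, 210).
text_mask = [p <= t for p in pixels]
```

A single threshold of this kind works only when the whole page fits the flat-background model, which is the limitation that motivates adaptive binarization.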
In the past, various adaptive binarization
methods have been proposed. Trier and Jain (1995)
compared several binarization methods on the
basis of their character recognition accuracies and
concluded that Niblack’s method (Niblack, 1986)
yields the best result when a noise reduction
technique is applied. Sauvola et al. (1997)
modified Niblack’s method using region analysis, in
which textual and nontextual regions were separated
from each other. Sauvola’s method has been the
most popular binarization method for document
images. These methods assume that the pixels in each
local neighborhood can be classified into two classes.
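The local thresholds used by these methods can be sketched as follows. The sketch uses the commonly quoted forms T = m + k·s for Niblack’s method and T = m·(1 + k·(s/R − 1)) for Sauvola’s method, where m and s are the mean and standard deviation of the gray levels in a window around the pixel. The parameter values (k = −0.2, k = 0.5, R = 128), the window size, and all function names are assumptions chosen for illustration; the region analysis of Sauvola’s method is omitted, and this is not the implementation used in our system.

```python
import math


def local_stats(img, y, x, half):
    """Mean and standard deviation in a (2*half+1)^2 window clipped to the image."""
    h, w = len(img), len(img[0])
    vals = [img[j][i]
            for j in range(max(0, y - half), min(h, y + half + 1))
            for i in range(max(0, x - half), min(w, x + half + 1))]
    mean = sum(vals) / len(vals)
    var = sum((v - mean) ** 2 for v in vals) / len(vals)
    return mean, math.sqrt(var)


def niblack_threshold(mean, std, k=-0.2):
    # Commonly quoted Niblack form; k < 0 is the usual choice for dark text.
    return mean + k * std


def sauvola_threshold(mean, std, k=0.5, r=128.0):
    # Commonly quoted Sauvola form; r is the assumed dynamic range of std.
    return mean * (1.0 + k * (std / r - 1.0))


def binarize(img, half=1, method="niblack"):
    """Return a 0/1 image where 1 marks pixels darker than the local threshold."""
    rule = niblack_threshold if method == "niblack" else sauvola_threshold
    out = []
    for y in range(len(img)):
        row = []
        for x in range(len(img[0])):
            mean, std = local_stats(img, y, x, half)
            row.append(1 if img[y][x] < rule(mean, std) else 0)
        out.append(row)
    return out


# Toy image (assumed values): a dark vertical stroke on a bright background.
img = [[200, 200, 50, 200, 200] for _ in range(5)]
niblack_result = binarize(img, half=1, method="niblack")
sauvola_result = binarize(img, half=1, method="sauvola")
```

On this two-class toy image both rules mark only the stroke column. Both thresholds depend solely on the local mean and standard deviation, which is where the two-class assumption enters.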
In recent years, many new binarization methods
have been proposed to overcome some problems
of the conventional methods.
For example, DIBCO 2009, the Document Image
Binarization Contest held at ICDAR 2009, offers a good
collection of recent document binarization
methods (Gatos et al., 2009). Although excellent
methods were proposed in DIBCO 2009, most of
them focus on binarizing heavily degraded images
such as historical documents, reflecting the
image quality used in the contest (Fig. 1), and
they therefore require a high computational cost.
Our document recognition system recognizes
binarized text images obtained by an adaptive
binarization method based on Niblack’s method
(Kamada and Fujimoto, 1999). As described later,
adaptive binarization methods, including those
developed by Niblack and Sauvola, have a problem.
Because these methods are based on the assumption
that the pixels in a local neighborhood can be
classified into two classes, pixels in local areas that
contain three or more pixel classes are often dropped. This
results in broken character shapes (Fig. 2)
and causes errors in character recognition. We solve
this problem by correcting the binarization threshold
with respect to the neighboring threshold surface.
This technique was previously applied to ruled-line
extraction (Tanaka, 2009) and also proves to be
effective for text extraction.
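The three-class problem can be illustrated with a small numerical example. The sketch below applies a Niblack-style threshold, in the commonly quoted form T = m + k·s with an assumed k = −0.2, to two hypothetical local windows: one containing only background and a faint stroke, and one that also contains much darker pixels from a neighboring object. The gray levels and window contents are invented for illustration, and the sketch shows only the failure mode, not the threshold correction itself.

```python
import math


def niblack_t(window, k=-0.2):
    """Niblack-style threshold (mean + k * std) for a flat list of gray levels."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / len(window))
    return mean + k * std


faint = 120  # gray level of a faint character stroke (assumed value)

# Two classes: bright background (220) and the faint stroke (120).
two_class = [220] * 6 + [faint] * 3
# Three classes: the same stroke next to a strong dark object (0).
three_class = [220] * 3 + [faint] * 2 + [0] * 4

# A pixel is extracted as text when it is darker than the local threshold.
kept_two = faint < niblack_t(two_class)
kept_three = faint < niblack_t(three_class)
```

In the two-class window the stroke lies below the threshold and is extracted. In the three-class window the dark neighbor pulls the local mean down, the threshold falls below the stroke's gray level, and the stroke is classified as background.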
In Section 2, we describe the problems of
conventional methods and our solutions. In Section
3, we present experimental results. Finally, in Section
4, we conclude the paper.
Tanaka H., Fujii Y. and Hotta Y..
THRESHOLD CORRECTION OF DOCUMENT IMAGE BINARIZATION FOR TEXT EXTRACTION.
DOI: 10.5220/0003396503870391
In Proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP-2011), pages 387-391
ISBN: 978-989-8425-47-8
Copyright © 2011 SCITEPRESS (Science and Technology Publications, Lda.)