(Figure 1). In some situations the image of a
personal document can be corrupted by operator
carelessness or by movement during the scanning
process (Figure 2). This can happen even though
dedicated document scanners have very short
scanning times, under one second.
Figure 2: An example of a corrupt scan
In document inspection, a scanned image supports
many tasks: personal data can be read automatically
using OCR, the document's security features can be
inspected, the personal photograph can be extracted,
and so on. The algorithms for these tasks may work
better and faster when some general prerequisites are
met, namely that the document is horizontally
oriented and cropped from the unnecessary
background. In some cases these prerequisites are
even required.
The algorithms described in this work focus on
these general prerequisites of document scanning.
The result of each algorithm should be a cropped
image of the personal document in a horizontal
position, suitable for further processing. Detecting
corrupt scans is also an important problem: it lets us
alert the operator that another scan should be made
and avoid further processing that would yield
useless results.
3 ALGORITHMS DESCRIBED
The sample image (Figure 1) shows that a good scan
consists of a very bright rectangular shape (the
personal document) and a much darker background.
The idea of our algorithms is based on this
observation. If we find the strongest edges that form
a rectangle in the image, it is very likely that these
are the edges of the document.
Corrupt document scans can be simply described
as images in which the strongest edges do not form a
rectangle, and we can estimate the corruption of the
scan using this information.
Two independent algorithms have been
developed for the problems mentioned. One uses
the Hough transformation (Hough, 1962; Duda and
Hart, 1972), a well-known method for edge
parameterization. The other relies on the computed
brightness gradient of the image.
3.1 Algorithm with the use of Hough
transformation
The scanned image is first downsampled to a size
that allows fast processing yet is large enough for
accurate pinpointing of the personal document
borders. The simplest and fastest method, nearest
neighbor, is sufficient for this operation.
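The downsampling step can be sketched as follows. This is a minimal illustration, assuming the grayscale image is held as a list of rows of numbers; the function name `downsample_nearest` and the floor-based index mapping are choices made for this sketch, not details taken from the paper.

```python
def downsample_nearest(image, new_height, new_width):
    """Nearest-neighbor resize of a grayscale image stored as a list of rows.

    Assumption for this sketch: each target pixel copies the source pixel
    at the floor of its scaled coordinates (no interpolation).
    """
    old_height = len(image)
    old_width = len(image[0])
    result = []
    for y in range(new_height):
        # Map the target row back to a source row.
        src_y = min(old_height - 1, (y * old_height) // new_height)
        row = image[src_y]
        result.append([
            row[min(old_width - 1, (x * old_width) // new_width)]
            for x in range(new_width)
        ])
    return result
```

Because no pixel values are averaged, the cost is one lookup per output pixel, which is why this method is the fastest choice.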
The next step is edge enhancement with the
Sobel filter, which estimates the brightness
gradient at each image pixel. We only need the
magnitude of the gradient and discard the direction.
Other edge-finding algorithms could be used, but the
Sobel filter proved very satisfactory, and it
incorporates enough smoothing to remove the need
for prior smoothing of the image.
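A gradient-magnitude computation with the standard 3x3 Sobel kernels might look like the sketch below. The list-of-rows image layout, the function name `sobel_magnitude`, and leaving border pixels at zero are assumptions made for brevity.

```python
import math


def sobel_magnitude(image):
    """Return the Sobel gradient magnitude of a grayscale image (list of rows).

    Border pixels are left at 0.0 (an assumption of this sketch).
    """
    height, width = len(image), len(image[0])
    magnitude = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal derivative: right column minus left column,
            # each weighted 1-2-1 (the weighting provides the smoothing).
            gx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            # Vertical derivative: bottom row minus top row, weighted 1-2-1.
            gy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            # Keep only the magnitude; the direction is discarded.
            magnitude[y][x] = math.hypot(gx, gy)
    return magnitude
```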
Figure 3: An example of a thresholded image of gradient
magnitude (compare to Figure 1)
The gradient-magnitude image is thresholded, and
the resulting binary image of edges (Figure 3) is
transformed with the Hough transformation. In our
case the borders of the document appear as local
maxima in the transform image (Figure 4).
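The thresholding and Hough steps can be illustrated as follows. This is a sketch under assumptions: it uses the common normal parameterization d = x cos(phi) + y sin(phi) with phi in [0, pi), and the names `threshold` and `hough_lines`, the angular resolution `n_angles`, and the threshold level are hypothetical, not values from the paper.

```python
import math


def threshold(magnitude, level):
    """Binary edge image: 1 where the gradient magnitude exceeds level."""
    return [[1 if v > level else 0 for v in row] for row in magnitude]


def hough_lines(edges, n_angles=180):
    """Accumulate votes for lines d = x*cos(phi) + y*sin(phi).

    Returns (accumulator, max_d), where accumulator[i][d + max_d] counts
    the edge pixels lying on the line with angle index i and distance d.
    """
    height, width = len(edges), len(edges[0])
    max_d = int(math.ceil(math.hypot(height, width)))
    # Distances can be negative, so distance indices are offset by max_d.
    accumulator = [[0] * (2 * max_d + 1) for _ in range(n_angles)]
    trig = [(math.cos(math.pi * i / n_angles), math.sin(math.pi * i / n_angles))
            for i in range(n_angles)]
    for y in range(height):
        for x in range(width):
            if not edges[y][x]:
                continue
            # Each edge pixel votes once per angle.
            for i, (c, s) in enumerate(trig):
                d = int(round(x * c + y * s))
                accumulator[i][d + max_d] += 1
    return accumulator, max_d
```

A straight document border contributes many votes to a single (phi, d) cell, so it shows up as a local maximum in the accumulator.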
$$p(\varphi) = \sum_{d;\; H(\varphi, d) > m} H(\varphi, d) \qquad (1)$$
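One possible reading of Equation (1) is a projection of the Hough accumulator onto its angle axis in which only cells above a threshold m contribute. The sketch below follows that reading; the accumulator layout (one row per angle, one column per distance), the interpretation of m as a threshold, and the function name `angle_projection` are assumptions.

```python
def angle_projection(accumulator, m):
    """For each angle, sum the accumulator cells whose value exceeds m.

    accumulator[i] is assumed to hold the votes for angle index i
    over all distances d.
    """
    return [sum(v for v in row if v > m) for row in accumulator]
```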
If we take a closer look at the vertical projection p
of the Hough transform H (onto the ϕ axis) we can see
TWO SIMPLE ALGORITHMS FOR DOCUMENT IMAGE PREPROCESSING - Making a document scanning
application more user-friendly