with each run. Bitmap rotation by arbitrary angles
can then be implemented by the usual decomposition
of rotations into a sequence of horizontal and verti-
cal skew operations, using successive application of
transposition, line skewing, and transposition in or-
der to achieve skews perpendicular to the lines in the
run-length representation. We note that this method
differs substantially from previously published rota-
tion algorithms for run length encoded images (Zhu
et al., 1995; Au and Zhu, 2002).
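As a rough illustration of the line-skewing step only, the following Python sketch shifts each run-length encoded line horizontally by an offset proportional to its row index. The representation (a list of half-open `(start, end)` intervals per line), the function name `skew_lines`, and the rounding and clipping conventions are assumptions made for this example; they are not taken from the implementation described here.

```python
def skew_lines(lines, width, shear, center_row=0.0):
    """Skew a run-length image along its lines (hypothetical helper).

    lines      -- one list of half-open (start, end) black runs per row
    width      -- image width in pixels; shifted runs are clipped to [0, width)
    shear      -- horizontal shift in pixels per row
    center_row -- row that stays fixed under the skew
    """
    out = []
    for y, runs in enumerate(lines):
        # Every run in a row moves by the same integer offset.
        dx = int(round(shear * (y - center_row)))
        shifted = []
        for start, end in runs:
            s = max(start + dx, 0)
            e = min(end + dx, width)
            if s < e:  # drop runs pushed completely out of the image
                shifted.append((s, e))
        out.append(shifted)
    return out


example = [[(2, 4)], [(2, 4)], [(0, 2), (5, 6)]]
skewed = skew_lines(example, width=6, shear=1.0)
```

The cost of this step is proportional to the number of runs rather than the number of pixels. A skew perpendicular to the lines would, as described above, be obtained by transposing, applying the same routine, and transposing back.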
Other Operations. Several other operations can also
be carried out quickly on run-length representations:
• Run-length statistics are frequently used in docu-
ment analysis to estimate character stroke widths,
word spacings, and line spacings; they can be
computed in linear time for both black and white
runs by iterating through the runs of an image. In
the vertical direction, they can be computed by
first transposing the image.
• The line adjacency graph can be computed by
treating the runs as nodes in the graph and cre-
ating edges between any runs in adjacent lines if
the intervals represented by the runs overlap.
• Standard skeletonization methods for the line ad-
jacency graph can be applied after computation of
the LAG as described above.
• Run-length based extraction of lines and circles
using the RAST algorithm (Keysers and Breuel,
2006) can be applied directly.
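The first two items in the list above can be sketched in a few lines of Python. This is a minimal illustration, assuming each image line is stored as a sorted list of non-touching, half-open `(start, end)` black runs; the function names, the node labels `(row, run index)`, and the choice of strict overlap (runs that only touch diagonally are not connected) are assumptions of this example, not details from the paper.

```python
from collections import Counter


def run_length_histograms(lines, width):
    """Histogram black and white run lengths in one pass over the runs."""
    black, white = Counter(), Counter()
    for runs in lines:
        pos = 0  # end of the previous black run in this line
        for start, end in runs:
            if start > pos:
                white[start - pos] += 1  # white gap before this run
            black[end - start] += 1
            pos = end
        if pos < width:
            white[width - pos] += 1  # trailing white run
    return black, white


def line_adjacency_edges(lines):
    """Return edges ((y, i), (y + 1, j)) for overlapping runs in adjacent lines."""
    edges = []
    for y in range(len(lines) - 1):
        upper, lower = lines[y], lines[y + 1]
        i = j = 0
        while i < len(upper) and j < len(lower):
            (s1, e1), (s2, e2) = upper[i], lower[j]
            if s1 < e2 and s2 < e1:  # the two intervals share a pixel column
                edges.append(((y, i), (y + 1, j)))
            # Advance whichever run ends first; it cannot overlap anything later.
            if e1 <= e2:
                i += 1
            else:
                j += 1
    return edges


lines = [[(1, 3), (5, 8)], [(2, 6)], []]
black, white = run_length_histograms(lines, width=8)
edges = line_adjacency_edges(lines)
```

Both routines visit each run a constant number of times, so their cost is linear in the number of runs. Vertical statistics would follow by applying `run_length_histograms` to the transposed image.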
5 EXPERIMENTS
We have implemented, among others, conversions be-
tween run-length, packed bit, and unpacked bit rep-
resentations of binary images, transposition, all the
morphological operations with rectangular structur-
ing elements described above, bitmap rotation by ar-
bitrary angles, computation of run-length statistics,
connected component labeling, and bounding box
extraction. To evaluate the general behavior of
these algorithms and determine whether they are
feasible in practice, we compare the performance
of the run-length based algorithms with the
bitmap-based binary morphology implementation in
Leptonica, an open source morphological image pro-
cessing library in use in production code and con-
taining well-documented algorithms and implemen-
tations (Bloomberg, 2002; Bloomberg, 2007).
Figure 3: A 7000 × 7000 image of a cadastral map used for
performance measurements.
Leptonica contains multiple implementations of
binary morphology; the fastest general-purpose
implementation is pixErodeCompBrick (and analogous
names for other operations), a method that uses
separability and binary decomposition; it was used unless
otherwise stated. Leptonica also contains partially
evaluated and optimized binary morphology operators
for a number of specific small mask sizes, available
under names like pixErodeBrickDwa; these
were used in some experiments. Using a large number
of synthetic images and document images, we have
verified that the implementations give bit-identical
results. Both libraries were compiled with their default
(optimized) settings.
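A bit-identical check of this kind can be sketched as a randomized comparison between a run-based and a pixel-based operator. The Python example below is a simplified stand-in written for illustration: it handles only a single image line and a horizontal 1 × w mask, treats pixels outside the image as white, and places the mask origin at index w // 2. These conventions and all function names are assumptions of the example; they do not describe Leptonica or the implementation evaluated here.

```python
import random


def to_runs(row):
    """Convert a list of 0/1 pixels into half-open (start, end) black runs."""
    runs, start = [], None
    for x, v in enumerate(row):
        if v and start is None:
            start = x
        elif not v and start is not None:
            runs.append((start, x))
            start = None
    if start is not None:
        runs.append((start, len(row)))
    return runs


def from_runs(runs, width):
    """Convert black runs back into a list of 0/1 pixels."""
    row = [0] * width
    for s, e in runs:
        for x in range(s, e):
            row[x] = 1
    return row


def erode_runs(runs, w):
    """Erode with a 1 x w mask by shrinking each run at both ends."""
    left, right = w // 2, w - 1 - w // 2
    return [(s + left, e - right) for s, e in runs if s + left < e - right]


def erode_pixels(row, w):
    """Reference erosion: a pixel survives if the whole mask fits on black."""
    left, right = w // 2, w - 1 - w // 2
    n = len(row)
    return [
        int(all(0 <= x + d < n and row[x + d] for d in range(-left, right + 1)))
        for x in range(n)
    ]


def check_agreement(trials=200, width=64, seed=0):
    """Compare both erosions on random lines; True if they always agree."""
    rng = random.Random(seed)
    for _ in range(trials):
        row = [int(rng.random() < 0.7) for _ in range(width)]
        w = rng.randint(1, 9)
        if from_runs(erode_runs(to_runs(row), w), width) != erode_pixels(row, w):
            return False
    return True
```

A full comparison would apply the same idea to two-dimensional images and to every operator under test, and would use real document images in addition to random ones.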
Experiment 1. To gain some general insights into
the behavior of the run length methods for real-world
document images, the running times of morphologi-
cal operations on 245 images from the UW3 (Guyon
et al., 1997) database, 300 dpi binary images of scans
of degraded journal publication pages, were mea-
sured. The results are shown in Figure 2. We see that,
except for masks of size five or below, the run length
implementation outperforms the bit blit implementa-
tion.
By choosing at runtime between the bit blit im-
plementation and the run length implementation, we
can obtain a method that shares the characteristics of
both kinds of implementations. As already noted above, the
cross-over point can be determined automatically ei-
ther based on mask size and dpi, or based on output
complexity. This is shown as the bold curve in the
figures; the curve does not coincide with the bit blit based
running times because the run length figures include
the conversion times from run length representations
to packed bit representations and back to run length
representations; in many applications, these conver-