Table 3: Contingency table showing the distribution of the classification of zones of a particular type in percent. (The total
number of errors equals 201 within 13811 tests.) The labels M, L, T, A, D, H, R, S correspond to the types math, logo, text,
table, drawing, halftone, ruling, and speckles, respectively.
       M     L     T     A     D     H     R     S  error [%]  # samples
M   90.8   0.0   8.6   0.0   0.0   0.6   0.0   0.0        9.2        476
L    9.1  27.3  36.4   0.0   9.1   9.1   0.0   9.1       72.7         11
T    0.1   0.0  99.8   0.0   0.0   0.0   0.0   0.0        0.2      10450
A    0.8   0.0  20.7  68.6   9.9   0.8   0.0   0.0       31.4        121
D    1.5   0.3   3.0   5.5  86.0   3.5   0.0   0.3       14.0        401
H    0.0   0.9   0.0   0.0   9.7  86.7   0.9   1.8       13.3        113
R    0.4   0.0   1.3   0.0   0.4   0.0  96.1   2.2        3.9        232
S    0.1   0.0   0.5   0.0   0.1   0.1   0.0  99.4        0.6       2007
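
The error column and the totals quoted in the caption of Table 3 can be cross-checked with a few lines of arithmetic. The sketch below assumes that rows are the true zone types and columns the assigned types, which is consistent with each error entry being 100 minus the diagonal entry. All names in the code are illustrative and not taken from the original implementation. Because the percentages are rounded to one decimal place, the reconstructed error count only approximates the 201 errors stated in the caption.

```python
# Row-normalized percentages from Table 3.
# Assumption: rows are true zone types, columns are assigned zone types.
LABELS = ["M", "L", "T", "A", "D", "H", "R", "S"]
PERCENT = [
    [90.8, 0.0, 8.6, 0.0, 0.0, 0.6, 0.0, 0.0],
    [9.1, 27.3, 36.4, 0.0, 9.1, 9.1, 0.0, 9.1],
    [0.1, 0.0, 99.8, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.8, 0.0, 20.7, 68.6, 9.9, 0.8, 0.0, 0.0],
    [1.5, 0.3, 3.0, 5.5, 86.0, 3.5, 0.0, 0.3],
    [0.0, 0.9, 0.0, 0.0, 9.7, 86.7, 0.9, 1.8],
    [0.4, 0.0, 1.3, 0.0, 0.4, 0.0, 96.1, 2.2],
    [0.1, 0.0, 0.5, 0.0, 0.1, 0.1, 0.0, 99.4],
]
SAMPLES = [476, 11, 10450, 121, 401, 113, 232, 2007]


def per_class_error_percent():
    """Error per true class: everything off the diagonal of its row."""
    return [100.0 - PERCENT[i][i] for i in range(len(LABELS))]


def estimated_error_counts():
    """Approximate misclassified zones per class (inexact due to rounding)."""
    errors = per_class_error_percent()
    return [errors[i] / 100.0 * SAMPLES[i] for i in range(len(LABELS))]


def overall_error_percent(num_errors, num_tests):
    """Overall error rate in percent from absolute counts."""
    return 100.0 * num_errors / num_tests


if __name__ == "__main__":
    for label, err, count in zip(LABELS, per_class_error_percent(),
                                 estimated_error_counts()):
        print(f"{label}: {err:5.1f}% error, about {count:.0f} zones")
    print(f"total tests: {sum(SAMPLES)}")
    print(f"overall: {overall_error_percent(201, sum(SAMPLES)):.2f}%")
```

With the counts given in the caption, 201 errors in 13811 tests correspond to an overall error of roughly 1.46% for the tests summarized in this table.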
the need for features based on glyphs or the Fourier
transform. By employing a fast logistic (log-linear)
classifier trained using the maximum entropy criterion
on these features, we arrived at a fast and accurate,
yet easy-to-implement overall classifier with
a slightly higher error rate of 2.1%. In our experiments
we did not use context information as done in
(Wang et al., 2006) and thus could keep the decision
rule very simple. However, context models are likely
to help in the overall classification, and integrating
our approach into Wang et al.’s context model is
possible. The errors made by the system suggest that
reducing the error rate substantially further may be
difficult without considerably more effort, for
example by using a dedicated sub-classifier to distinguish
between text and table zones.
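
To illustrate what a log-linear classifier with a simple, context-free decision rule looks like, the following is a minimal sketch, not the implementation used in this work. It assumes plain stochastic gradient ascent on the conditional log-likelihood (the maximum entropy criterion for this model family); the training procedure, the two-dimensional toy features, and all names are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Turn raw class scores into posterior probabilities."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def posteriors(weights, x):
    """p(class | x) under a log-linear model; the last weight is a bias."""
    scores = [sum(w * xi for w, xi in zip(wk[:-1], x)) + wk[-1]
              for wk in weights]
    return softmax(scores)


def train(samples, labels, num_classes, epochs=500, rate=0.5):
    """Stochastic gradient ascent on the conditional log-likelihood."""
    dim = len(samples[0])
    weights = [[0.0] * (dim + 1) for _ in range(num_classes)]
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = posteriors(weights, x)
            for k in range(num_classes):
                # Gradient: indicator of the true class minus the posterior.
                g = (1.0 if k == y else 0.0) - p[k]
                for d in range(dim):
                    weights[k][d] += rate * g * x[d]
                weights[k][dim] += rate * g
    return weights


def classify(weights, x):
    """Decision rule: pick the class with the highest posterior."""
    p = posteriors(weights, x)
    return max(range(len(p)), key=p.__getitem__)


# Hypothetical toy data: three well-separated classes in two dimensions.
TOY_SAMPLES = [(0.1, 0.1), (0.0, 0.2), (0.9, 0.1),
               (1.0, 0.0), (0.1, 0.9), (0.2, 1.0)]
TOY_LABELS = [0, 0, 1, 1, 2, 2]

if __name__ == "__main__":
    model = train(TOY_SAMPLES, TOY_LABELS, num_classes=3)
    for x, y in zip(TOY_SAMPLES, TOY_LABELS):
        print(x, "true:", y, "predicted:", classify(model, x))
```

The decision rule uses only the feature vector of the zone being classified. A context model such as the one of Wang et al. would instead combine these posteriors with information from neighboring zones.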

We wish to thank Oleg Nagaitsev for help with the
implementation and Thomas Deselaers for making available
the open-source image retrieval system FIRE,
which provided us with the implementation of some
of the features used. This work was partially funded
by the BMBF (German Federal Ministry of Education
and Research), project IPeT (01 IW D03).
