3 IMPLEMENTATION
Our implementation reads training images and
generates feature vectors from them. Our features are
the Histogram of Oriented Gradients (HOG) (Dalal
and Triggs, 2005), the MPEG-7 Edge Histogram
Descriptor (EHD) and the Scalable Color Descriptor
(SCD) (Sikora, 2001).
HOG divides an image into parts and creates a
histogram for each part. Each histogram describes the
gradient orientations within its part, which are
calculated from the horizontal and vertical gradients.
For EHD and SCD, the MPEG-7 reference
implementation was used.
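The per-part histogram can be illustrated with a minimal sketch. It is not the implementation used in this work: the function name, the nine unsigned orientation bins, the magnitude-weighted voting and the restriction to interior pixels are assumptions made for illustration, and block normalisation is omitted.

```python
import math


def cell_orientation_histogram(cell, num_bins=9):
    """Histogram of unsigned gradient orientations for one grayscale part.

    `cell` is a list of rows of intensities. Only interior pixels are
    used, so that central differences are defined (an assumption).
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * num_bins
    bin_width = 180.0 / num_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical gradient
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat region, no orientation to vote for
            # Unsigned orientation in [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            index = min(int(angle / bin_width), num_bins - 1)
            hist[index] += magnitude  # vote weighted by gradient strength
    return hist
```

For a part whose intensity only increases from left to right, all votes fall into the first bin (orientation 0 degrees); for a part whose intensity only increases from top to bottom, they fall into the bin that contains 90 degrees.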
The aim of this work is not to come up with a new,
high-performing image detection algorithm. Rather,
the effect of using a DPM for measuring distance
in existing algorithms is examined. We selected the
HOG algorithm because it is representative of a line
of research called gradient-based algorithms.
Empirical research suggests combining HOG with other
feature extraction methods to obtain a stronger
algorithm (Dollar et al., 2012, p. 10).
SCD and EHD from MPEG-7 provide this
additional information to our implementation.
Furthermore, turning these descriptor values into
predicate-based data is straightforward. SCD and EHD do
not return predicates (i.e. zero/one values). To be
able to work with predicates, the values returned by
SCD and EHD are put into evenly sized bins by
Algorithm 1. Note that the algorithm does not create a
histogram: for every input value it emits binNum
outputs, each of which is either zero or one, and
marks the bin that contains the value with a one.
Input: binSize size of one bin, minT smallest
       possible value, binNum number of bins
       to create, values array of values to be
       binned
Output: binnedValues array of binned values
        (values.size() · binNum entries, all
        initialised to 0)
for i = 0; i < values.size(); ++i do
    val = values.at(i);
    for b = 0; b < binNum; ++b do
        if val ≥ minT + b · binSize and
           val < minT + (b + 1) · binSize then
            binnedValues[i · binNum + b] = 1;
        end
    end
end
Algorithm 1: Transformation of Quantitative Values
into Predicates.
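A direct transcription of Algorithm 1 may make the output layout easier to follow. The sketch below assumes that the output array starts out filled with zeros, which the pseudocode does not state explicitly; the function and parameter names are illustrative.

```python
def bin_to_predicates(values, min_t, bin_size, bin_num):
    """Turn quantitative values into zero/one predicates (Algorithm 1).

    Every input value occupies bin_num consecutive output slots. At most
    one of them is set to 1: the slot of the bin that contains the value.
    """
    # Assumption: the output is zero-initialised.
    binned = [0] * (len(values) * bin_num)
    for i, val in enumerate(values):
        for b in range(bin_num):
            lower = min_t + b * bin_size
            upper = min_t + (b + 1) * bin_size
            if lower <= val < upper:
                binned[i * bin_num + b] = 1
    return binned
```

With min_t = 0, bin_size = 1 and bin_num = 3, the values 0.5 and 2.0 are mapped to 1, 0, 0 and 0, 0, 1, respectively. A value outside the range covered by the bins, such as 3.0, yields only zeros.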
HOG, in the version we use, is not scale-invariant,
while EHD and SCD are scale-invariant. Therefore,
we have to resize our images to the correct size.
After this step, feature vectors are constructed in such
a way that the first N elements are treated as
predicates and the remaining ones as distance
measurements. We train a modified Support Vector Machine
(SVM; Joachims, 1998) with the generated feature
vectors using a DPM kernel. This results in an SVM
model file containing the support vectors that create
an optimal separation of the training data.
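The feature vector layout described above can be sketched as follows. The helper names are hypothetical, and the DPM kernel itself is not shown because it depends on Equation 1; the sketch only illustrates how a kernel could recover the predicate part and the quantitative part from a single vector, given N.

```python
def build_feature_vector(predicates, quantitative):
    """Concatenate predicates (first) and quantitative values (last).

    Returns the vector together with N, the number of leading elements
    that have to be treated as predicates.
    """
    if any(p not in (0, 1) for p in predicates):
        raise ValueError("predicates must be zero/one values")
    vector = list(predicates) + [float(v) for v in quantitative]
    return vector, len(predicates)


def split_feature_vector(vector, n_predicates):
    """Recover the predicate part and the quantitative part of a vector."""
    return vector[:n_predicates], vector[n_predicates:]
```

Keeping N next to the vector, rather than encoding it in the vector itself, is a design choice of this sketch and not a statement about the model file format.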
The pedestrian detection part extracts the feature
vectors from the training set and uses the trained
SVM model to classify them. We use one of the
most straightforward measures of classifier quality:
the correct classification rate, i.e. the fraction of
images that are classified correctly. More advanced
evaluation measures (precision, recall, etc.) exist,
but their analysis was out of scope for this paper.
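The correct classification rate is the number of correctly classified samples divided by the total number of samples. A minimal sketch, in which the function name and the 1/0 label encoding (pedestrian / no pedestrian) are assumptions:

```python
def correct_classification_rate(predicted, actual):
    """Fraction of samples whose predicted label equals the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)
```

For the toy labels predicted = [1, 1, 0, 0] and actual = [1, 0, 0, 0], which are made up for illustration and are not results of this work, three of four samples agree and the rate is 0.75.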
DPMs stipulate the use of quantitative and
predicate-based measures to represent taxonomic and
thematic thinking. We do not mandate which type
of measure to use for which type of thinking. We
can combine a quantitative measure for taxonomic
thinking with a predicate-based measure for thematic
thinking or we can use only quantitative or only
predicate-based measures.
4 TEST ENVIRONMENT
We selected the INRIA dataset¹ with upright images
of persons in everyday situations. The dataset is 970
MB in size and contains thousands of images. Example
images are shown in Figures 3a and 3b.
Training was performed with 140 positive and 160
negative samples, testing with 50 positive and 50 neg-
ative images. During SVM training, the number of
allowed iterations without progress was restricted to
3000. We performed a manual classification into
thematic and taxonomic measures. If there is a contrast
(i.e. $x - y$, $\frac{x}{y}$, $\frac{a}{\cdot}$,
$\frac{\cdot}{-a}$, $\frac{\cdot}{b}$ or
$\frac{\cdot}{c}$), then a measure is thematic and
belongs on the right-hand side of Equation 1.
Otherwise, it is taxonomic and belongs on the left-hand
side.
The importance of taxonomic thinking was set to
$\alpha = \frac{1}{2}$ during all experiments. This
means we simulate a person that values taxonomic
thinking as much as thematic thinking. We ran
pedestrian detection
with the described dataset for all combinations of
quantitative measure, predicate-based measure and
generalisation function. In order to restrict the
search space, we only used predicate-based and
quantitative measures that were part of a purely
predicate-based or purely quantitative DPM that
performed as well as the linear kernel. To be able to
compare our DPMs to the cur-
¹ http://pascal.inrialpes.fr/data/human (last accessed 2015-02-24)
ICPRAM 2016 - International Conference on Pattern Recognition Applications and Methods