camera location, including people with luggage, projected shadows and lighting changes. The feature sets are evaluated focusing on the recognition rate and on the improvement introduced by the shadow removal algorithms.
The data consist of blobs extracted from videos of different lengths that we recorded ourselves, each blob accompanied by a bitmap crop of its bounding box from the original video, in RGB space and in grey levels. We extracted 4659 blobs, which were filtered by size (minimum size 250 pixels), removing up to 49% of the total. We then manually labelled the remaining blobs, obtaining 1371 images of class single person, 367 of class group of people and 631 images of class luggage.
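The size filter described above can be sketched as follows. This is a minimal illustration, assuming each blob is available as a boolean foreground mask; the function and variable names are illustrative, not from the paper.

```python
import numpy as np

MIN_AREA = 250  # minimum blob size in pixels, as stated in the text

def filter_blobs(blob_masks):
    """Keep only blobs whose foreground area is at least MIN_AREA pixels."""
    return [m for m in blob_masks if int(np.count_nonzero(m)) >= MIN_AREA]

# Toy example: a 20x20 solid blob (400 px) passes, a 10x10 one (100 px) does not.
big = np.ones((20, 20), dtype=bool)
small = np.ones((10, 10), dtype=bool)
print(len(filter_blobs([big, small])))  # -> 1
```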
The criteria we followed to label the blobs were:
• Every blob representing a person with or without
luggage is labelled as single person.
• Blobs representing an object are classified according to the object as single person, group of people, or luggage.
• At least 2/3 of the figure must be inside the
bounding box.
• A blob representing more than two people is considered a group of people, provided that at least 2/3 of two of the people are visible, regardless of how many occluded objects occur or which kinds of objects are present.
Any other blob was not considered for further pro-
cessing.
For each image, we stored the results of applying the shadow removal algorithms introduced in section 3, and we computed both feature sets discussed in section 4 on the original image and on the resulting images, yielding 6 different data sets of 2369 images each.
[Figure 3 plot: classification success rate (65–100%) vs. number of neighbours (1–15); curves: with shadows, shadows removed (grey levels), shadows removed (RGB).]
Figure 3: Classification rate of images with and without shadows using geometric features.
We trained a k-nn classifier on randomly chosen images: the training set was constructed from 80% of the database and the test set comprised the remaining 20%. The experiments were repeated 100 times with fresh random splits to ensure independence from any particular sample selection.

[Figure 4 plot: classification success rate (65–100%) vs. number of neighbours (1–15); curves: with shadows, shadows removed (grey levels), shadows removed (RGB).]
Figure 4: Classification rate of images with and without shadows using the matrix of foreground pixel density. Grid size is 4×4.

The optimal number of regions into which to divide each blob was found by partitioning the blobs into grids from 2×2 up to 10×10 regions and classifying the objects with a k-nn classifier. A grid of size 4×4 was chosen because it is the smallest with a good classification rate.
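The foreground-pixel-density feature described above can be sketched as follows: the blob's bounding-box mask is partitioned into an n×n grid and each cell contributes the fraction of its pixels that are foreground. This is a sketch under the assumption that blobs are boolean masks; the paper does not give the exact implementation.

```python
import numpy as np

def density_features(mask, grid=4):
    """Fraction of foreground pixels in each cell of a grid x grid
    partition of the blob's bounding-box mask (boolean array).
    Returns a flat feature vector of length grid*grid."""
    h, w = mask.shape
    # Cell boundaries; linspace handles dimensions not divisible by `grid`.
    ys = np.linspace(0, h, grid + 1, dtype=int)
    xs = np.linspace(0, w, grid + 1, dtype=int)
    feats = np.empty((grid, grid))
    for i in range(grid):
        for j in range(grid):
            cell = mask[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
            feats[i, j] = cell.mean() if cell.size else 0.0
    return feats.ravel()

# Toy example: an 8x8 mask whose top-left quadrant is foreground.
mask = np.zeros((8, 8), dtype=bool)
mask[:4, :4] = True
f = density_features(mask, grid=2)
print(f)  # -> [1. 0. 0. 0.]
```

With the paper's 4×4 grid this yields a 16-dimensional vector per blob.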
Figures 3 and 4 show the global results of object classification for each feature set and for different values of k. Figure 3 shows the results using geometric features, with shadows left in place and with shadows removed. Figure 4 shows the corresponding results for foreground pixel density, with and without shadows, using a grid size of 4×4. For geometric features, the classification rate is 68% at k = 1; for higher values of k it rises and stays between 87% and 90%. Foreground pixel density behaves quite differently: it shows a good success rate (92%–95%) for values of k between 1 and 5 and then decreases as k grows.
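The evaluation protocol above (repeated random 80/20 splits scored with a k-nn classifier) can be sketched with a plain NumPy implementation. The data here are synthetic stand-ins, not the paper's blobs, and the helper names are illustrative.

```python
import numpy as np

def knn_predict(Xtr, ytr, Xte, k):
    """Plain k-NN: majority vote among the k nearest training samples."""
    d = np.linalg.norm(Xte[:, None, :] - Xtr[None, :, :], axis=2)
    nn = np.argsort(d, axis=1)[:, :k]          # indices of k nearest neighbours
    return np.array([np.bincount(ytr[row]).argmax() for row in nn])

def repeated_split_accuracy(X, y, k, repeats=100, test_frac=0.2, seed=0):
    """Mean accuracy over `repeats` random train/test splits (80/20 by default)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = int(round(test_frac * n))
    accs = []
    for _ in range(repeats):
        perm = rng.permutation(n)
        te, tr = perm[:n_test], perm[n_test:]
        pred = knn_predict(X[tr], y[tr], X[te], k)
        accs.append(float(np.mean(pred == y[te])))
    return float(np.mean(accs))

# Toy data: two well-separated clusters, so accuracy should be near 1.0.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (50, 2)), rng.normal(3, 0.1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
acc = repeated_split_accuracy(X, y, k=3, repeats=10)
print(acc)
```

Sweeping k over 1–15, as in figures 3 and 4, is then a loop over this function.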
Table 1: Confusion rates (%) of the group-of-people class with the person (P) and luggage (L) classes. Top rows correspond to geometric features and bottom rows to foreground pixel density.

k        1      3      5      7      9      11     13
P geo.   36.25  48.39  48.39  50.73  51.69  54.51  54.51
L geo.   0.22   0.09   0.02   0.17   0.12   0.08   0.09
P den.   0.48   0.89   11.06  13.79  15.76  16.33  15.93
L den.   0.02   0.02   0.01   0.04   0.14   0.27   0.56
Figures 5 and 6 show the performance of both feature sets for each class. The person and luggage classes have a good classification success rate in both cases, but the group-of-people class shows low values, worse for geometric features than for foreground pixel density. Analysing the confusion between classes, we found that the person and group-of-people classes are easily confused, while other inter-class confusions are low. Both feature sets are therefore valid for classifying the person and luggage classes with a good degree of accuracy. In table 1, rows indicate the percentage of confusion of samples
VISAPP 2008 - International Conference on Computer Vision Theory and Applications
664