An example is the case of medical environments (El-
Naqa et al., 2004), or text categorization (Dumais
et al., 1998). Various other studies classify images using a single type of image feature, such as color (Saber et al., 1996) or texture (Fernández et al., 2003). Some approaches combine different features, such as color and shape, e.g., the proposals of (Forczmanski and Frejlichowski, 2008) and (Mehtre et al., 1998). However, it is less
common to find studies that focus on combining color,
texture and shape features for classification purposes.
When classification methods are applied to general-purpose image collections, the results are poor, even more so if we expect the performance of the classifier to match the classification performed by non-expert humans. Examples can be found in (Vailaya et al., 1999) and (Li and Wang, 2005).
This paper aims to observe the behavior of differ-
ent kinds of classifiers within a collection of general-
purpose images (photos). We thus describe a comparative study between the groups produced by these mathematical classifiers and a prior classification performed by humans.
To apply mathematical classifiers, each image must be represented by a feature vector, i.e., each image becomes a point in a multidimensional space called the feature space. In narrow environments with a well-defined purpose, feature extraction methods are restricted to those that highlight what is relevant and necessary for the application. Because this paper focuses on natural images in a broad domain, we extract texture, color and shape features, since the perception of these properties requires human knowledge.
2 MATERIALS AND METHODS
Our collection consists of more than 2000 images retrieved from the Internet, all downloaded from different sites. With the guidance of untrained users, these images were grouped according to perceptual criteria; 10 groups were formed, because this is what seemed logical from the standpoint of the people involved. In the initial distribution, the number of images within each class was not homogeneous. However, in order to conduct more rigorous testing, each class was limited to a total of 200 images, since it is important that the distribution of the samples be uniform across all groups. The grouping is based entirely on perceptual criteria related to how content is valued by humans. Thus, we have 2000 images classified into 10 classes, namely trees, people, cars, flowers, buildings, shapes, textures, animals, sunsets, and circles. Some samples of each group are shown in Fig. 1, one group per row.
2.1 Feature Extraction
Each image is represented by a feature vector of 110
features. This feature vector is divided into three
groups: the first 60 are labeled as color features, the
following 41 as texture features, and the last nine as
shape features.
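As an illustrative sketch of this layout (the group sizes are taken from the text; the extraction stages themselves, described in the following subsections, are only stubbed here):

```python
import numpy as np

# Sketch of the 110-feature vector layout; the three extraction
# stages are stubbed with zeros for illustration only.
color_features = np.zeros(60)    # features 1-60: color
texture_features = np.zeros(41)  # features 61-101: texture
shape_features = np.zeros(9)     # features 102-110: shape

# Each image becomes one point in the 110-dimensional feature space.
feature_vector = np.concatenate([color_features, texture_features,
                                 shape_features])
print(feature_vector.shape)  # (110,)
```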
2.1.1 Color Features
The color features used in this work are based on the HLS model (Hue, Luminosity, Saturation), since this model is close to human color perception. We apply color discretization (MacDonald and Luo, 2002) into 12 colors along the hue axis, plus three additional colors, white, grey and black, along the luminosity axis (15 colors in total), and record the ratio of pixels belonging to each one. In addition, local color features are used in order to capture information about the spatial distribution (Cinque et al., 1999): in particular, the barycenter of each of the 15 discrete colors, with its (x, y) coordinates in the image, yielding 30 further features. Finally, the standard deviation of the pixel positions around each barycenter is also computed, giving 15 additional features. In summary, the total number of color features is 60.
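A minimal sketch of these 60 features follows. The saturation and luminosity thresholds used to separate white, grey and black are illustrative assumptions, not the discretization of (MacDonald and Luo, 2002) itself:

```python
import numpy as np
import colorsys

def color_features(rgb):
    """Sketch of the 60 color features: 15 color ratios, 15 (x, y)
    barycenters (30 values) and 15 barycenter standard deviations.
    Thresholds for achromatic pixels are illustrative assumptions."""
    h, w, _ = rgb.shape
    labels = np.empty((h, w), dtype=int)
    for i in range(h):
        for j in range(w):
            r, g, b = rgb[i, j] / 255.0
            hue, lum, sat = colorsys.rgb_to_hls(r, g, b)
            if sat < 0.15:                        # achromatic pixel
                if lum > 0.8:
                    labels[i, j] = 12             # white
                elif lum < 0.2:
                    labels[i, j] = 14             # black
                else:
                    labels[i, j] = 13             # grey
            else:
                labels[i, j] = int(hue * 12) % 12  # 12 hue bins

    feats = []
    n = h * w
    for c in range(15):                            # 15 color ratios
        feats.append(np.count_nonzero(labels == c) / n)
    for c in range(15):                            # 15 barycenters (x, y)
        ys, xs = np.nonzero(labels == c)
        if len(xs):
            feats.extend([xs.mean() / w, ys.mean() / h])
        else:
            feats.extend([0.0, 0.0])
    for c in range(15):                            # 15 std. deviations
        ys, xs = np.nonzero(labels == c)
        if len(xs):
            bx, by = xs.mean(), ys.mean()
            feats.append(float(np.sqrt(
                ((xs - bx) ** 2 + (ys - by) ** 2).mean())))
        else:
            feats.append(0.0)
    return np.array(feats)                         # 60 values
```

For instance, an all-black image would yield a ratio of 1.0 for the black bin and 0.0 for the remaining 14 bins.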
2.1.2 Texture Features
These have been obtained by applying two well-known methods. The first one performs a global processing of the image; it is based on the Gray Level Co-occurrence Matrix proposed by Haralick (Haralick and Shapiro, 1993). This matrix is computed by counting the number of times that each pair of gray levels occurs at a given distance and for all directions. The features obtained from this matrix are: energy, inertia, contrast, inverse difference moment, and number non-uniformity. The second method focuses on detecting only linear texture primitives. It is based on features obtained from the Run Length Matrix proposed by (Galloway, 1975), where a textural primitive is a set of consecutive pixels in the image having the same gray level value. Four matrices, one for each direction, are built by counting all the runs in the image.
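As a rough sketch of how the two matrices are built, the following toy implementation counts co-occurring gray-level pairs for one displacement and horizontal runs for one direction. The displacement, the number of gray levels, and the tiny test image are illustrative assumptions, not the exact settings used in this work:

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=4):
    """Gray Level Co-occurrence Matrix for one displacement (sketch).
    Entry (i, j) counts how often gray level i occurs at distance
    (dx, dy) from gray level j."""
    m = np.zeros((levels, levels))
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            m[img[y, x], img[y + dy, x + dx]] += 1
    return m

def run_length_matrix(img, levels=4):
    """Horizontal (0 degrees) Run Length Matrix (sketch): entry
    (g, r - 1) counts runs of length r at gray level g."""
    h, w = img.shape
    m = np.zeros((levels, w))
    for y in range(h):
        x = 0
        while x < w:
            g, run = img[y, x], 1
            while x + run < w and img[y, x + run] == g:
                run += 1
            m[g, run - 1] += 1
            x += run
    return m

img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [2, 2, 3, 3],
                [2, 2, 3, 3]])
p = glcm(img) / glcm(img).sum()   # normalize counts to probabilities
energy = (p ** 2).sum()           # one of the features named above
rlm = run_length_matrix(img)      # every entry: runs per (level, length)
```

In a full implementation, one such co-occurrence matrix and four run-length matrices (one per direction) would be computed and summarized by the features listed in the text.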
Every item in these matrices indicates the number of runs with the same length and gray level. The four matrices correspond to angles quantized at 45° intervals: one for horizontal runs (0°), one for vertical runs (90°), and two for the diagonals (45° and 135°). The features obtained from
these matrices are long run emphasis (LRE), short run
emphasis (SRE), gray level non-uniformity (GLNU),
BEHAVIOR OF DIFFERENT IMAGE CLASSIFIERS WITHIN A BROAD DOMAIN