age as they retain a significant amount of informa-
tion about texture patterns and edges. The statistics
of spectral components have been used in the past pri-
marily in the context of texture analysis and synthe-
sis. In (Zhu et al., 1998), it is demonstrated that mar-
ginal distributions of spectral components suffice to
characterize homogeneous textures; other studies in-
clude (Portilla and Simoncelli, 2000) and (Wu et al.,
2000). To provide some preliminary evidence of the
discriminating power of spectral histogram (SH) fea-
tures, in Section 3, we report the results of a retrieval
experiment on a database of 1,000 images represent-
ing 10 different semantic categories. The relevance of
an image is determined by the nearest-neighbor cri-
terion applied to a number of SH-features combined
into a single vector. Even without a learning component, we already obtain performance comparable to that of many existing retrieval systems.
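For concreteness, the nearest-neighbor criterion can be sketched in code as follows; the Euclidean distance, the feature dimension, and all identifiers below are illustrative assumptions rather than details fixed by the paper.

```python
import numpy as np

def retrieve(query_feat, db_feats, top_k=10):
    """Rank database images by Euclidean distance between their
    combined SH-feature vectors and that of the query image."""
    dists = np.linalg.norm(db_feats - query_feat, axis=1)
    return np.argsort(dists)[:top_k]  # indices of the top_k nearest images

# Illustrative shapes: 1,000 database images, m-dimensional features.
m = 88                                  # e.g. 8 filters x 11 bins (made up)
db_feats = np.random.rand(1000, m)
query_feat = np.random.rand(m)
nearest = retrieve(query_feat, db_feats)
```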
Learning techniques will be employed with a
twofold purpose: (a) to identify and split off the most
relevant factors of the SH-features for the discrimina-
tion of various categories of images; (b) to lower the
dimension of the representation to reduce complex-
ity and improve computational efficiency. We adopt
a learning strategy that will be referred to as Optimal
Factor Analysis (OFA) – a preliminary form of OFA
was introduced in (Liu and Mio, 2006) as Splitting
Factor Analysis. Given a (small) positive integer k,
the goal of OFA is to find an “optimal” k-dimensional
linear reduction of the original image features for a
particular categorization or indexing problem. Image
categorization and retrieval will be based on the near-
est neighbor classifier applied to the reduced features,
as explained in more detail below. We employ OFA
in the context of SH-features, but it will be presented
in a more general feature learning framework.
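OFA itself is discussed in Section 4; the sketch below only illustrates how a learned reduction is used, assuming it is given by a $k \times m$ matrix (here a random stand-in for the matrix OFA would produce), with classification by the nearest neighbor among reduced training features.

```python
import numpy as np

def classify_1nn(x, A, train_feats, train_labels):
    """Project a feature vector with the k x m matrix A, then return the
    label of the nearest projected training feature."""
    xr = A @ x                       # reduced query feature, in R^k
    tr = train_feats @ A.T           # reduced training features, shape (N, k)
    nearest = np.argmin(np.linalg.norm(tr - xr, axis=1))
    return train_labels[nearest]

# Illustrative dimensions; A stands in for the matrix OFA would learn.
m, k = 88, 10
A = np.random.randn(k, m)
train_feats = np.random.rand(500, m)
train_labels = np.random.randint(0, 10, size=500)
predicted = classify_1nn(np.random.rand(m), A, train_feats, train_labels)
```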
Image retrieval strategies employing a variety of
methods have been investigated in (Wang et al.,
2001), (Carson et al., 1999), (Rubner et al., 1997),
(Smith and Li, 1999), (Yin et al., 2005), (Hoi et al.,
2006). Further references can be found in these pa-
pers. Some of these proposals employ a relevance
feedback mechanism in an attempt to progressively
improve the quality of retrieval. Although not dis-
cussed in this paper, a feedback component can be
incorporated into the proposed strategy by gradually
adding to the training set images for which the quality
of retrieval was low.
A word about the organization of the paper. In
Section 2, we describe the histogram features that will
be used to characterize image content. Preliminary re-
trieval experiments using these features are described
in Section 3. Section 4 contains a discussion of Opti-
mal Factor Analysis, and Sections 5 and 6 are devoted
to applications of the machine learning methodology
to image categorization and retrieval. Section 7 closes
the discussion with a summary and a few remarks on
refinements of the proposed methods.
2 SPECTRAL HISTOGRAM FEATURES
Let $I$ be a gray-scale image and $F$ a convolution filter. The spectral component $I_F$ of $I$ associated with $F$ is the image obtained through the convolution of $I$ and $F$, which is given at pixel location $p$ by
\[
I_F(p) = F \ast I(p) = \sum_q F(q)\, I(p - q), \tag{1}
\]
where the summation is taken over all pixel locations.
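A minimal sketch of Eq. (1), assuming a NumPy/SciPy setting; the zero padding at the image boundary and the particular Laplacian kernel are our choices for illustration.

```python
import numpy as np
from scipy.ndimage import convolve

def spectral_component(image, filt):
    """Compute the spectral component I_F = F * I of Eq. (1)."""
    # scipy.ndimage.convolve evaluates the sum over all pixel locations q;
    # pixels outside the image are treated as zero ('constant' padding).
    return convolve(image.astype(float), filt, mode='constant', cval=0.0)

# Example: 3 x 3 Laplacian filter applied to a random gray-scale image.
laplacian = np.array([[0.,  1., 0.],
                      [1., -4., 1.],
                      [0.,  1., 0.]])
I = np.random.rand(64, 64)
I_F = spectral_component(I, laplacian)
```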
For a color image, we apply the filter to its R, G, B channels. For a given set of bins, which will be assumed fixed throughout the paper, we let $h(I,F)$ denote the corresponding histogram of $I_F$. We refer to $h(I,F)$ as the spectral histogram (SH) feature of the image $I$ associated with the filter $F$. If the number of bins is $b$, the SH-feature $h(I,F)$ can be viewed as a vector in $\mathbb{R}^b$. Figure 1 illustrates the process of obtaining SH-features. Frames (a) and (b) show a color image and its red-channel response to a Laplacian filter, respectively; the last panel shows the 11-bin histogram of the filtered image.
Figure 1: (a) An image; (b) the red-channel response to a
Laplacian filter; (c) the associated 11-bin histogram.
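The histogram step can be sketched in the same setting; the bin edges and the normalization of the counts are illustrative choices, since the paper only assumes a fixed set of bins.

```python
import numpy as np

def sh_feature(filtered, bin_edges):
    """Histogram of a spectral component I_F, viewed as a vector in R^b."""
    counts, _ = np.histogram(filtered, bins=bin_edges)
    # Normalizing to relative frequencies is our choice for illustration.
    return counts / counts.sum()

# 11 bins over a fixed response range (placeholder values).
edges = np.linspace(-2.0, 2.0, 12)       # 12 edges define 11 bins
I_F = np.random.randn(64, 64)            # e.g. a filtered image as above
h = sh_feature(I_F, edges)               # h(I, F), a vector in R^11
```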
If $\mathcal{F} = \{F_1, \ldots, F_r\}$ is a bank of filters, the SH-feature associated with the family $\mathcal{F}$ is the collection $h(I, F_i)$, $1 \leq i \leq r$, combined into the single $m$-dimensional vector
\[
h(I, \mathcal{F}) = \bigl(h(I, F_1), \ldots, h(I, F_r)\bigr), \tag{2}
\]