A NOVEL RELEVANCE FEEDBACK PROCEDURE BASED ON
LOGISTIC REGRESSION AND OWA OPERATOR FOR
CONTENT-BASED IMAGE RETRIEVAL SYSTEM
P. Zuccarello, E. de Ves
Departamento de Informática, Universidad de Valencia, Avda. Vicente Andrés Estellés, 1. 46100-Burjasot, Valencia, Spain
T. Leon, G. Ayala
Department of Statistics and Operations Research, University of Valencia, Valencia, Spain
J. Domingo
Institute of Robotics, University of Valencia, Valencia, Spain
Keywords:
Visual information retrieval, relevance feedback, logistic regression.
Abstract:
This paper presents a new algorithm for content-based retrieval systems in large databases. The objective of
these systems is to find the images which are as similar as possible to a user query from those contained in the
global image database without using textual annotations attached to the images. The procedure proposed here
to address this problem is based on a logistic regression model: the algorithm considers the probability that an
image belongs to the set of those desired by the user. In this work a relevance probability π(I) is a quantity
which reflects the estimate of the relevance of the image I with respect to the user’s preferences. The problem of
the small sample size with respect to the number of features is solved by adjusting several partial linear models
and combining their relevance probabilities by means of an ordered weighted averaging (OWA) operator. Experimental
results are shown to evaluate the method on a large image database in terms of the average number of iterations
needed to find a target image.
1 INTRODUCTION
The increasing amount of information available in today’s
world raises the need to retrieve relevant data
efficiently. Unlike text-based retrieval, where keywords
are successfully used to index documents,
content-based image retrieval (CBIR) poses up-front the
fundamental questions of how to extract useful image
features and how to use them for intuitive retrieval
(Smeulders et al., 2000). The main drawback of tex-
tual image retrieval systems, that is, the annotator de-
pendency, would be overcome in pure CBIR systems.
Image features are a key aspect of any CBIR sys-
tem. A general classification can be made: low level
features (color, texture and shape) and high level fea-
tures (usually obtained by combining low level fea-
tures in a reasonably predefined model). High level
features have a strong dependency on the application
domain, therefore they are not usually suitable for
general purpose systems. This is the reason why one
of the most important and developed research activi-
ties in this field has been the extraction of good low
level image descriptors. Obviously, there is an impor-
tant gap between these features and human perception
(a semantic gap). For this reason, different methods
(mostly iterative procedures) have been proposed to
deal with the semantic gap (Rui et al., 1998). In most
cases the idea underlying these methods is to integrate
the information provided by the user into the decision
process. This way, the user is in charge of guiding
the search by indicating his/her preferences, desires
and requirements to the system. The basic idea is
rather simple: the system displays a set of images
(resulting from a previous search); the user selects
the images that are relevant (desired images) and re-
jects those which are not (images to avoid) according
to his/her particular criterion; the system then learns
from these training examples to achieve an improved
performance in the next run. The process goes on
iteratively until the user is satisfied. Procedures of this
kind are called relevance feedback algorithms (Zhou
and Huang, 2003; de Ves et al., 2006).
Zuccarello P., de Ves E., Leon T., Ayala G. and Domingo J. (2007). A NOVEL RELEVANCE FEEDBACK PROCEDURE BASED ON LOGISTIC REGRESSION AND OWA OPERATOR FOR CONTENT-BASED IMAGE RETRIEVAL SYSTEM. In Proceedings of the Second International Conference on Computer Vision Theory and Applications - IU/MTSV, pages 167-172. Copyright © SciTePress
A query can be seen as an expression of an information
need to be satisfied. Any CBIR system aims
at finding images relevant to a query and thus to the
information need expressed by the query. The rela-
tionship between any image in the database and a par-
ticular query can be expressed by a relevance value.
This relevance value relies on the user-perceived sat-
isfaction of his/her information need. The relevance
value can be interpreted as a mathematical probabil-
ity (a relevance probability). The notion of relevance
probability is not unique because different interpre-
tations have been given by different authors. In this
paper a relevance probability π(I) is a quantity which
reflects the estimation of the relevance of the image I
with respect to the user’s information needs. Initially,
every image in the database is equally likely, but as
more information on the user’s preferences becomes
available, the probability measure concentrates on a
subset of the database. The iterative relevance feed-
back scheme proposed in the present paper is based
on logistic regression analysis for ranking a set of im-
ages in decreasing order of their evaluated relevance
probabilities.
Logistic regression is based on the construction
of a linear model whose inputs, in our case, will be
the image characteristics extracted from a certain im-
age I and whose output is a function of the relevance
probability of the image in the query π(I). In logis-
tic regression analysis, one of the key features to be
established is the order of the model to be adjusted.
The order of the model must be in accordance with
the reasonable amount of feedback images requested
from the user. For example, it is not reasonable for
the user to select 40 images in each iteration; a feedback
of 5 to 10 images would be acceptable. This requirement
leads us to group the image features into $n$
smaller subsets. The outcome of this strategy is that
$n$ smaller regression models must be adjusted: each
sub-model will produce a different relevance probability
$\pi_k(I)$ ($k = 1, \ldots, n$). We then face the question
of how to combine the $\pi_k(I)$ in order to rank the
database according to the user’s preferences. OWA
(ordered weighted averaging) operators, which were
introduced by Yager (Yager, 1988), provide
a consistent and versatile way of aggregating multiple
inputs into one single output.
Section 2 explains the logistic regression approach
to the problem. Next, in section 3 the aggregation op-
erators used in our work are introduced. Section 4
describes the low level features extracted from the images
and used to retrieve them. A crucial part of this
work, the proposed algorithm, is described in detail in
section 5. After that, in section 6 we present experi-
mental results which evaluate the performance of our
technique using real-world data. Finally, in section 7
we extract conclusions and point to further work.
2 LOGISTIC REGRESSION MODEL
At each iteration, a sample is evaluated by the user
selecting two sets of images: the examples or posi-
tive images and the counter-examples or negative im-
ages. Let us consider the (random) variable Y giving
the user evaluation where Y = 1 means that the image
is positively evaluated and Y = 0 means a negative
evaluation.
Each image in the database has been previously
described by using low level features in such a way
that the $j$-th image has the $p$-dimensional feature vector
$x_j$ associated. Our data will consist of $(x_j, y_j)$,
with $j = 1, \ldots, k$, where $x_j$ is the feature vector and $y_j$
the user evaluation (1 = positive and 0 = negative). The
image feature vector x is known for any image and
we intend to predict the associated value of Y . The
natural framework for this problem is the generalized
linear model. In this paper, we have used a logistic
regression where P(Y = 1 | x) i.e. the probability that
Y = 1 (the user evaluates the image positively) given
the feature vector x, is related to the systematic part
of the model (a linear combination of the feature vec-
tor) by means of the logit function. Generalized lin-
ear models (GLMs) extend ordinary regression mod-
els to encompass non-normal response distributions
and modeling functions of the mean. Most statisti-
cal software has the facility to fit GLMs. Logistic
regression is the most important model for categor-
ical response data. Logistic regression models are
also called logit models. They have been successfully
used in many different areas including business applications
and genetics. For a binary response variable
$Y$ and $p$ explanatory variables $X_1, \ldots, X_p$, the model
for $\pi(x) = P(Y = 1 \mid x)$ at values $x = (x_1, \ldots, x_p)$ of
the predictors is
$$\operatorname{logit}[\pi(x)] = \alpha + \beta_1 x_1 + \ldots + \beta_p x_p \tag{1}$$
where $\operatorname{logit}[\pi(x)] = \ln \frac{\pi(x)}{1 - \pi(x)}$. The model can also be
stated directly specifying $\pi(x)$ as
$$\pi(x) = \frac{\exp(\alpha + \beta_1 x_1 + \ldots + \beta_p x_p)}{1 + \exp(\alpha + \beta_1 x_1 + \ldots + \beta_p x_p)}. \tag{2}$$
The parameter $\beta_i$ refers to the effect of $x_i$ on the log
odds that $Y = 1$, controlling for the other $x_j$. The model
parameters are obtained by maximizing the likelihood
function.
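As an illustration of equations (1) and (2), the following is a minimal sketch of how such a model could be fitted by maximum likelihood using Newton-Raphson iterations. The function names (`fit_logistic`, `relevance_probability`) and the toy data are assumptions made for this example and are not taken from the original system. The sketch also assumes that the positive and negative examples are not perfectly separable, since otherwise the maximum likelihood estimate does not exist.

```python
import numpy as np


def logit(p):
    """Log odds ln(p / (1 - p)), the left-hand side of equation (1)."""
    return np.log(p / (1.0 - p))


def relevance_probability(X, alpha, beta):
    """Equation (2): pi(x) = exp(eta) / (1 + exp(eta)), eta = alpha + x . beta."""
    eta = alpha + np.dot(X, beta)
    return 1.0 / (1.0 + np.exp(-eta))


def fit_logistic(X, y, n_iter=25):
    """Maximum likelihood fit by Newton-Raphson (illustrative, no safeguards)."""
    Xd = np.column_stack([np.ones(len(X)), X])  # prepend the intercept column
    theta = np.zeros(Xd.shape[1])               # theta = (alpha, beta_1..beta_p)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ theta))
        gradient = Xd.T @ (y - p)                        # score vector
        hessian = (Xd * (p * (1.0 - p))[:, None]).T @ Xd  # observed information
        theta = theta + np.linalg.solve(hessian, gradient)
    return theta[0], theta[1:]


# Toy feedback: one feature, six evaluated images (1 = relevant, 0 = not).
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])
alpha, beta = fit_logistic(X, y)
print(relevance_probability(X, alpha, beta))
```

In practice a statistical package with a GLM routine would normally be used instead of a hand-written solver; the loop above is only meant to make the link between the logit, the linear predictor and the estimated probabilities explicit.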
VISAPP 2007 - International Conference on Computer Vision Theory and Applications
In the first steps of the procedure, we have a major
difficulty when having to adjust a global regression
model in which we take the whole set of variables into
account, because the number of images (the number
of positive plus negative images chosen by the user)
is typically smaller than the number of characteristics.
In this case, the adjusted regression model has as
many parameters as there are data points, and many
relevant variables may be left out. On the
other hand it is not realistic to ask the user to make a
great number of positive and negative selections from
the very beginning; therefore we think that the dif-
ficulty cannot be avoided in this way. In order to
solve this problem, our proposal is to adjust different
smaller regression models: each model considers only
a subset of variables consisting of semantically re-
lated characteristics of the image. Consequently, each
sub-model will associate a different relevance prob-
ability to a given image x, and we face the question
of how to combine them in order to rank the database
according to the user’s preferences. We can see this
question as an information fusion problem.
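The idea of fitting one small model per group of related features can be sketched as follows. This is only an illustration under stated assumptions: the feature grouping (`groups`), the function names and the toy data are hypothetical, and the fit uses gradient ascent on a lightly ridge-penalized log-likelihood, a choice made here so that the sketch stays stable when the few feedback images happen to be perfectly separable. The paper itself only states that parameters are obtained by maximum likelihood.

```python
import numpy as np


def sigmoid(eta):
    return 1.0 / (1.0 + np.exp(-eta))


def fit_partial_model(X, y, ridge=1e-2, n_steps=5000):
    """Gradient ascent on a ridge-penalized log-likelihood (an assumption)."""
    Xd = np.column_stack([np.ones(len(X)), X])  # intercept + group features
    # Step size 1/L, with L an upper bound on the curvature of the objective.
    step = 1.0 / (0.25 * np.linalg.norm(Xd, 2) ** 2 + ridge)
    theta = np.zeros(Xd.shape[1])
    for _ in range(n_steps):
        gradient = Xd.T @ (y - sigmoid(Xd @ theta)) - ridge * theta
        theta = theta + step * gradient
    return theta


def partial_relevance_probabilities(X_train, y_train, X_db, groups):
    """Return one column of relevance probabilities per feature group."""
    columns = []
    for idx in groups:
        theta = fit_partial_model(X_train[:, idx], y_train)
        columns.append(sigmoid(theta[0] + X_db[:, idx] @ theta[1:]))
    return np.column_stack(columns)  # shape: (images in X_db, number of groups)


# Toy example: six evaluated images, four features split into two groups.
X_train = np.array([
    [0.0, 1.0, 0.2, 1.0],
    [1.0, 0.0, 0.1, 2.0],
    [2.0, 1.0, 0.9, 1.0],
    [3.0, 0.0, 0.3, 2.0],
    [4.0, 1.0, 0.8, 1.0],
    [5.0, 0.0, 0.7, 2.0],
])
y_train = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])
groups = [[0, 1], [2, 3]]  # hypothetical grouping of related features
P = partial_relevance_probabilities(X_train, y_train, X_train, groups)
print(P.shape)
```

Each row of `P` holds the probabilities that the different sub-models assign to one image; these are the values that still have to be fused into a single score.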
3 AGGREGATING THE RELEVANCE PROBABILITIES
Let us denote as $\pi_1(x), \pi_2(x), \ldots, \pi_n(x)$ the different
relevance probabilities associated with a given image
$x$. Each one of them has been obtained separately
by using different regression models and we need
to associate a final probability $\pi(x)$ by aggregating
the information provided by each $\pi_j(x)$ ($j = 1, \ldots, n$).
Mathematical aggregation operators transform a fi-
nite number of inputs into a single output and play
an important role in image retrieval. In (Stejic et al.,
2005) the authors compare the effect of 67 operators
applied to the problem of computing the overall im-
age similarity, given a collection of individual fea-
ture similarities. Their results show how important
for retrieval performance the choice of the aggregation
operator is. We have not used any of the 67
operators reviewed. Instead, we decided to use the
so-called ordered weighted averaging (OWA) operators
(Yager, 1988), which have since been successfully
applied in different areas such as decision making,
expert systems, neural networks, fuzzy systems
and control. An OWA operator of dimension $n$ is
a mapping $f : \mathbb{R}^n \to \mathbb{R}$ with an associated weighting
vector $W = (w_1, \ldots, w_n)$ such that $\sum_{j=1}^{n} w_j = 1$ and
where $f(a_1, \ldots, a_n) = \sum_{j=1}^{n} w_j b_j$, where $b_j$ is the $j$-th
largest element of the collection of aggregated objects
$a_1, \ldots, a_n$. The particular cases shown in Table 1
illustrate the idea underlying OWA operators.
Notice that no weight is associated with any particular
input; instead, the relative magnitude of the input
decides which weight corresponds to each input.
In our application, the inputs are relevance probabilities
and this property is very interesting because we
do not know, a priori, which set of visual descriptors
will provide us with the best information.
Table 1: Illustrative examples of OWA aggregation values.
$W$                            $f(a_1, \ldots, a_n)$
$(1, 0, \ldots, 0)$            $\max_i a_i$
$(0, 0, \ldots, 1)$            $\min_i a_i$
$(1/n, 1/n, \ldots, 1/n)$      $\frac{1}{n} \sum_{i=1}^{n} a_i$
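The definition of the OWA operator and the three particular cases of Table 1 can be checked with a few lines of code. This is a minimal sketch: the function name `owa` and the sample probabilities are assumptions made for the example.

```python
import numpy as np


def owa(values, weights):
    """OWA aggregation: sum_j w_j * b_j, with b_j the j-th largest input."""
    a = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    if a.shape != w.shape:
        raise ValueError("values and weights must have the same length")
    if np.any(w < 0) or not np.isclose(w.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    b = np.sort(a)[::-1]  # reorder the inputs in decreasing order
    return float(np.dot(w, b))


# Three partial relevance probabilities for one image (toy values).
probabilities = [0.2, 0.9, 0.5]
print(owa(probabilities, [1.0, 0.0, 0.0]))        # behaves like max
print(owa(probabilities, [0.0, 0.0, 1.0]))        # behaves like min
print(owa(probabilities, [1 / 3, 1 / 3, 1 / 3]))  # arithmetic mean
```

Because the weights are attached to positions in the sorted list rather than to particular sub-models, permuting the inputs does not change the result.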
As OWA operators are bounded by the max and
min operators, Yager introduced a measure called orness
to characterize the degree to which the aggregation
is like an or (max) operation:
$$\operatorname{orness}(W) = \frac{1}{n-1} \sum_{i=1}^{n} (n-i)\, w_i. \tag{3}$$
This author also introduced the concept of dispersion
or entropy associated with a weighting vector:
$$\operatorname{Disp}(W) = -\sum_{i=1}^{n} w_i \ln w_i. \tag{4}$$
$\operatorname{Disp}(W)$ tries to reflect how much of the information
in the arguments is used during an aggregation based
on $W$.
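Both measures are straightforward to compute. The sketch below evaluates equations (3) and (4) for the weighting vectors of Table 1; the function names are assumptions made for the example, and terms with $w_i = 0$ are taken to contribute zero to the dispersion (the usual convention $0 \ln 0 = 0$).

```python
import numpy as np


def orness(weights):
    """Equation (3): (1 / (n - 1)) * sum_i (n - i) * w_i, with i = 1..n."""
    w = np.asarray(weights, dtype=float)
    n = len(w)
    i = np.arange(1, n + 1)
    return float(np.sum((n - i) * w) / (n - 1))


def dispersion(weights):
    """Equation (4): -sum_i w_i * ln(w_i), using the convention 0 * ln 0 = 0."""
    w = np.asarray(weights, dtype=float)
    positive = w[w > 0]
    return float(-np.sum(positive * np.log(positive)))


for W in ([1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [1 / 3, 1 / 3, 1 / 3]):
    print(W, orness(W), dispersion(W))
```

The max operator has orness 1 and the min operator has orness 0, both with zero dispersion, while the uniform weights have orness 0.5 and the largest possible dispersion, $\ln n$.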
Clearly, the vector of weights W can be pre-fixed,
but a number of approaches have also been sug-
gested for determining it according to different cri-
teria. One of the first methods developed was pro-
posed by O’Hagan (O’Hagan, 1988). It provides us
with the vector of weights for a given level of orness
(optimism) which maximizes their entropy:
$$W = \arg\max \left( -\sum_{i=1}^{n} w_i \ln w_i \right)$$
subject to
$$\alpha = \frac{1}{n-1} \sum_{i=1}^{n} (n-i)\, w_i, \qquad \sum_{i=1}^{n} w_i = 1, \qquad w_i \in [0, 1].$$
This problem is not computationally easy to solve.
Fuller and Majlender (Fuller and Majlender, 2003)
have obtained the analytical expression of the maxi-
mum entropy weights.
Figure 1 shows the aggregation weights for
$n = 10$ obtained with the above-mentioned method
for orness values $\alpha \in [0.3, 0.7]$. In this work, the
aggregation weights have been computed by using this
method.
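For readers who want to reproduce weights of this kind, the sketch below solves the maximum entropy problem numerically. It does not use the analytical expression of Fuller and Majlender. Instead it relies on the fact that, for an entropy objective with linear constraints, the stationarity conditions make $\ln w_i$ linear in $(n - i)$, so the weights form a geometric progression governed by a single multiplier, which is found here by bisection. The function names and the bisection scheme are assumptions of this example.

```python
import numpy as np


def orness(weights):
    n = len(weights)
    i = np.arange(1, n + 1)
    return float(np.sum((n - i) * weights) / (n - 1))


def max_entropy_owa_weights(n, alpha, n_bisections=200):
    """Maximum entropy OWA weights with orness alpha, 0 < alpha < 1."""
    if n < 2 or not 0.0 < alpha < 1.0:
        raise ValueError("need n >= 2 and 0 < alpha < 1")
    r = (n - np.arange(1, n + 1)) / (n - 1)  # r_1 = 1, ..., r_n = 0

    def weights(lam):
        z = lam * r
        w = np.exp(z - z.max())  # shift for numerical stability
        return w / w.sum()

    # orness(weights(lam)) increases with lam, so bracket the root and bisect.
    lo, hi = -1.0, 1.0
    while orness(weights(lo)) > alpha:
        lo *= 2.0
    while orness(weights(hi)) < alpha:
        hi *= 2.0
    for _ in range(n_bisections):
        mid = 0.5 * (lo + hi)
        if orness(weights(mid)) < alpha:
            lo = mid
        else:
            hi = mid
    return weights(0.5 * (lo + hi))


for alpha in (0.3, 0.5, 0.7):
    print(alpha, np.round(max_entropy_owa_weights(10, alpha), 4))
```

With $\alpha = 0.5$ the procedure returns uniform weights; values above 0.5 shift weight towards the largest inputs and values below 0.5 towards the smallest.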
4 VISUAL FEATURES
This section deals with the low level features the sys-
tem uses for predicting human judgment of image
is precisely to capture that notion of similarity that
each user has, which can also change between differ-
ent queries. Consequently, the valid criterion of sim-
ilarity appears to be the user’s opinion. This would
have introduced an external variable into the experi-
ment that would have masked the main goal: an ob-
jective evaluation of the system as such. That is why
we have chosen to use an approach in which a given
image has to be found. The search is considered successful
if the image is ranked within the first 16. This
number is arbitrary but we have checked that 16 images
shown side by side is a reasonable number to
locate a particular one at first sight.
Once the criterion for termination had been
adopted, the experiment was designed by showing
several target images to the user; a set of 6 images (the
same for all users) was selected from a database of
about 4700. The database images are classified as belonging
to different themes such as flowers, horses, paintings,
skies, textures, ceramic tiles, buildings, clouds, trees,
etc., even though the category is not used at all during
the search. The 6 target images are, in our experience,
representative of different themes and levels of difficulty.
They are displayed in figure 2.
Figure 2: Target images used in experiments.
For each target image the search proceeds itera-
tively. In each iteration the user has to select some rel-
evant images (similar to the target according to his/her
judgment) and others significantly different from the
target. The number of images of each type is left to
the user, although two conditions must be fulfilled: at
least one relevant and one irrelevant image must be
selected and the total number of selections has to be
greater than 4. The algorithm proceeds as explained
in previous sections and the images are ranked. If the
target appears in the first 16, it is considered to have
been found; otherwise the user can move backwards
or forwards to see more images in rank order and a
new iteration of choosing/search/showing begins.
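The two rules of this protocol, the conditions on the user's selections and the success criterion, can be written down compactly. The sketch below is only an illustration: the function names are hypothetical, and the relevance probabilities are toy values standing in for the aggregated scores produced by the system.

```python
import numpy as np


def valid_feedback(n_relevant, n_irrelevant):
    """At least one image of each type and more than 4 selections in total."""
    return n_relevant >= 1 and n_irrelevant >= 1 and n_relevant + n_irrelevant > 4


def target_rank(probabilities, target_index):
    """1-based position of the target when images are sorted by decreasing probability."""
    p = np.asarray(probabilities, dtype=float)
    order = np.argsort(-p, kind="stable")
    return int(np.nonzero(order == target_index)[0][0]) + 1


def target_found(probabilities, target_index, window=16):
    """The search succeeds when the target is ranked within the first `window` images."""
    return target_rank(probabilities, target_index) <= window


# Toy scores for a database of 100 images; image 99 has the highest score.
scores = np.linspace(0.0, 1.0, 100)
print(target_rank(scores, 99), target_found(scores, 99))
print(target_rank(scores, 50), target_found(scores, 50))
print(valid_feedback(1, 4), valid_feedback(2, 2))
```

An experiment then consists of repeating the feedback, fitting and ranking cycle and counting the iterations until `target_found` becomes true.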
To ensure that the experiments are not biased, the
query tasks were performed by a group of 40 users
who had not been involved in the design and development
of the system and had no knowledge of the
content of the database or of the retrieval features and
methods used (untrained users).
Table 2: Average (standard deviation), maximum and minimum number of iterations needed to
find a target image.
Image       Average (std. dev.)   Max   Min
Car         5.17 (2.95)           12    1
Flower      4.17 (3.20)           17    1
Butterfly   4.71 (3.70)           19    1
Firework    2.14 (1.81)            9    1
Miro        3.67 (1.55)            8    2
Glass       3.42 (1.52)            6    1
All         3.88 (1.07)           19    1
Table 2 shows the average and standard deviation
of the number of iterations needed to find images by
these untrained users. The last row shows the aver-
age for all images and users. The experiments exhibit
good performance in finding a target image (3.88 iter-
ations in average) in the used database.
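As a quick arithmetic check of Table 2, the values in the last row can be recovered from the six per-image averages. One reading, which is an observation about the numbers and is not stated in the text, is that the "All" row reports the mean and the sample standard deviation of those six averages rather than statistics over all individual searches.

```python
from statistics import mean, stdev

# Per-image average number of iterations, copied from Table 2.
per_image_average = {
    "Car": 5.17,
    "Flower": 4.17,
    "Butterfly": 4.71,
    "Firework": 2.14,
    "Miro": 3.67,
    "Glass": 3.42,
}

values = list(per_image_average.values())
overall_mean = mean(values)  # mean of the six per-image averages
overall_sd = stdev(values)   # sample standard deviation (n - 1 denominator)
print(round(overall_mean, 2), round(overall_sd, 2))  # prints 3.88 1.07
```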
7 CONCLUSION
This paper addresses the problem of image retrieval
by means of an algorithm based on logistic regression.
The main advantage of the method is the ease of
incorporating the feedback of the user. Its main draw-
back is the lack of sufficient information (too small
sample) to fit the model, since the number of inputs
(image features) is usually high. This has been ad-
dressed by means of partial models that get the output
from each subset of the inputs. The problem of com-
bining the information of the different models, which
is a data fusion problem, is solved by using an ordered
weighted averaging (OWA) operator.
Concerning the experimental results, the average
number of iterations shown in Table 2 indicates good
performance of the procedure. Further experimentation
and analysis of results are currently being carried out
by our research group, in which users are grouped and
classified with regard to their interaction with the
iterative process of image selection.
REFERENCES
de Ves, E., Domingo, J., Ayala, G., and Zuccarello, P.
(2006). A novel bayesian framework for relevance
feedback in image content-based retrieval systems.
Pattern Recognition, 39:1622–1632.
Fuller, R. and Majlender, P. (2003). On obtaining minimal
variability owa operator weights. Fuzzy Sets and Sys-
tems, 136:203–215.
O’Hagan, M. (1988). Aggregating template or rule an-
tecedents in real-time expert systems with fuzzy set
logic. In Proc. of 22nd Annu. IEEE Asilomar Conf. on
Signals, pages 681–689, Pacific Grove, CA.
Rui, Y., Huang, S., Ortega, M., and Mehrotra, S. (1998).
Relevance feedback: a power tool for interactive
content-based image retrieval. IEEE Transactions on
Circuits and Systems for Video Technology, 8(5).
Smeulders, A., Santini, S., Gupta, A., and Jain, R. (2000).
Content-based image retrieval at the end of the early
years. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 22(12):1349–1379.
Stejic, Z., Takama, Y., and Hirota, K. (2005). Mathemati-
cal aggregation operators in image retrieval: effect on
retrieval performance and role in relevance feedback.
Signal processing, 85:1297–324.
Yager, R. (1988). On ordered weighted averaging aggrega-
tion operators in multi-criteria decision making. IEEE
Trans. Systems Man Cybernet, 18:183–190.
Zhou, X. and Huang, T. (2003). Relevance feedback for
image retrieval: a comprehensive review. Multimedia
systems, 8(6):536–544.