In addition, image retrieval systems usually fix a given set of visual features in advance and use it for every application domain. Thus, two queries may yield completely different precision values (high precision of the retrieval results for one, low precision for the other) even though the retrieval process relies on the same features. In this case, the semantic information that can be deduced from image annotations is ignored, and retrieval is performed statically, without taking the specific semantic content of the image into account.
Selecting the set of visual features for a query image according to its semantic content is therefore an appealing alternative.
Our purpose is first to provide a set of relevant visual features for each query image, given its semantic content. This improves retrieval accuracy, substantially reduces the number of feature vectors, and decreases processing time. The aims of this paper can thus be summarized as follows:
• Building visual feature collections according to semantic content. These collections associate, with each concept, a set of suitable visual features to be applied to queries containing that concept. Suitability is deduced from the different application domains reviewed in the literature.
• Integrating relevant visual feature selection into the retrieval phase, thus allowing a dynamic retrieval process based on the semantic content of the query image. Feature selection uses the image annotation to select the visual feature collections and then extracts the most relevant features (see the sketch after this list).
3 RELEVANT VISUAL FEATURE SELECTION FOR IMAGE RETRIEVAL
The proposed approach is part of a broader line of research in which a hybrid retrieval approach has been defined (Allani et al., 2014). Our idea is to use both visual features and textual content in order to perform pattern-based retrieval. Our work has focused on structuring the image dataset into a set of patterns that are semantically and visually rich. Results are then retrieved using a similarity measure computed between the query and the patterns.
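As a rough illustration of this matching step, the following Python sketch ranks patterns by a cosine similarity between the query descriptor and one descriptor per pattern. The single-vector pattern representation is a simplifying assumption made here for illustration, not the pattern structure of (Allani et al., 2014).

import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def rank_patterns(query_vec, patterns):
    """Rank patterns by decreasing similarity to the query descriptor.
    `patterns` maps a pattern identifier to its descriptor vector."""
    scores = {pid: cosine_similarity(query_vec, vec)
              for pid, vec in patterns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)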
Let us consider a query image composed of the "sky", two "divers", the "ocean", a "sand beach" and "clouds". This image represents different objects and can therefore be described using different visual features. Whereas the "sky", the "ocean" and the "sand beach" are characterized by their uniform texture, the "divers" and the "clouds" are characterized by their specific shapes (the shape of a person, the shape of a cloud). Moreover, the image carries several meta-data characteristics. Our aim is to apply, when retrieving similar images, the features suited to the semantic content and the meta-data of the image.
Shape features can be used to index images representing distinctive shapes, and texture features likewise for textured content. Meta-data characteristics can also be used to select appropriate visual features. For example, a high resolution makes feature extraction time-consuming, so features with high computational complexity (e.g., region-based shape features) should not be applied to such images. As a result, for this specific image, using texture and shape features such as the Edge Histogram and the Canny edge detector, the latter providing a contour-based shape feature, can yield more relevant retrieval results because these features are suited to the query image.
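The following sketch illustrates such a meta-data rule in Python. The cost labels attached to each feature and the pixel threshold are assumed values chosen for illustration, not parameters taken from our system.

# Relative extraction cost of each candidate feature (assumed values).
FEATURE_COST = {
    "edge_histogram": "low",   # texture feature, cheap to extract
    "canny_contour":  "low",   # contour-based shape feature
    "region_shape":   "high",  # region-based shape feature
}

def filter_by_resolution(features, width, height, max_pixels=4_000_000):
    """Drop high-complexity features for high-resolution images, where
    their extraction would be too time-consuming."""
    if width * height <= max_pixels:
        return features
    return {f for f in features if FEATURE_COST.get(f) != "high"}

# For a 24-megapixel image, region-based shape features are discarded.
print(filter_by_resolution({"edge_histogram", "region_shape"}, 6000, 4000))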
The overall architecture of our approach is illustrated in Figure 1. The process begins with building a set of visual feature collections dedicated to specific concepts or meta-data characteristics. This building process is performed given a set of annotated images and a set of suitability rules deduced from the literature. The image dataset can be updated as new images are added to it.
As depicted in Figure 1, our retrieval process, based on relevant feature selection, is performed in two phases: an online phase and an offline phase. The different steps of our retrieval approach are detailed in the following paragraphs.
The first step is to specify the set of candidate visual features over which the selection mechanism will operate. Visual feature vectors are computed on the image dataset and stored (cf. Figure 1, Step (1)). They are then clustered into regions (cf. Figure 1, Step (3)).
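A minimal sketch of Steps (1) and (3) is given below, using per-patch mean color as a stand-in for the actual candidate descriptors and k-means as an assumed clustering method; both choices are illustrative, not the descriptors or clustering algorithm of our system.

import numpy as np
from sklearn.cluster import KMeans

def patch_features(image, patch=16):
    """Compute a mean-color feature vector for each patch of an
    (h, w, 3) image array."""
    h, w, _ = image.shape
    feats = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            block = image[y:y + patch, x:x + patch]
            feats.append(block.reshape(-1, 3).mean(axis=0))
    return np.array(feats)

def cluster_into_regions(feats, n_regions=5):
    """Cluster patch features so that visually similar patches form a
    region; returns one region label per patch."""
    return KMeans(n_clusters=n_regions, n_init=10).fit_predict(feats)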
Concepts are then extracted from the image annotations. A disambiguation step, based on WordNet (http://wordnet.princeton.edu/), is performed in order to retrieve the correct synset (sense) corresponding to each word (cf. Figure 1, Step (2)). The set of concepts associated with the whole image dataset is stored. Finally, a unification step, based on WordNet and aiming to obtain common super-concepts, is performed. For example, when two image annotations contain the words "Laguna Colorado" and "Green Lake", the co-occurrences of the two words are substituted by their lowest common ancestor, which is "lake".
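The following sketch illustrates both steps with NLTK's WordNet interface (it requires the WordNet corpus, e.g. via nltk.download('wordnet')). The Lesk algorithm stands in here for whatever disambiguation method the system actually uses, and since "Laguna Colorado" and "Green Lake" are not WordNet entries, two generic lake hyponyms are used in their place.

from nltk.corpus import wordnet as wn
from nltk.wsd import lesk

# Disambiguation: pick the synset (sense) of a word from its
# annotation context (cf. Step (2)).
context = "two divers swim in the ocean near the sand beach".split()
ocean_sense = lesk(context, "ocean")

# Unification: substitute co-occurring concepts by their lowest
# common ancestor in the WordNet hierarchy.
a = wn.synset("tarn.n.01")    # a mountain lake
b = wn.synset("lagoon.n.01")  # a shallow body of water
common = a.lowest_common_hypernyms(b)
print(common)                 # expected: [Synset('lake.n.01')]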
The previously described steps yield a visually and semantically indexed image dataset. Next we define, given the semantic content of the dataset,