RERANKING WITH CONTEXTUAL DISSIMILARITY MEASURES
FROM REPRESENTATIONAL BREGMAN K-MEANS
Olivier Schwander and Frank Nielsen
École Polytechnique, Palaiseau/Cachan, France
ENS Cachan, Cachan, France / Sony Computer Science Laboratories Inc, Tokyo, Japan
Keywords:
Image retrieval, Bregman divergences, Alpha divergences, Clustering, Reranking, Context.
Abstract:
We present a novel reranking framework for Content Based Image Retrieval (CBIR) systems based on
contextual dissimilarity measures. Our work revisits and extends the method of Perronnin et al. (Perronnin et al.,
2009), which introduces a way to build contexts that are used in turn to design contextual dissimilarity measures for
reranking. Instead of using truncated rank lists from a CBIR engine as contexts, we use a clustering
algorithm to group similar images from the rank list. We introduce the representational Bregman divergences
and further generalize the Bregman k-means clustering by considering an embedding representation. These
representation functions allow one to interpret α-divergences/projections as Bregman divergences/projections
on α-representations. Finally, we validate our approach by presenting experimental results on ranking
performance on the INRIA Holidays database.
1 INTRODUCTION
Our work is grounded in the field of Content Based
Image Retrieval (CBIR): given a query image, we
search for similar images in a large dataset of images.
Results are displayed in the form of a rank list where
images are ordered with respect to their similarity to
the query image. Typical CBIR systems manipulate
databases of one million images or more (see (Douze
et al., 2009; Jégou et al., 2008; Sivic and Zisserman,
2003) for recent works and (Datta et al., 2008) for a
comprehensive survey of the field).
Contextual similarity measures are a way to
algorithmically design new similarity measures tailored
to the datasets/queries. The word "context" carries
different meanings in the literature. On
the one hand, it can refer to the transformation of a
classical divergence D(p, q) into a local divergence
D′(p, q) = δ(p)δ(q)D(p, q), where the local distance
between two points depends on the neighborhood of
these two points. This idea was in particular explored
in (Jégou et al., 2007), which uses a conformal
deformation of the geometry (Wu and Amari, 2002). On
the other hand, the notion of context can also refer to
a reranking stage with a similarity measure built on
the rank list returned by a CBIR system, as developed
in (Perronnin et al., 2009). The goal is not only to im-
prove the retrieval accuracy but also to get an ordering
that is close to the intent of the user.
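As an illustration of the first notion of context, the local divergence D′(p, q) = δ(p)δ(q)D(p, q) can be sketched in a few lines of Python. The choice of δ made here (the inverse of the mean distance from a point to its k nearest neighbors), the base divergence (squared Euclidean) and all function names are assumptions made for the example; they are not the weights used in the cited works.

```python
def sq_euclidean(p, q):
    """Base divergence D(p, q): squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def neighborhood_scale(p, dataset, k, dist=sq_euclidean):
    """Hypothetical delta(p): inverse of the mean distance from p to its
    k nearest neighbors in `dataset` (p itself is skipped by identity)."""
    dists = sorted(dist(p, x) for x in dataset if x is not p)
    nearest = dists[:k]
    mean = sum(nearest) / len(nearest)
    # Fall back to a neutral weight if all neighbors coincide with p.
    return 1.0 / mean if mean > 0 else 1.0


def local_divergence(p, q, dataset, k=2, dist=sq_euclidean):
    """D'(p, q) = delta(p) * delta(q) * D(p, q)."""
    return (neighborhood_scale(p, dataset, k, dist)
            * neighborhood_scale(q, dataset, k, dist)
            * dist(p, q))


if __name__ == "__main__":
    # One tight group near the origin and one loose group far away.
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 14.0), (14.0, 10.0)]
    print(local_divergence(data[0], data[1], data))
    print(local_divergence(data[3], data[4], data))
```

With this particular δ, a given raw distance counts for more inside a dense neighborhood than inside a sparse one, which is the kind of neighborhood-dependent rescaling the formula describes.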
The system of Perronnin et al. (Perronnin et al., 2009)
addresses this problem by building contexts and
averaging the distances obtained for each context. In this
case, contexts are defined as the centroids of truncated
rank lists of growing size. We propose to improve this
process by building contexts in a more meaningful
way: instead of taking the N nearest neighbors of the
query, we cluster the rank list and use the centroids of
the clusters as contexts in order to naturally take into
account the semantics of the rank list. We then use an
averaging process to get a single similarity score with
which to rerank the image matches.
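The two ways of building contexts described above can be contrasted with a minimal Python sketch. It assumes that images are represented as equal-length numeric tuples, that cluster labels are already available from some clustering step, and that the per-context dissimilarity is supplied by the caller as `measure`; that callable and all function names are hypothetical, since this section does not define the per-context measure.

```python
def centroid(points):
    """Arithmetic mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def truncated_contexts(rank_list, sizes):
    """Baseline contexts: centroids of the top-N prefixes of the rank list,
    for growing values of N."""
    return [centroid(rank_list[:n]) for n in sizes]


def cluster_contexts(rank_list, labels):
    """Cluster-based contexts: one centroid per cluster of the rank list."""
    groups = {}
    for point, label in zip(rank_list, labels):
        groups.setdefault(label, []).append(point)
    return [centroid(groups[label]) for label in sorted(groups)]


def contextual_score(query, candidate, contexts, measure):
    """Average of the per-context values measure(query, candidate, context)."""
    return sum(measure(query, candidate, c) for c in contexts) / len(contexts)


if __name__ == "__main__":
    rank_list = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0), (12.0, 10.0)]
    print(truncated_contexts(rank_list, sizes=[2, 4]))
    print(cluster_contexts(rank_list, labels=[0, 0, 1, 1]))
```

In this toy rank list, the largest truncated prefix averages two unrelated groups into a single centroid, whereas the cluster-based contexts keep one centroid per group.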
Instead of using a classical k-means clustering
algorithm based on the squared Euclidean distance,
we introduce a modified clustering
algorithm based on α-divergences (see (Amari,
2007; Amari and Nagaoka, 2007)). The family
of information-theoretic α-divergences is provably
better suited to handling the histogram distributions at the
core of many CBIR systems (e.g., bag of words). We
extend the Bregman k-means algorithm introduced by
Banerjee et al. (Banerjee et al., 2005; Nock et al.,
2008).
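To make the idea of a k-means built on α-divergences concrete, here is a minimal Lloyd-style sketch in Python. It relies on the standard definitions, which are not spelled out in this section and are therefore assumptions of the example: Amari's α-divergence for positive vectors, D_α(p‖q) = 4/(1−α²) Σ [ (1−α)/2 p + (1+α)/2 q − p^((1−α)/2) q^((1+α)/2) ] for α ≠ ±1, and the α-representation k(x) = 2/(1−α) x^((1−α)/2). The function names, the initialization and the fixed iteration count are hypothetical, and the sketch is not claimed to match the exact algorithm developed in the paper.

```python
def alpha_divergence(p, q, alpha):
    """Amari alpha-divergence between positive vectors (alpha != +1, -1)."""
    a, b = (1 - alpha) / 2, (1 + alpha) / 2
    total = sum(a * x + b * y - x ** a * y ** b for x, y in zip(p, q))
    return 4.0 / (1 - alpha ** 2) * total


def alpha_rep(x, alpha):
    """alpha-representation k(x) = 2/(1-alpha) * x^((1-alpha)/2)."""
    return 2.0 / (1 - alpha) * x ** ((1 - alpha) / 2)


def alpha_rep_inverse(y, alpha):
    """Inverse of the alpha-representation."""
    return ((1 - alpha) / 2 * y) ** (2.0 / (1 - alpha))


def alpha_centroid(points, alpha):
    """Minimizer c of sum_i D_alpha(p_i || c): the arithmetic mean taken
    coordinate-wise in the representation space, then mapped back."""
    n = len(points)
    return tuple(
        alpha_rep_inverse(sum(alpha_rep(x, alpha) for x in coords) / n, alpha)
        for coords in zip(*points)
    )


def alpha_kmeans(points, init_centroids, alpha, iterations=20):
    """Lloyd-style loop: assign each point to the centroid with the smallest
    alpha-divergence, then recompute each centroid from its members."""
    centroids = list(init_centroids)
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)),
                key=lambda j: alpha_divergence(p, centroids[j], alpha))
            for p in points
        ]
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[j] = alpha_centroid(members, alpha)
    return centroids, labels


if __name__ == "__main__":
    histograms = [(1.0, 1.0), (1.2, 0.9), (9.0, 8.0), (8.5, 9.5)]
    print(alpha_kmeans(histograms, [histograms[0], histograms[2]], alpha=0.5))
```

The centroid step is where the representation matters: because the α-divergence behaves as a Bregman divergence on the α-representations, the minimizer is an ordinary arithmetic mean of the represented points, which is the property exploited by Bregman k-means.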
Finally, we evaluate our clustering and reranking
framework, based on the novel contextual similarity
measures, on the INRIA Holidays dataset (Jégou
et al., 2008).
Schwander O. and Nielsen F. (2010).
RERANKING WITH CONTEXTUAL DISSIMILARITY MEASURES FROM REPRESENTATIONAL BREGMAN K-MEANS.
In Proceedings of the International Conference on Computer Vision Theory and Applications, pages 118-123
DOI: 10.5220/0002842901180123
Copyright © SciTePress