has been acquired. This is useful in an entertainment context, in which one wants to fill
a geo-located image database with non-tagged photos. Another context is the
forensic one, where it is essential to constrain the possible area in which a picture
has been taken.
A similar problem was addressed a few years ago, under the name of the location
recognition task, as an open research contest (see footnote 4). There, contestants were
given a collection of color
images taken by a calibrated digital camera. The photographs had been taken at various
locations in a small city neighborhood, often sharing overlapping fields of view or
certain objects in common. The GPS locations for a subset of the images were provided.
The goal of the contest was to guess, as accurately as possible, the GPS locations of
the unlabeled images. Essentially, all the proposed approaches were based
on the reconstruction of 3D scenes through the registration of several images with
overlapping fields of view. The position of non-geo-located test images
was then inferred from that 3D model. An example of such a framework is
proposed in [4].
In our situation, the task is much harder: heterogeneous pictures taken far from each
other, at different times of the day, have to be managed. This is a difficult problem and,
to the best of our knowledge, no solution currently exists. Given the vast variety
of existing geographical settings, it seems reasonable to stop relying on the
geometric content encoded in the pictures, and to build a recognition technique based
on 2D pictorial image features.
In this paper, we address the problems of geo-clustering and geo-location recognition
of images, in the context of a large geo-located image database. We show how,
using well-known techniques from the literature, such as Probabilistic Latent Semantic
Analysis, Mean Shift clustering and the Support Vector Machine framework, strong and
effective results can be achieved, providing valuable solutions to the problems discussed
above.
The rest of the paper is organized as follows. In Sec. 2, the mathematical background
is reported. Then, in Sec. 3, the outline of our system for geo-clustering and
geo-location recognition is detailed. Sec. 4 presents the experiments carried out on large
databases taken from Panoramio, and, finally, Sec. 5 concludes the paper, outlining
future perspectives.
2 Mathematical Background
2.1 Probabilistic Latent Semantic Analysis
In this section, we briefly review probabilistic Latent Semantic Analysis (pLSA),
in its adaptation to visual data. We describe the model using the classical terminology
of the literature on text classification, alongside the corresponding terms of the image domain.
The input is a dataset of D documents (images), each containing local regions found
by interest operators, whose appearance has been quantized into W visual words [5].
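As an illustration of this encoding, the following minimal sketch assigns each local descriptor to its nearest visual word and accumulates the per-image counts into a W x D table. It is only a sketch under stated assumptions: the codebook, the toy 2D descriptors, and the function names nearest_word and cooccurrence_matrix are hypothetical, and the table entries are assumed to be plain occurrence counts, as is usual in pLSA.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_word(descriptor: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook entry closest to the descriptor.

    Closeness is measured by squared Euclidean distance (an assumption;
    any quantization rule could be substituted here).
    """
    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda w: sq_dist(descriptor, codebook[w]))


def cooccurrence_matrix(
    images: Sequence[Sequence[Vector]], codebook: Sequence[Vector]
) -> List[List[int]]:
    """Build a W x D table whose entry [w][d] counts how many local
    regions of image d were quantized to visual word w."""
    num_words, num_docs = len(codebook), len(images)
    counts = [[0] * num_docs for _ in range(num_words)]
    for d, descriptors in enumerate(images):
        for desc in descriptors:
            counts[nearest_word(desc, codebook)][d] += 1
    return counts


# Toy data (hypothetical): W = 3 visual words, D = 2 images.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
images = [
    [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8)],  # image 0: three local regions
    [(4.8, 5.1), (0.2, 0.0)],               # image 1: two local regions
]
counts = cooccurrence_matrix(images, codebook)
for row in counts:
    print(row)
```

Each column of the resulting table is the bag-of-visual-words histogram of one image; spatial layout of the regions is discarded.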
Therefore, the dataset is encoded by a co-occurrence matrix of size W × D, where the
4 Where Am I? ICCV Computer Vision Contest; see
http://research.microsoft.com/iccv2005/Contest/