Initialization Framework for Latent Variable Models
Heydar Maboudi Afkham, Carl Henrik Ek and Stefan Carlsson
Computer Vision and Active Perception Lab., KTH, Stockholm, Sweden
Keywords:
Latent Variable Models, Clustering, Classification, Localization.
Abstract:
In this paper, we discuss the properties of a class of latent variable models in which each labeled sample is
associated with a set of different features, with no prior knowledge of which feature is the most relevant one
to use. Deformable Part Models (DPM) are a good example of such models. While the Latent
SVM framework (LSVM) has proven to be an efficient tool for solving these models, we will argue that the
solution found by this tool is very sensitive to the initialization. To decrease this dependency, we propose a
novel clustering procedure for these problems that finds cluster centers shared by several sample sets
while ignoring the remaining cluster centers. As we will show, these cluster centers provide a robust
initialization for the LSVM framework.
1 INTRODUCTION
Latent variable models are known for their flexibility
in adapting to the variations of the data. In this paper,
we focus on a specific class of latent variable models
for discriminative learning. In these models, it is
assumed that a set of features is associated with each
labeled sample, and the role of the latent variable
is to select a feature from this set to be used in the
calculations. In both the training and testing stages, these
models do not assume that prior knowledge of
which feature should be used is available. Deformable
Part Models (DPM) (Felzenszwalb et al., 2010;
Felzenszwalb and Huttenlocher, 2005) can be seen
as a good example of these models. With the aid of
the Latent SVM framework (LSVM), DPM gives samples a degree
of freedom, in terms of relocatable
structures, to adapt to the intra-class variation. As a
result of this flexibility, the appearance of the samples
becomes more unified and the training framework
can learn a more robust classifier over the training
samples. A good example of the model discussed
in this paper can be found within the original DPM
framework (Felzenszwalb et al., 2010): the
method does not assume that the ground-truth bound-
ing boxes are perfectly aligned, but instead relocates
the bounding boxes to find a better alignment between
the samples, and the location
of this alignment is treated as a latent variable. In
a more complex example (Yang et al., 2012; Kumar
et al., 2010), the task is to train an object detector
without prior knowledge of the location of
the object in the image, treating this location as a latent
variable. Here, it is left to the learning framework
to both locate the object and train the detector for
finding it in the test images. Looking at the solutions
provided for these examples, we can see that they
are either guided by a high level of supervision, such
as considering the alignment to be close to the user
annotation (Felzenszwalb and Huttenlocher, 2005;
Azizpour and Laptev, 2012), or guided by the bias of
the dataset, such as considering the initial location
to be in the center of the image, in a dataset in
which most of the objects are already located at the
center of the images (Yang et al., 2012; Kumar et al.,
2010). In general, such weakly supervised learning
problems are considered to be among the hardest
problems in computer vision, and to our knowledge
no successful general solution has been proposed
for them. This is because, with no prior knowledge of
what an object looks like, and acknowledging the fact
that different image descriptors such as HOG (Dalal
and Triggs, 2005) and SIFT (Lowe, 2004) are not
accurate enough, finding the perfect correspondence
between the samples becomes a very challenging
problem.
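To make the role of the latent variable concrete, the selection step shared by the models above can be sketched as follows. This is a minimal illustration, not the paper's implementation: each sample carries several candidate feature vectors (e.g. features from shifted bounding boxes), and the latent variable picks the candidate that scores highest under the current linear weights, as in LSVM-style inference. The names `score` and `infer_latent` are illustrative.

```python
def score(w, phi):
    """Linear score w . phi of a single candidate feature vector."""
    return sum(wi * pi for wi, pi in zip(w, phi))

def infer_latent(w, candidates):
    """Latent-variable inference: pick the candidate feature with the
    highest score under the current weight vector w."""
    best = max(range(len(candidates)), key=lambda z: score(w, candidates[z]))
    return best, score(w, candidates[best])

# A toy sample with three candidate features (e.g. shifted bounding boxes):
w = [1.0, -0.5]
candidates = [[0.2, 0.9], [1.0, 0.1], [0.5, 0.5]]
z, s = infer_latent(w, candidates)  # selects candidate index 1
```

Training alternates this selection with updating `w`, which is why a poor initial `w` (or poor initial latent choices) can lock the model into a bad local solution.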
In this paper, we address the problem of supervi-
sion in the models mentioned above and ask two questions:
“Will the training framework still hold if no cue about
the object is given to the model?” and, if the model
does not hold, “How can we formulate the desirable
solution and automatically push the latent variables
toward it?”. To answer these questions,
DOI: 10.5220/0004826302270232
In Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods (ICPRAM-2014), pages 227-232
ISBN: 978-989-758-018-5
Copyright © 2014 SCITEPRESS (Science and Technology Publications, Lda.)