chitectures are failing to capture the local properties.
This figure also shows that no single architecture is
best for all the object classes; the choice of
architecture is entirely object dependent. This can
be seen in the likelihoods obtained from the plane and
the cow classes, where the base likelihoods are similar
but the responses obtained from the different architec-
tures are different.
7 CONCLUSIONS AND FUTURE WORK
The main objective of this work has been to investi-
gate the improvement in discriminability obtained by
substituting simple local features with local adaptive
composite hierarchical structures that are computed at
recognition time from a set of potential structures
denoted as “cloud features”. This is motivated by the
fact that even at the local feature level, intra-class object
variation is very large, implying that generic single-feature
classifiers that try to capture this variation will
be very difficult to design. In our approach this difficulty
is circumvented by the introduction of cloud
features, which capture the intra-class variation at the
feature level. The price paid is, of course, a more complex
extraction process, since the local features are
computed by an optimization in order to
yield maximally efficient features. We believe, however,
that this process can be made efficient by exploiting
the dependencies and similarities between local
feature variations that are induced by the global
intra-class object variation.
There are many ways to improve the performance
and accuracy of the cloud features and investigate
their applications. As mentioned in the text, coming
up with better optimization algorithms will decrease
the usage cost of these features. Meanwhile, designing
algorithms that learn the architectures rather than
hard-coding them will increase the accuracy of these
features. As for applications, these features can
be used in different object detection and recognition
platforms. A direct follow-up of this work is to use
these features to build more robust object class
detectors. Since the cloud features
result from a clustering process rather than a discriminative
analysis, they can also be used in bag-of-words
models, where they will yield more discriminative words
and smoother labeled regions.
ACKNOWLEDGEMENTS
This work was supported by The Swedish Foundation
for Strategic Research in the project Wearable Visual
Information Systems.
REFERENCES
Bosch, A., Zisserman, A., and Munoz, X. (2007). Repre-
senting shape with a spatial pyramid kernel. In CIVR,
pages 401–408. Association for Computing Machin-
ery.
Dalal, N. and Triggs, B. (2005). Histograms of oriented
gradients for human detection. In CVPR, pages 886–
893.
Felzenszwalb, P., Girshick, R., McAllester, D., and
Ramanan, D. (2009). Object detection with discriminatively
trained part based models. IEEE Transactions on
Pattern Analysis and Machine Intelligence.
Felzenszwalb, P. F., Girshick, R. B., McAllester, D., and
Ramanan, D. (2010). Object detection with discrim-
inatively trained part-based models. IEEE Transac-
tions on Pattern Analysis and Machine Intelligence,
32:1627–1645.
Felzenszwalb, P. F. and Huttenlocher, D. P. (2005). Pictorial
structures for object recognition. IJCV, 61:55–79.
Kumar, S. and Hebert, M. (2006). Discriminative random
fields. IJCV, 68:179–201.
Laptev, I. (2006). Improvements of object detection using
boosted histograms. In BMVC, pages 949–958.
Ling, H. and Soatto, S. (2007). Proximity distribution ker-
nels for geometric context in category recognition. In
ICCV.
Liu, D., Hua, G., Viola, P., and Chen, T. (2008). Integrated
feature selection and higher-order spatial feature
extraction for object categorization. In CVPR, pages 1–8.
Lowe, D. G. (2003). Distinctive image features from scale-
invariant keypoints.
Morioka, N. and Satoh, S. (2010). Building compact local
pairwise codebook with joint feature space clustering.
In ECCV, page 14, Crete, Greece.
Savarese, S., Winn, J., and Criminisi, A. (2006). Discrimi-
native object class models of appearance and shape by
correlatons. CVPR, 2:2033–2040.
Varma, M. and Zisserman, A. (2002). Classifying images of
materials: Achieving viewpoint and illumination in-
dependence. In ECCV, pages 255–271, London, UK.
Springer-Verlag.
Varma, M. and Zisserman, A. (2003). Texture classification:
Are filter banks necessary? In CVPR.
Viola, P. and Jones, M. (2001). Rapid object detection using
a boosted cascade of simple features. CVPR, 1:511.
Winn, J., Criminisi, A., and Minka, T. (2005). Object
categorization by learned universal visual dictionary. In
ICCV, pages 1800–1807, Washington, DC,
USA. IEEE Computer Society.