cution. We report the mean classification rates of ten independent runs, where each run uses the same total number of images considered in Table 2 but randomly shuffles the order in which the images are presented during codebook construction. In the Aeroplane vs Horse example, the average codebook size was 338±17 with a classification rate of 0.88±0.01, whereas for the Diningtable vs Pottedplant example the codebook size and classification rate were 259±25 and 0.61±0.02, respectively. We include the standard deviations for completeness, noting that they are uncertainty estimates obtained from only a few trials. Constructing a codebook with the K-means algorithm took 16536 seconds on average, whereas the proposed method required only 42 seconds on average, on a desktop computer with an Intel Core i5 running at 3.2 GHz and 8 GB of RAM.
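The repeated-runs protocol can be summarised by the following sketch; run_experiment, build_codebook, encode, and train_and_test are hypothetical placeholders illustrating the procedure, not our released code.

```python
import random
import statistics
import time

def run_experiment(images, labels, build_codebook, encode, train_and_test,
                   n_runs=10, seed=0):
    """Repeat codebook construction with a shuffled image order and
    aggregate classification rate, codebook size, and build time."""
    rng = random.Random(seed)
    accuracies, sizes, times = [], [], []
    for _ in range(n_runs):
        order = list(range(len(images)))
        rng.shuffle(order)                   # shuffle presentation order only
        shuffled = [images[i] for i in order]
        t0 = time.time()
        codebook = build_codebook(shuffled)  # one-pass, order-dependent
        times.append(time.time() - t0)
        sizes.append(len(codebook))
        histograms = [encode(img, codebook) for img in images]
        accuracies.append(train_and_test(histograms, labels))
    return {
        "rate_mean": statistics.mean(accuracies),
        "rate_std": statistics.stdev(accuracies),
        "size_mean": statistics.mean(sizes),
        "size_std": statistics.stdev(sizes),
        "build_time_mean": statistics.mean(times),
    }
```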
7 DISCUSSION AND
CONCLUSION
This paper addresses the problem of object classification in images using a sequential learning technique. Our system first extracts features from the training images using the SIFT algorithm (Lowe, 2004). These features are converted into a codebook using an extended RAC method (Ramanan and Niranjan, 2010). The codewords then serve to construct a histogram representation of each image, and these histograms are fed into a binary SVM classifier (Cortes and Vapnik, 1995) to classify the objects. The codebook is constructed by processing images sequentially and retaining only the discriminative or rare features, allocating new codewords with the extended RAC technique, as sketched below. Our results show that selecting discriminative features, rather than increasing the number of training images, yields a better classification rate with a compact codebook.
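To make the pipeline concrete, the following is a minimal sketch assuming OpenCV for SIFT and scikit-learn for the linear SVM. The codebook builder implements only the basic one-pass RAC rule (allocate a new codeword when the nearest existing codeword lies beyond a radius r); the discriminative extension described in this paper is not shown, and the radius value and helper names are illustrative.

```python
# Sketch of a BoF pipeline: SIFT descriptors -> one-pass RAC-style codebook
# -> per-image histograms -> binary SVM. OpenCV and scikit-learn are assumed;
# r=0.8 and all helper names are illustrative, not the authors' code.
import cv2
import numpy as np
from sklearn.svm import LinearSVC

sift = cv2.SIFT_create()

def sift_descriptors(path):
    """Extract SIFT descriptors (one 128-D row per keypoint)."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, desc = sift.detectAndCompute(gray, None)
    return np.empty((0, 128)) if desc is None else desc

def build_rac_codebook(image_paths, r=0.8):
    """One-pass RAC rule: allocate a new codeword whenever the nearest
    existing codeword is farther than r (descriptors L2-normalised)."""
    codebook = []
    for path in image_paths:
        for d in sift_descriptors(path):
            d = d / (np.linalg.norm(d) + 1e-12)
            if not codebook:
                codebook.append(d)
                continue
            dists = np.linalg.norm(np.asarray(codebook) - d, axis=1)
            if dists.min() > r:          # rare/novel descriptor -> new codeword
                codebook.append(d)
    return np.asarray(codebook)

def encode(path, codebook):
    """Hard-assign each descriptor to its nearest codeword and return an
    L1-normalised histogram representing the image."""
    hist = np.zeros(len(codebook))
    for d in sift_descriptors(path):
        d = d / (np.linalg.norm(d) + 1e-12)
        hist[np.argmin(np.linalg.norm(codebook - d, axis=1))] += 1
    return hist / max(hist.sum(), 1)

# Usage: build the codebook on the training images, encode both splits,
# then train the binary SVM on the histograms.
# codebook = build_rac_codebook(train_paths)
# X_train = np.stack([encode(p, codebook) for p in train_paths])
# clf = LinearSVC().fit(X_train, y_train)
```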
In the BoF literature, the codebook size is usually selected manually and is commonly set to tens of thousands of codewords to ensure that enough information is encoded. However, such large codebooks incur an enormous computational cost. To create a discriminative BoF representation, we present a technique that closely approximates the distribution of visual words in an image, together with an output classifier that accounts for class-specific discriminant features. This paper therefore offers the patch-based object recognition community an alternative view: emphasise retaining the more discriminative descriptors rather than relying on ever more training data, as suggested by the big data hypothesis (Zhu et al., 2012).
REFERENCES
Cortes, C. and Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3):273–297.
Csurka, G., Dance, C., Fan, L., Willamowski, J., and Bray, C. (2004). Visual categorization with bags of keypoints. In Workshop on Statistical Learning in Computer Vision, volume 1, pages 1–2.
Everingham, M., Eslami, S. M. A., Gool, L. V., Williams,
C. K. I., Winn, J., and Zisserman, A. (2010). The PAS-
CAL Visual Object Classes VOC Challenge. Interna-
tional Journal of Computer Vision (IJCV), 88(2):303–
338.
Karmakar, P., Teng, S. W., Lu, G., and Zhang, D. (2015).
Rotation invariant spatial pyramid matching for im-
age classification. In Proceedings of the International
Conference on Digital Image Computing: Techniques
and Applications (DICTA), pages 653–660.
Kim, S. (2011). Robust object categorization and segmenta-
tion motivated by visual contexts in the human visual
system. EURASIP Journal on Advances in Signal Pro-
cessing.
Kirishanthy, T. and Ramanan, A. (2015). Creating compact
and discriminative visual vocabularies using visual
bits. In International Conference on Digital Image
Computing: Techniques and Applications (DICTA),
pages 258–263.
Li, T., Mei, T., and Kweon, I. S. (2008). Learning optimal
compact codebook for efficient object categorization.
In IEEE Workshop on Applications of Computer Vi-
sion, pages 1–6.
Lowe, D. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91–110.
Ramanan, A. and Niranjan, M. (2010). A one-pass
resource-allocating codebook for patch-based visual
object recognition. In IEEE International Workshop
on Machine Learning for Signal Processing, pages
35–40.
Ramanan, A. and Niranjan, M. (2011). A review of code-
book models in patch-based visual object recogni-
tion. Journal of Signal Processing Systems, Springer,
68(3):333–352.
Ullman, S., Vidal-Naquet, M., and Sali, E. (2002). Visual features of intermediate complexity and their use in classification. Nature Neuroscience, 5(7):682–687.
Winn, J., Criminisi, A., and Minka, T. (2005). Object cat-
egorization by learned universal visual dictionary. In
IEEE International Conference on Computer Vision,
volume 2, pages 1800–1807.
Yang, L., Jin, R., Sukthankar, R., and Jurie, F. (2008). Unifying discriminative visual codebook generation with classifier training for object category recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2008), pages 1–8.
Zhu, X., Vondrick, C., Ramanan, D., and Fowlkes, C.
(2012). Do we need more training data or better mod-
els for object detection? In British Machine Vision
Conference (BMVC).