detector (using the same set of images for training and testing, the features detected by the grid detector and described by the SIFT descriptor achieved 10% higher accuracy), we argue that not only the distribution of distinctive features is important for scene recognition, but also the information the detectors provide about "intrusive" features.
The SURF descriptor without orientation information (U-SURF) worked better in the BoW model than the classic SURF version. Using the SURF detector with the U-SURF descriptor, an average accuracy improvement of 8.43% over the classic SURF descriptor was obtained. This confirms that the orientation information of the distinctive features is not required for environment recognition with this model and only complicates the recognition process.
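The difference between the two variants reduces to the upright flag of the SURF extractor. The following sketch shows how both descriptor types can be obtained, assuming an OpenCV build with the non-free xfeatures2d contrib module; the file name and Hessian threshold are illustrative.

# Sketch: classic SURF vs. upright U-SURF descriptors (opencv-contrib required).
import cv2

img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder image

# Classic SURF: each keypoint is assigned a dominant orientation.
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400, upright=False)
kp, des_surf = surf.detectAndCompute(img, None)

# U-SURF: orientation estimation is skipped and descriptors are computed
# in the upright position, which is faster and, as argued above, sufficient
# for scene recognition.
usurf = cv2.xfeatures2d.SURF_create(hessianThreshold=400, upright=True)
kp_u, des_usurf = usurf.detectAndCompute(img, None)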
The SURF detector with the U-SURF descriptor also operates faster (an image is encoded about 33% faster than with the grid detector and the SIFT descriptor, with an average encoding time of 0.4 s per image), but a slightly lower accuracy (83.51 ± 1.67%) was obtained.
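The encoding time reported above covers assigning the local descriptors of an image to the nearest visual words and building a normalised histogram. A minimal sketch of this step is given below; the vocabulary size, descriptor dimensionality and placeholder data are assumptions, and the visual vocabulary is assumed to have been built beforehand (e.g. with k-means).

# Sketch: encoding one image as a BoW histogram over a visual vocabulary.
import time
import numpy as np

def bow_encode(descriptors, vocabulary):
    # descriptors: (n, d) local descriptors of one image
    # vocabulary:  (k, d) cluster centres (visual words)
    d2 = ((descriptors ** 2).sum(axis=1)[:, None]
          - 2.0 * descriptors @ vocabulary.T
          + (vocabulary ** 2).sum(axis=1)[None, :])
    words = d2.argmin(axis=1)                        # nearest visual word per descriptor
    hist = np.bincount(words, minlength=len(vocabulary)).astype(np.float32)
    return hist / max(hist.sum(), 1.0)               # L1-normalised histogram

vocabulary = np.random.rand(1000, 64).astype(np.float32)   # placeholder vocabulary
descriptors = np.random.rand(500, 64).astype(np.float32)   # placeholder 64-D U-SURF descriptors
start = time.perf_counter()
hist = bow_encode(descriptors, vocabulary)
print(f"encoding time: {time.perf_counter() - start:.3f} s")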
It has been noticed that the SURF descriptor produces good results only when describing the features detected by the SURF detector, while the SIFT descriptor works well with various detectors.
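Such cross-combinations can be obtained by running detection and description as separate steps; the sketch below pairs the SURF detector with the SIFT descriptor (an OpenCV build with the contrib xfeatures2d module is assumed, and the file name and threshold are illustrative).

# Sketch: SURF detector combined with SIFT descriptor.
import cv2

img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder image

surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
sift = cv2.SIFT_create()

keypoints = surf.detect(img, None)                     # where: SURF interest points
keypoints, descriptors = sift.compute(img, keypoints)  # what: 128-D SIFT descriptors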
Other combinations of detectors and descriptors were not as effective as those mentioned above; their accuracy varied from 65% to 79.75% when performing classification with 200 training images per category. The algorithm was also tested with the two most effective detector and descriptor combinations on indoor images and reached an accuracy of 55.85%-58.16% when classifying images into five categories of indoor environments. The shop environment was distinguished most precisely: on average, 39 out of 50 images were recognized correctly, whereas the bedroom, kitchen, living room and office scenes were often confused with each other. Having tested the algorithm's performance with a data set containing 15 outdoor and indoor categories, an overall accuracy of 67.49 ± 1.50% was obtained.
Again, the indoor images were often confused with each other, but they were rarely confused with images of the outdoor environment categories.
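The per-category figures above were derived from a confusion matrix; a minimal sketch of such an analysis is shown below (scikit-learn is assumed, and the label arrays are placeholders rather than experimental data).

# Sketch: per-category analysis of classification results with a confusion matrix.
import numpy as np
from sklearn.metrics import confusion_matrix

categories = ["bedroom", "kitchen", "living room", "office", "shop"]
y_true = np.random.randint(0, 5, size=250)   # placeholder ground-truth labels
y_pred = np.random.randint(0, 5, size=250)   # placeholder predicted labels

cm = confusion_matrix(y_true, y_pred)
per_class_recall = cm.diagonal() / cm.sum(axis=1)   # e.g. 39/50 for the shop category
for name, r in zip(categories, per_class_recall):
    print(f"{name}: {r:.2%}")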
We have noticed that the recognition and separation of indoor scenes is more complicated because they are artificially created environments with many inter-category similarities, uniform shapes and repetitive objects. This results in similar distinctive features across different categories of images and leads to classification inaccuracies.
The type of a room could be determined more precisely by finding specific objects in that room; however, this is difficult for a system based solely on the distribution of distinctive features.
The results of the research presented in this paper could be useful to researchers as well as practitioners developing environment scene recognition systems for blind and partially sighted people.