Authors:
Taurius Petraitis (1); Rytis Maskeliūnas (1); Robertas Damaševičius (1); Dawid Połap (2); Marcin Woźniak (2) and Marcin Gabryel (3)
Affiliations:
(1) Kaunas University of Technology, Lithuania; (2) Silesian University of Technology, Poland; (3) Institute of Computational Intelligence and Czestochowa University of Technology, Poland
Keyword(s):
Object Recognition, Scene Recognition, Image Processing, Bag-of-Words, SIFT, SURF.
Abstract:
Object and scene recognition solutions have a wide field of application, from entertainment apps and medical tools to security systems. In this paper, scene recognition methods and applications are analysed, and the Bag of Words (BoW) model, a scene classification model based on local image features, is implemented. In the BoW model every picture is encoded by a bag of visual features, which records how often each visual feature occurs in the image but disregards any spatial information. Five different feature detectors and two feature descriptors were analysed, and the two approaches found experimentally to be most effective at classifying images into eight outdoor categories were: forced feature detection on a grid with description by the SIFT descriptor, and feature detection with SURF with description by U-SURF. Support vector machines were used for classification. We also found that, for the task of scene recognition, not only the distinct features found by common feature detectors are important, but also the features that those detectors find uninteresting. Indoor scenes were experimentally classified into five categories, with worse results. This shows that indoor scene classification is a much harder task, and that a model which does not take into account any mid-level scene information, such as the objects in the scene, is not sufficient for it. A computer application was written to demonstrate the algorithm; it allows training new classifiers with different parameters and using the trained classifiers to predict the classes of new images.
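The BoW encoding described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy two-dimensional descriptors and the three-word vocabulary are all assumptions made for illustration. In practice the descriptors would be SIFT or U-SURF vectors, and the vocabulary is commonly obtained by clustering training descriptors, a step the abstract does not detail. The sketch assigns each descriptor to its nearest visual word and counts occurrences, so the order and position of the features play no role.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bow_histogram(descriptors: Sequence[Sequence[float]],
                  vocabulary: Sequence[Sequence[float]]) -> List[float]:
    """Encode an image as a normalised histogram of visual-word counts.

    Each descriptor votes for its nearest visual word; where the
    descriptor was found in the image is never used.
    """
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        nearest = min(range(len(vocabulary)),
                      key=lambda i: squared_distance(descriptor, vocabulary[i]))
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [count / total for count in counts]


# Hypothetical toy data: three visual words and four local descriptors.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]

histogram = bow_histogram(descriptors, vocabulary)
print(histogram)  # [0.25, 0.5, 0.25]
```

Reordering the descriptors leaves the histogram unchanged, which is the loss of spatial information the abstract mentions. A histogram of this kind is the fixed-length vector that would then be passed to a classifier such as a support vector machine.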