Authors: Sébastien Paris (1), Xanadu Halkias (2) and Hervé Glotin (3)
Affiliations: (1) Aix-Marseille University, France; (2) Université Sud Toulon-Var, France; (3) Université Sud Toulon-Var and Institut Universitaire de France, France
Keyword(s): Image Categorization, Scenes Categorization, Fine-grained Visual Categorization, Non-parametric Local Patterns, Multi-scale LBP/LTP, Dictionary Learning, Sparse Coding, LASSO, Max-pooling, SPM, Linear SVM.
Related Ontology Subjects/Areas/Topics: Applications; Classification; Computer Vision, Visualization and Computer Graphics; Image Understanding; Information Retrieval and Learning; Object Recognition; Pattern Recognition; Software Engineering; Theory and Methods
Abstract:
In this paper, we address the general problem of image/object categorization with a novel approach referred to
as Bag-of-Scenes (BoS). Our approach is efficient for low-level semantic applications such as texture classification
as well as for higher-level semantic tasks such as natural scene recognition or fine-grained visual categorization
(FGVC). It is based on the widely used combination of i) sparse coding (Sc), ii) max-pooling and iii) Spatial
Pyramid Matching (SPM) techniques applied to histograms of multi-scale Local Binary/Ternary Patterns
(LBP/LTP) and their improved variants. This approach can be considered as a two-layer hierarchical architecture:
the first layer encodes the local spatial patch structure via histograms of LBP/LTP while the second encodes
the relationships between pre-analyzed LBP/LTP-scenes/objects. Our method outperforms SIFT-based
approaches using Sc techniques and can be trained efficiently with a simple linear SVM.
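
As a rough illustration of the two building blocks named above, the following Python sketch computes a basic single-scale 8-neighbour LBP histogram for a patch and max-pools a set of code vectors. It is only a minimal sketch under stated assumptions, not the authors' implementation: the function names (`lbp_code`, `lbp_histogram`, `max_pool`), the neighbour ordering, the 256-bin histogram and the use of absolute values in the pooling step are all illustrative choices, and the multi-scale LBP/LTP variants, dictionary learning, LASSO sparse coding, SPM and SVM stages are left out.

```python
# Illustrative sketch only: names and conventions here are assumptions,
# not taken from the paper.

# Offsets of the 8 neighbours, clockwise from the top-left pixel
# (an arbitrary but fixed ordering).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """Basic LBP code at (r, c): bit k is set when neighbour k >= centre."""
    centre = img[r][c]
    code = 0
    for k, (dr, dc) in enumerate(NEIGHBOURS):
        if img[r + dr][c + dc] >= centre:
            code |= 1 << k
    return code


def lbp_histogram(img):
    """Normalised 256-bin histogram of LBP codes over interior pixels."""
    hist = [0.0] * 256
    count = 0
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1.0
            count += 1
    # Avoid dividing by zero for images with no interior pixels.
    return [h / count for h in hist] if count else hist


def max_pool(codes):
    """Element-wise maximum of absolute values over a list of code vectors."""
    return [max(abs(v[j]) for v in codes) for j in range(len(codes[0]))]


if __name__ == "__main__":
    patch = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]  # flat grayscale patch
    hist = lbp_histogram(patch)
    print(hist[255])
    print(max_pool([[0.1, -0.5], [0.3, 0.2]]))
```

In a full pipeline, one would typically compute such histograms per patch, encode them against a learned dictionary, and apply the pooling step within each cell of the spatial pyramid before feeding the concatenated vector to a linear classifier.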