Authors:
Dinesh Kumar
and
Dharmendra Sharma
Affiliation:
Faculty of Science & Technology, University of Canberra, 11 Kirinari Street, Canberra, ACT 2617, Australia
Keyword(s):
Distributed Information Integration, Central Processor, Local Processor, Convolutional Neural Network, Filter Pyramid, Scale-invariance.
Abstract:
A large body of physiological findings suggests that the vision system understands a scene in terms of its local features, such as lines and curves. A notable computer algorithm that models such behaviour is the Convolutional Neural Network (CNN). Whilst recognising an object at various scales remains trivial for the human vision system, CNNs struggle to achieve the same behaviour. Recent physiological findings suggest two new paradigms. Firstly, the visual system uses both local and global features in its recognition function. Secondly, the brain uses a distributed processing architecture to learn information from multiple modalities. In this paper we combine these paradigms and propose a distributed information integration model called D-Net to improve scale-invariant classification of images. We use a CNN to extract local features and, inspired by Google's INCEPTION model, develop a trainable method that uses filter pyramids to extract global features, called Filter Pyramid Convolutions (FPC). D-Net processes CNN and FPC features locally, fuses the outcomes and obtains a global estimate via the central processor. We test D-Net on classification of scaled images from benchmark datasets. Our results show D-Net's potential effectiveness for classification of scaled images.
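The core intuition behind a filter pyramid, as described in the abstract, can be sketched as applying the same base filter at several spatial scales and pooling the responses across scales. The sketch below is illustrative only: the function names, the nearest-neighbour kernel upscaling, and the max-pooling across scales are assumptions for this example, not the paper's trained FPC method.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    # Naive valid-mode 2D cross-correlation (illustrative, not optimised).
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def upscale_kernel(kernel, factor):
    # Nearest-neighbour upscaling of the base filter; dividing by the
    # area factor keeps the kernel's total weight unchanged.
    return np.kron(kernel, np.ones((factor, factor))) / (factor * factor)

def filter_pyramid_response(image, base_kernel, factors=(1, 2, 3)):
    # Apply the same base filter at several scales and keep the peak
    # response per scale; pooling over these responses is one simple way
    # to approximate scale-invariant feature detection.
    return [convolve2d_valid(image, upscale_kernel(base_kernel, f)).max()
            for f in factors]
```

In the paper's D-Net, the pyramid filters are trainable and their outputs feed the local processors; here the pyramid is fixed purely to show the multi-scale filtering idea.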