Authors:
Ahmed J. Afifi 1; Olaf Hellwich 1 and Toufique A. Soomro 2
Affiliations:
1 Technische Universität Berlin, Germany; 2 Charles Sturt University, Australia
Keyword(s):
Convolutional Neural Networks (CNNs), Multi-task, Object Classification, Viewpoint Estimation, Synthetic Images.
Abstract:
Convolutional Neural Networks (CNNs) have shown impressive performance in many computer vision
tasks. Most CNN architectures were proposed to solve a single task. This paper proposes a CNN
model that tackles object classification and viewpoint estimation simultaneously, two problems with
opposing requirements for feature representation. While object classification aims to learn
viewpoint-invariant features, viewpoint estimation requires features that capture the variations of the
viewpoint for the same object. This study addresses this problem by introducing a multi-task CNN
architecture that performs both tasks simultaneously. The first part of the CNN is shared between the
two tasks, and the second part consists of two subnetworks, each solving one task separately.
Synthetic images are used to enlarge the training dataset for the proposed model. The model is
evaluated on the PASCAL3D+ dataset, a challenging dataset for object detection and viewpoint
estimation. According to the results, the proposed model performs as a multi-task model, in which the
features of the shared layers can be fed to different tasks. Moreover, 3D models can be used to render
images under different conditions to address the lack of training data and to enhance the training of
CNNs.
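
The shared-trunk, two-head layout described in the abstract can be illustrated with a minimal sketch. This is not the authors' network: small dense layers stand in for the convolutional layers, viewpoint estimation is assumed to be posed as classification over discretized angle bins, and every name and size here (MultiTaskNet, joint_loss, view_weight, the layer widths) is hypothetical.

```python
import math
import random


def dense(x, weights, biases):
    """Fully connected layer: one output value per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(x):
    m = max(x)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (illustrative initialization)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiTaskNet:
    """Shared trunk followed by two task-specific heads (hypothetical sizes)."""

    def __init__(self, n_features, n_hidden, n_classes, n_view_bins, seed=0):
        rng = random.Random(seed)
        # Shared part: stands in for the convolutional layers used by both tasks.
        self.shared = init_layer(n_hidden, n_features, rng)
        # Task-specific parts: one subnetwork per task.
        self.class_head = init_layer(n_classes, n_hidden, rng)
        self.view_head = init_layer(n_view_bins, n_hidden, rng)

    def forward(self, x):
        h = relu(dense(x, *self.shared))  # features fed to both heads
        class_probs = softmax(dense(h, *self.class_head))
        view_probs = softmax(dense(h, *self.view_head))
        return class_probs, view_probs


def joint_loss(class_probs, view_probs, class_label, view_label, view_weight=1.0):
    """Sum of two cross-entropy terms; `view_weight` is an assumed balancing factor."""
    class_term = -math.log(class_probs[class_label])
    view_term = -math.log(view_probs[view_label])
    return class_term + view_weight * view_term


net = MultiTaskNet(n_features=8, n_hidden=16, n_classes=12, n_view_bins=24)
class_probs, view_probs = net.forward([0.5] * 8)
loss = joint_loss(class_probs, view_probs, class_label=3, view_label=10)
```

Because both heads read the same hidden features, a gradient step on this joint loss would update the shared layers from both objectives at once, which is where the tension between viewpoint-invariant and viewpoint-sensitive features arises.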