Table 4: Accuracy (%) of the four models trained on the large chairlift dataset.

Chairlift   Our Siam. net.   Simp. Class.   ResNet50   VGG16
S1          99.67            98.78          100.0      100.0
S2          99.12            95.77          99.71      100.0
S3          99.11            98.60          100.0      99.60
S4          99.49            96.32          100.0      100.0
S5          99.39            98.18          100.0      100.0
S6          99.66            99.32          100.0      100.0
S7          99.26            95.31          100.0      99.88
S8          99.46            97.82          99.98      99.67
S9          100.0            98.19          100.0      100.0
S10         99.75            98.59          100.0      100.0
S11         98.84            97.30          100.0      100.0
S12         100.0            100.0          100.0      100.0
S13         99.48            98.18          99.22      100.0
S14         100.0            99.88          99.89      100.0
S15         99.89            98.56          99.78      99.86
S16         98.98            94.77          99.27      99.13
S17         98.33            97.80          99.26      98.51
S18         99.81            98.27          100.0      100.0
S19         99.59            99.80          100.0      100.0
S20         98.83            97.35          99.58      99.94
Av.         99.44            97.71          99.98      99.76

from each chairlift. Indeed, a single siamese network
trained on 20 different chairlifts provides very good
results on each of these chairlifts. Furthermore, when
the training set is large enough, our small siamese
network achieves accuracy comparable to that of much
deeper networks such as VGG16 or ResNet50 (99.44%
on average, against 99.76% and 99.98%, respectively).
Future work will assess the generalization ability of
our approach by testing our siamese network on new,
unseen chairlifts with different 3D geometries.
Mask-guided Image Classification with Siamese Networks