the IEEE International Conference on Computer Vi-
sion, pages 1026–1034.
Huang, Y., Huang, K., Yu, Y., and Tan, T. (2011). Salient
coding for image classification. In Computer Vision
and Pattern Recognition (CVPR), 2011 IEEE Confer-
ence on, pages 1753–1760. IEEE.
Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J.,
Girshick, R., Guadarrama, S., and Darrell, T. (2014).
Caffe: Convolutional architecture for fast feature em-
bedding. In Proceedings of the 22nd ACM inter-
national conference on Multimedia, pages 675–678.
ACM.
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). Im-
agenet classification with deep convolutional neural
networks. In Advances in neural information process-
ing systems, pages 1097–1105.
LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learn-
ing. Nature, 521(7553):436–444.
LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard,
R. E., Hubbard, W., and Jackel, L. D. (1989). Back-
propagation applied to handwritten zip code recogni-
tion. Neural computation, 1(4):541–551.
Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P.,
Ramanan, D., Doll
´
ar, P., and Zitnick, C. L. (2014).
Microsoft coco: Common objects in context. In Euro-
pean Conference on Computer Vision, pages 740–755.
Springer.
Liu, C., Sharan, L., Adelson, E. H., and Rosenholtz, R.
(2010). Exploring features in a bayesian framework
for material recognition. In Computer Vision and Pat-
tern Recognition (CVPR), 2010 IEEE Conference on,
pages 239–246. IEEE.
Nair, V. and Hinton, G. E. (2010). Rectified linear units
improve restricted boltzmann machines. In Proceed-
ings of the 27th International Conference on Machine
Learning (ICML-10), pages 807–814.
Oquab, M., Bottou, L., Laptev, I., and Sivic, J. (2014).
Learning and transferring mid-level image represen-
tations using convolutional neural networks. In Pro-
ceedings of the IEEE conference on computer vision
and pattern recognition, pages 1717–1724.
Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus,
R., and LeCun, Y. (2013). Overfeat: Integrated recog-
nition, localization and detection using convolutional
networks. CoRR, abs/1312.6229.
Sharan, L., Rosenholtz, R., and Adelson, E. (2009). Mate-
rial perception: What can you see in a brief glance?
Journal of Vision, 9(8):784–784.
Sharif Razavian, A., Azizpour, H., Sullivan, J., and Carls-
son, S. (2014). Cnn features off-the-shelf: an as-
tounding baseline for recognition. In Proceedings of
the IEEE Conference on Computer Vision and Pattern
Recognition Workshops, pages 806–813.
Simonyan, K. and Zisserman, A. (2014a). Two-stream con-
volutional networks for action recognition in videos.
In Advances in Neural Information Processing Sys-
tems, pages 568–576.
Simonyan, K. and Zisserman, A. (2014b). Very deep con-
volutional networks for large-scale image recognition.
CoRR, abs/1409.1556.
Srivastava, N., Hinton, G. E., Krizhevsky, A., Sutskever, I.,
and Salakhutdinov, R. (2014). Dropout: a simple way
to prevent neural networks from overfitting. Journal
of Machine Learning Research, 15(1):1929–1958.
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S.,
Anguelov, D., Erhan, D., Vanhoucke, V., and Rabi-
novich, A. (2015). Going deeper with convolutions.
In Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition, pages 1–9.
Taigman, Y., Yang, M., Ranzato, M., and Wolf, L. (2014).
Deepface: Closing the gap to human-level perfor-
mance in face verification. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recogni-
tion, pages 1701–1708.
Tompson, J., Goroshin, R., Jain, A., LeCun, Y., and Bregler,
C. (2015). Efficient object localization using convo-
lutional networks. In Proceedings of the IEEE Con-
ference on Computer Vision and Pattern Recognition,
pages 648–656.
Torralba, A., Fergus, R., and Freeman, W. T. (2008). 80
million tiny images: A large data set for nonpara-
metric object and scene recognition. IEEE transac-
tions on pattern analysis and machine intelligence,
30(11):1958–1970.
Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., and Gong,
Y. (2010). Locality-constrained linear coding for
image classification. In Computer Vision and Pat-
tern Recognition (CVPR), 2010 IEEE Conference on,
pages 3360–3367. IEEE.
Xiao, J., Hays, J., Ehinger, K. A., Oliva, A., and Torralba,
A. (2010). Sun database: Large-scale scene recogni-
tion from abbey to zoo. In Computer vision and pat-
tern recognition (CVPR), 2010 IEEE conference on,
pages 3485–3492. IEEE.
Yu, F., Zhang, Y., Song, S., Seff, A., and Xiao, J. (2015).
LSUN: construction of a large-scale image dataset us-
ing deep learning with humans in the loop. CoRR,
abs/1506.03365.
Zeiler, M. D. and Fergus, R. (2014). Visualizing and under-
standing convolutional networks. In European Con-
ference on Computer Vision, pages 818–833. Springer.
Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., and Oliva,
A. (2014). Learning deep features for scene recog-
nition using places database. In Advances in neural
information processing systems, pages 487–495.
VISAPP 2017 - International Conference on Computer Vision Theory and Applications
296