vide higher accuracy than the later convolutional lay-
ers deep in the network. Shallow models like AlexNet
can achieve high accuracy when the input image is
upsampled. In addition, we have used an anchor box
selection method and a context window to further en-
hance car detection accuracy. We believe that our
findings will inspire the research community to eval-
uate shallow models for achieving high accuracy on
object detection tasks.
ACKNOWLEDGMENTS
Khalid Ashraf was supported by the National Sci-
ence Foundation under Award number 125127. We
thank Kostadin Ilov for much help with computational
hardware. Thanks to Ross Girshick for comments on
some initial results. Thanks to Fan Yang and Kaustav
Kundu for clarification on their results.
VEHITS 2017 - 3rd International Conference on Vehicle Technology and Intelligent Transport Systems