He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep
residual learning for image recognition. In Proceedings of
the IEEE conference on computer vision and pattern
recognition, pages 770–778.
Ioffe, S. and Szegedy, C. (2015). Batch normalization:
Accelerating deep network training by reducing
internal covariate shift. arXiv preprint arXiv:1502.03167.
Kingma, D. P. and Ba, J. (2014). Adam: A method for
stochastic optimization. arXiv preprint arXiv:1412.6980.
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012).
ImageNet classification with deep convolutional neural
networks. In Advances in neural information processing
systems, pages 1097–1105.
Lin, G., Milan, A., Shen, C., and Reid, I. D. (2017).
RefineNet: Multi-path refinement networks for
high-resolution semantic segmentation. In CVPR, volume 1,
page 5.
Long, J., Shelhamer, E., and Darrell, T. (2015). Fully
convolutional networks for semantic segmentation. In
Proceedings of the IEEE conference on computer
vision and pattern recognition, pages 3431–3440.
Nair, V. and Hinton, G. E. (2010). Rectified linear units
improve restricted Boltzmann machines. In Proceedings
of the 27th international conference on machine
learning (ICML-10), pages 807–814.
Paszke, A., Chaurasia, A., Kim, S., and Culurciello, E.
(2016). ENet: A deep neural network architecture
for real-time semantic segmentation. arXiv preprint
arXiv:1606.02147.
Pohlen, T., Hermans, A., Mathias, M., and Leibe, B. (2017).
Full-resolution residual networks for semantic
segmentation in street scenes. arXiv preprint.
Poudel, R. P., Bonde, U., Liwicki, S., and Zach, C.
(2018). ContextNet: Exploring context and detail for
semantic segmentation in real-time. arXiv preprint
arXiv:1805.04554.
Romera, E., Alvarez, J. M., Bergasa, L. M., and Arroyo,
R. (2017). Efficient convnet for real-time semantic
segmentation. In IEEE Intelligent Vehicles Symp. (IV),
pages 1789–1794.
Ronneberger, O., Fischer, P., and Brox, T. (2015). U-Net:
Convolutional networks for biomedical image
segmentation. In International Conference on Medical
Image Computing and Computer-Assisted Intervention,
pages 234–241. Springer.
Roy, S., Das, A., and Bhattacharya, U. (2016).
Generalized stacking of layerwise-trained deep convolutional
neural networks for document image classification. In
Pattern Recognition (ICPR), 2016 23rd International
Conference on, pages 1273–1278. IEEE.
Simonyan, K. and Zisserman, A. (2014). Very deep
convolutional networks for large-scale image recognition.
arXiv preprint arXiv:1409.1556.
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I.,
and Salakhutdinov, R. (2014). Dropout: a simple way
to prevent neural networks from overfitting. The
Journal of Machine Learning Research, 15(1):1929–1958.
Treml, M., Arjona-Medina, J., Unterthiner, T., Durgesh, R.,
Friedmann, F., Schuberth, P., Mayr, A., Heusel, M.,
Hofmarcher, M., Widrich, M., et al. (2016). Speeding
up semantic segmentation for autonomous driving. In
MLITS, NIPS Workshop.
Wang, P., Chen, P., Yuan, Y., Liu, D., Huang, Z., Hou, X.,
and Cottrell, G. (2018). Understanding convolution
for semantic segmentation. In 2018 IEEE Winter
Conference on Applications of Computer Vision (WACV),
pages 1451–1460. IEEE.
Xie, S., Girshick, R., Dollár, P., Tu, Z., and He, K. (2017).
Aggregated residual transformations for deep neural
networks. In Computer Vision and Pattern
Recognition (CVPR), 2017 IEEE Conference on,
pages 5987–5995. IEEE.
Yang, M., Yu, K., Zhang, C., Li, Z., and Yang, K. (2018).
DenseASPP for semantic segmentation in street scenes.
In Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition, pages 3684–3692.
Yu, F. and Koltun, V. (2015). Multi-scale context
aggregation by dilated convolutions. arXiv preprint
arXiv:1511.07122.
Zhang, Z., Zhang, X., Peng, C., Cheng, D., and Sun, J.
(2018). ExFuse: Enhancing feature fusion for semantic
segmentation. arXiv preprint arXiv:1804.03821.
Zhao, H., Qi, X., Shen, X., Shi, J., and Jia, J. (2017).
ICNet for real-time semantic segmentation on
high-resolution images. arXiv preprint arXiv:1704.08545.
VISAPP 2019 - 14th International Conference on Computer Vision Theory and Applications