5 CONCLUSIONS
In this paper, we proposed a cooperative learning method for
semantic segmentation in which the feature maps of the top
network are sent to the other network. Specifically, we
evaluated the method with two kinds of CNNs and
two connection methods, and experiments on two datasets
demonstrated its effectiveness. Cooperative learning with
the same layer connection performed well for both
networks. However, the improvement from the multiple
layer connection was small for DANet, which has an attention
mechanism. This suggests that the suitable connection method
depends on the baseline network structure. We used only two
kinds of connection in this paper, but many other connection
methods can be considered. Exploring them is a subject for
future work.
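The difference between the two connection methods summarized above can be illustrated with a toy sketch. It is not the implementation evaluated in this paper: feature maps are reduced to flat lists, each layer is a simple scaling, all maps are assumed to share one shape, and the element-wise additive fusion as well as the names `layer`, `fuse`, `run_top` and `run_bottom` are hypothetical choices made only for illustration.

```python
# Toy sketch of sending top-network feature maps to the other network.
# Assumptions (not from the paper): additive fusion, equal-shape maps,
# and a scalar scaling as a stand-in for each convolutional block.

def layer(features, weight):
    """Stand-in for a convolutional block: scale every activation."""
    return [weight * x for x in features]


def fuse(own, received):
    """Assumed fusion rule: element-wise addition of same-shape maps."""
    return [a + b for a, b in zip(own, received)]


def run_top(x, weights):
    """Run the top network and keep the feature map after every layer."""
    maps = []
    for w in weights:
        x = layer(x, w)
        maps.append(x)
    return maps


def run_bottom(x, weights, top_maps, connection):
    """Run the other network while it receives top-network feature maps.

    connection == "same":     layer i receives only top layer i.
    connection == "multiple": layer i receives top layers 0..i.
    """
    for i, w in enumerate(weights):
        x = layer(x, w)
        if connection == "same":
            sources = [top_maps[i]]
        elif connection == "multiple":
            sources = top_maps[: i + 1]
        else:
            raise ValueError("unknown connection: " + connection)
        for received in sources:
            x = fuse(x, received)
    return x


top_maps = run_top([1.0, 2.0], weights=[2.0, 3.0])
same_out = run_bottom([1.0, 2.0], [1.0, 1.0], top_maps, "same")
multi_out = run_bottom([1.0, 2.0], [1.0, 1.0], top_maps, "multiple")
```

In this sketch the only difference between the two settings is which top-network maps reach a given layer of the receiving network, so other connection patterns could be tried by changing how `sources` is selected.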
ACKNOWLEDGEMENTS
This work is partially supported by MEXT/JSPS
KAKENHI Grant Number 18K111382.