original model and the model that was slightly retrained
to increase the confidence level of the output class.
As a result, we achieved improved accuracy on both
evaluation measures. The visualization results also
confirmed that the visualized basis for the model's
decisions was of higher quality than that produced by
existing methods.
ACKNOWLEDGEMENTS
This research was partially supported by JSPS
KAKENHI Grant Number 22H04735.