Improving Quality of Training Samples Through Exhaustless Generation and Effective Selection for Deep Convolutional Neural Networks
Takayoshi Yamashita, Taro Watasue, Yuji Yamauchi, Hironobu Fujiyoshi
2015
Abstract
Deep convolutional neural networks require a huge number of training samples to perform well. Although many benchmarks provide abundant samples for training, they are not efficient enough to train convolutional neural networks to their full potential. Data augmentation is one solution to this problem, but it does not consider the quality of samples, i.e., whether the augmented samples are actually suitable for training. In this paper, we propose a method that selects effective samples from an augmented sample set. The method 1) generates a large number of augmented samples from images with labeled data and multiple background images, and 2) selects effective samples from the additionally augmented ones through iterations of parameter updating during the training process. We applied exhaustless sample generation and effective sample selection to recognition and segmentation tasks. The method obtained the best performance in both tasks compared with other methods, whether or not they used sample generation and/or selection.
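The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how samples are composited or which selection criterion is used. The sketch assumes that a labeled image comes with a binary foreground mask, that augmentation is a random background swap plus a random horizontal flip, and that "effective" means highest loss under the current parameters. The names `composite`, `generate_candidates`, `select_effective`, and `per_sample_loss` are all hypothetical.

```python
import numpy as np

def composite(foreground, mask, background):
    """Paste a labeled foreground onto a background using its binary mask."""
    m = mask[..., None].astype(foreground.dtype)  # (H, W) -> (H, W, 1)
    return m * foreground + (1.0 - m) * background

def generate_candidates(foreground, mask, backgrounds, rng, n):
    """Draw n augmented samples by pairing the foreground with random backgrounds."""
    samples = []
    for _ in range(n):
        bg = backgrounds[rng.integers(len(backgrounds))]
        fg, mk = foreground, mask
        if rng.random() < 0.5:  # assumed augmentation: random horizontal flip
            fg, mk = fg[:, ::-1], mk[:, ::-1]
        samples.append(composite(fg, mk, bg))
    return np.stack(samples)

def select_effective(samples, per_sample_loss, k):
    """Keep the k samples with the highest loss (assumed selection criterion)."""
    losses = per_sample_loss(samples)      # one loss value per sample
    idx = np.argsort(losses)[::-1][:k]     # indices of the k largest losses
    return samples[idx], idx
```

In a training loop, one would call `generate_candidates` before each parameter update, score the candidates with the current network through `per_sample_loss`, and train only on the subset returned by `select_effective`, so that the pool of samples is regenerated rather than fixed in advance.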
Paper Citation
in Harvard Style
Yamashita T., Watasue T., Yamauchi Y. and Fujiyoshi H. (2015). Improving Quality of Training Samples Through Exhaustless Generation and Effective Selection for Deep Convolutional Neural Networks. In Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2015) ISBN 978-989-758-090-1, pages 228-235. DOI: 10.5220/0005263802280235
in Bibtex Style
@conference{visapp15,
author={Takayoshi Yamashita and Taro Watasue and Yuji Yamauchi and Hironobu Fujiyoshi},
title={Improving Quality of Training Samples Through Exhaustless Generation and Effective Selection for Deep Convolutional Neural Networks},
booktitle={Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2015)},
year={2015},
pages={228-235},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005263802280235},
isbn={978-989-758-090-1},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 10th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2015)
TI - Improving Quality of Training Samples Through Exhaustless Generation and Effective Selection for Deep Convolutional Neural Networks
SN - 978-989-758-090-1
AU - Yamashita T.
AU - Watasue T.
AU - Yamauchi Y.
AU - Fujiyoshi H.
PY - 2015
SP - 228
EP - 235
DO - 10.5220/0005263802280235