Specialization of a Generic Pedestrian Detector to a Specific Traffic Scene by the Sequential Monte-Carlo Filter and the Faster R-CNN
Ala Mhalla, Thierry Chateau, Sami Gazzah, Najoua Essoukri Ben Amara
2017
Abstract
The performance of a generic pedestrian detector decreases significantly when it is applied to a specific scene, because of the large variation between the source dataset used to train the generic detector and the samples in the target scene. In this paper, we propose a new transfer learning framework that automatically specializes a generic pedestrian detector to a specific video surveillance scene without manually labeling any further samples. The main idea is to consider a deep detector as a function that generates realizations from the probability distribution of the pedestrians to be detected in the target scene. Our contribution is to approximate this target probability distribution with a set of samples and an associated specialized deep detector estimated in a sequential Monte Carlo filter framework. The effectiveness of the proposed framework is demonstrated through experiments on two public surveillance datasets. Compared with a generic pedestrian detector and state-of-the-art methods, the proposed framework gives encouraging results.
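The abstract does not spell out the steps of the filter. As a rough illustration of how a sequential Monte Carlo filter can approximate a target distribution with a set of samples, the sketch below runs a generic predict / weight / resample loop on a toy 1-D problem. Everything in it is an assumption made for illustration: the names `smc_step`, `propose`, `score` and `run_demo`, the Gaussian scoring function, and the target value of 60 are hypothetical. It is not the authors' method, in which the samples are produced by a deep detector (Faster R-CNN) rather than drawn as toy numbers.

```python
import math
import random


def smc_step(samples, propose, score, rng):
    """One generic sequential Monte Carlo iteration (illustrative only)."""
    # Prediction: perturb each current sample to propose a candidate.
    candidates = [propose(s, rng) for s in samples]
    # Update: weight each candidate by how well it fits the target scene.
    weights = [score(c) for c in candidates]
    if sum(weights) <= 0.0:
        # Degenerate case: no candidate has support, keep the previous set.
        return list(samples)
    # Sampling: resample with replacement, proportionally to the weights.
    return rng.choices(candidates, weights=weights, k=len(samples))


def run_demo(seed=0, n_samples=200, n_iterations=20, target=60.0):
    """Toy run: samples drift from a broad 'generic' spread toward `target`."""
    rng = random.Random(seed)

    def propose(x, rng):
        # Hypothetical proposal: small Gaussian jitter around the sample.
        return x + rng.gauss(0.0, 5.0)

    def score(x):
        # Hypothetical likelihood: Gaussian bump centred on the target value.
        return math.exp(-((x - target) ** 2) / (2.0 * 10.0 ** 2))

    # Broad initial set, standing in for a generic (non-specialized) source.
    samples = [rng.uniform(0.0, 200.0) for _ in range(n_samples)]
    initial_mean = sum(samples) / len(samples)
    for _ in range(n_iterations):
        samples = smc_step(samples, propose, score, rng)
    final_mean = sum(samples) / len(samples)
    return initial_mean, final_mean, len(samples)


if __name__ == "__main__":
    m0, m1, n = run_demo()
    print(f"{n} samples, mean before: {m0:.1f}, mean after: {m1:.1f}")
```

In this toy setting the sample set keeps a fixed size while its mean moves toward the region favoured by the scoring function, which is the sense in which a set of samples approximates a target distribution.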
Paper Citation
in Harvard Style
Mhalla A., Chateau T., Gazzah S. and Essoukri Ben Amara N. (2017). Specialization of a Generic Pedestrian Detector to a Specific Traffic Scene by the Sequential Monte-Carlo Filter and the Faster R-CNN. In Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2017) ISBN 978-989-758-225-7, pages 17-23. DOI: 10.5220/0006097900170023
in Bibtex Style
@conference{visapp17,
author={Ala Mhalla and Thierry Chateau and Sami Gazzah and Najoua Essoukri Ben Amara},
title={Specialization of a Generic Pedestrian Detector to a Specific Traffic Scene by the Sequential Monte-Carlo Filter and the Faster R-CNN},
booktitle={Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2017)},
year={2017},
pages={17-23},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006097900170023},
isbn={978-989-758-225-7},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, (VISIGRAPP 2017)
TI - Specialization of a Generic Pedestrian Detector to a Specific Traffic Scene by the Sequential Monte-Carlo Filter and the Faster R-CNN
SN - 978-989-758-225-7
AU - Mhalla A.
AU - Chateau T.
AU - Gazzah S.
AU - Essoukri Ben Amara N.
PY - 2017
SP - 17
EP - 23
DO - 10.5220/0006097900170023