Comparing Local Descriptors and Bags of Visual Words to Deep Convolutional Neural Networks for Plant Recognition

Pornntiwa Pawara, Emmanuel Okafor, Olarik Surinta, Lambert Schomaker, Marco Wiering


The use of machine learning and computer vision methods for recognizing different plants from images has attracted considerable attention from the community. This paper compares local feature descriptors and bags of visual words, combined with different classifiers, to deep convolutional neural networks (CNNs) on three plant datasets: AgrilPlant, LeafSnap, and Folio. To this end, we study both trained-from-scratch and fine-tuned versions of the GoogleNet and AlexNet architectures and compare them to a local feature descriptor with k-nearest neighbors and to the bag of visual words with the histogram of oriented gradients combined with either support vector machines or multi-layer perceptrons. The results show that the deep CNN methods outperform the hand-crafted features. The CNN techniques also learn well on a relatively small dataset, Folio.



Paper Citation

in Harvard Style

Pawara P., Okafor E., Surinta O., Schomaker L. and Wiering M. (2017). Comparing Local Descriptors and Bags of Visual Words to Deep Convolutional Neural Networks for Plant Recognition. In Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-758-222-6, pages 479-486. DOI: 10.5220/0006196204790486

in Bibtex Style

author={Pornntiwa Pawara and Emmanuel Okafor and Olarik Surinta and Lambert Schomaker and Marco Wiering},
title={Comparing Local Descriptors and Bags of Visual Words to Deep Convolutional Neural Networks for Plant Recognition},
booktitle={Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},

in EndNote Style

JO - Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Comparing Local Descriptors and Bags of Visual Words to Deep Convolutional Neural Networks for Plant Recognition
SN - 978-989-758-222-6
AU - Pawara P.
AU - Okafor E.
AU - Surinta O.
AU - Schomaker L.
AU - Wiering M.
PY - 2017
SP - 479
EP - 486
DO - 10.5220/0006196204790486