Identiﬁcation of Diseases in Corn Leaves using Convolutional Neural

Networks and Boosting

Prakruti Bhatt, Sanat Sarangi, Anshul Shivhare, Dineshkumar Singh and Srinivasu Pappula

TCS Research and Innovation, Mumbai, India

Keywords:

Disease Classiﬁcation, Adaptive Boosting, Ensemble Classiﬁer, CNN Features.

Abstract:

Precision farming technologies are essential for a steady supply of healthy food for the increasing population

around the globe. Pests and diseases remain a major threat and a large fraction of crops are lost each year

due to them. Automated detection of crop health from images helps in taking timely actions to increase yield

while helping reduce input cost. With an aim to detect crop diseases and pests with high conﬁdence, we use

convolutional neural networks (CNN) and boosting techniques on Corn leaf images in different health states.

The queen of cereals, Corn, is a versatile crop that has adapted to various climatic conditions. It is one of the

major food crops in India along with wheat and rice. Considering that different diseases might have different

treatments, incorrect detection can lead to incorrect remedial measures. Although CNN based models have

been used for classiﬁcation tasks, we aim to classify similar looking disease manifestations with a higher

accuracy compared to the one obtained by existing deep learning methods. We have evaluated ensembles

of CNN based image features, with a classiﬁer and boosting in order to achieve plant disease classiﬁcation.

Using an ensemble of Adaptive Boosting cascaded with a decision tree based classiﬁer trained on features from

CNN, we have achieved an accuracy of 98% in classifying the Corn leaf images into four different categories

viz. Healthy, Common Rust, Late Blight and Leaf Spot. This is about 8% improvement in classiﬁcation

performance when compared to CNN only.

1 INTRODUCTION

Convolutional Neural networks (CNN) based deep

learning methods are proving quite useful for im-

age classiﬁcation tasks as they can learn the high

level features effectively. CNN’s have made tremen-

dous advances in computer vision tasks especially

in object classiﬁcation (He et al., 2016; Chollet,

2016; Szegedy et al., 2016; Simonyan and Zisser-

man, 2014). Considering Large Scale Visual Recog-

nition Challenge (Russakovsky et al., 2015) based

on ImageNet dataset (Deng et al., 2009), the bench-

mark for error rates, CNN models have achieved the

lowest error rate of 3.57% (He et al., 2016) which

is comparable to human error rate. It is also ob-

served (Sharif Razavian et al., 2014) that extracting

features of a new dataset from a deep network pre-

trained on ImageNet database (Deng et al., 2009) and

training Support Vector Machine (SVM) (Cortes and

Vapnik, 1995) using these features performs better

classiﬁcation than other complex supervised classi-

ﬁcation approaches. This motivates us to leverage

the high level features extracted from the trained con-

volutional neural networks which have been recently

indicated to be very robust (Yosinski et al., 2014).

Sharada Mohanty et. al in (Mohanty et al., 2016)

have performed supervised leaf disease classiﬁcation

with 99.35% accuracy by ﬁne tuning the top layer of

CNN models with a dataset taken in near ideal condi-

tions. Erika Fujita et. al in (Fujita et al., 2016) have

proposed a CNN based classiﬁer trained on cucum-

ber viral diseases and achieved 82.3% average classi-

ﬁcation accuracy. We have explored the possibility of

using deep CNN model pre-trained on the ImageNet

database with 1000 classes of over 14 million im-

ages to extract the features of corn leaf images. Var-

ious methods are evaluated to develop a solution that

gives the most accurate recognition results especially

in similar looking disease manifestations.

We propose a system where the images are clas-

siﬁed using features from a convolutional neural net-

work pretrained on ImageNet data and then boosting

is applied to accurately differentiate between similar

looking classes in accordance with the confusion ma-

trix. Classiﬁcation performance of features from dif-

ferent CNN architectures viz. VGG-16, Inception-v2,

894

Bhatt, P., Sarangi, S., Shivhare, A., Singh, D. and Pappula, S.

Identiﬁcation of Diseases in Corn Leaves using Convolutional Neural Networks and Boosting.

DOI: 10.5220/0007687608940899

In Proceedings of the 8th International Conference on Pattern Recognition Applications and Methods (ICPRAM 2019), pages 894-899

ISBN: 978-989-758-351-3

ResNet-50 and MobileNet-v1 used with 3 different

classiﬁers viz. Softmax, Random Forest and SVM

have been presented in this paper. We achieved a rea-

sonably good performance by using Adaptive boost-

ing (AdaBoost) after getting class probabilities from a

decision tree based classiﬁer. Features extracted from

a pre-trained Inception-v2 network used along with

this ensemble gave the highest accuracy. The ten-

sorﬂow (Abadi et al., 2015) implementation of CNN

models has been used to extract the feature vector

of the images. Scikit learn library (Pedregosa et al.,

2011) has been used for application of Random forest

and AdaBoost methods.

2 DATASET AND

PREPROCESSING

We have utilized corn leaf images from PlantVillage

dataset (Hughes and Salath

e, 2015) for 4 health con-

ditions, viz. Healthy, Common Rust, Late Blight and

Leaf Spot. 500 images have been randomly taken

from each class. Data augmentation with rotation,

ﬂipping and addition of salt and pepper noise have

been done to avoid overﬁtting of the model and get

better accuracy on the images taken in different con-

ditions. This resulted in 2000 images of each class.

Augmentation helped increase the data quality as well

as quantity of images for training the classiﬁcation

models. Figure 1 shows the four classes of the im-

ages we have considered from the database. Images

were resized according the the input size of the neu-

ral networks. We train and evaluate the models after

performing normalization on the image data. Nor-

malization of every image is performed for scaling

the data to an acceptable range for the network. Im-

age normalization results in contrast stretching, so it

also enhances the poor contrast images in the dataset.

Mean subtraction centers the data around zero mean

for each channel and normalization binds the range

of the image data values, thus helping the network

to learn faster since gradients act uniformly for each

channel as well as for all image data values. The im-

ages are resized according to the input size require-

ments of the CNN models. For VGG-16, MobileNet-

v1 and ResNet-50, images are resized to 224x224x3,

and for Inception-v2 they are resized to 299x299x3.

(a) (b) (c) (d)

Figure 1: Corn leaf Images from the PlantVillage database

(a) Common Rust (b) Healthy (c) Late Blight (d) Leaf Spot.

3 CNN FEATURES OF IMAGE

DATA

A CNN is made up of an arrangement of convolu-

tional layers that can be seen as a linear transforma-

tion over the image, followed by activation layer to

add non linearity in the network and then the pooling

layer to reduce the propagation of the redundancy in

the image in consecutive layers. A convolution layer

in CNN extracts features of an input image while

preserving spatial relation between pixels by using a

small matrix that strides over the input image. This

resulting image is called an Activation map or a Fea-

ture map. Rectiﬁed Linear Unit (ReLU), an element

wise activation function max(0,x) replaces all negative

pixel values in the feature map by zero. Activation

functions introduce non-linearity in the CNN as most

of real-world data that CNN would be used to learn

is non-linear. Spatial Pooling, i.e. downsampling is

applied on the feature map after ReLU to reduce the

dimensionality. This reduces the number of parame-

ters and computations in the network thereby reduc-

ing overﬁtting (Krizhevsky et al., 2012). It makes the

feature invariant to scaling and small distortions in the

input image. The last layer of a CNN is a Fully Con-

nected (FC) neural network layer. Adding FC helps

the network to learn the non-linear combination of

features computed from convolutional layers. The FC

layer is followed by an average or a max pooling layer

for a classiﬁcation task.

We evaluate performance of features from VGG-

16 (Simonyan and Zisserman, 2014), Inception-

v2 (Szegedy et al., 2016), ResNet-50 (He et al., 2016)

and MobileNet-v1 (Howard et al., 2017) trained on

ImageNet database for classiﬁcation of corn leaf

health state, as it has been seen that the models trained

on this vast database generalize well on other datasets

too after transfer learning (Zeiler and Fergus, 2014).

VGG-16 is a sequential CNN with 8 convolutional

layers having different number of ﬁlters with 3 × 3

receptive ﬁelds. Inception-v2 has blocks of multiple

ﬁlters that are applied on the same tensor and then

concatenated at the output of each block. It can be

termed as a CNN made up of small convolutional

Identiﬁcation of Diseases in Corn Leaves using Convolutional Neural Networks and Boosting

895

Figure 2: Proposed ensemble method of classiﬁcation.

modules. ResNet-50 is 50 layered deep CNN with

residual blocks which can be termed as shortcut con-

nections between the layers. These residual connec-

tions help in combating the problem of vanishing gra-

dient in case of networks with large number of lay-

ers, thus helping in better training of the network and

increasing the accuracy. In MobileNet, the normal

convolution is replaced by depthwise convolution fol-

lowed by pointwise convolution. This is called depth-

wise separable convolution and signiﬁcantly reduces

the number of parameters compared to the normal

convolutions for a network with the same depth.

4 METHOD

We perform feature extraction by forward passing

an image through a trained convolutional neural net-

work. These features are then fed to the classiﬁcation

module in order to accurately classify it into one of the

four classes of corn health condition. In case the con-

ﬁdence level of the classiﬁcation result i.e. the prob-

ability of predicted class is not satisfactorily high, the

boosting method is utilized in the cascade to conﬁ-

dently predict the correct class. Figure 2 illustrates

the classiﬁcation approach that we have used in or-

der to get maximum accuracy in predicting the cor-

rect health condition from the corn leaf image. If the

class label is denoted as {c

}

i=1

where C is the total

number of classes, the classiﬁcation output would be

the C length array P of probabilities with which the

image belongs to each class. It can be denoted as P

= {p

}

i=1

for {p

= p(c

/ f

,W )}

i=1

where f

are the

features of the image, p is probability of each class

and W denotes the classiﬁer parameters. As f

are

obtained using a neural network, considering all the

network layers as a non-liner transformation of im-

age pixels x, we can denote f

= W

x + b

where W

and b

represent CNN model parameters. Hence the

probability or the conﬁdence of classiﬁcation depends

on the CNN models for feature extraction as well as

classiﬁcation.

4.1 Feature Extraction

Robust feature extraction is one of the most impor-

tant steps in order to achieve high classiﬁcation ac-

curacy in crop images because there can be a lot of

variations within the images of single class. These

variations can be due to different severity levels of

diseases or pests, changes in light conditions, varia-

tions in size of the leaves and different growth stages

of the crops. Hence, we evaluate different types of

convolutional neural networks as feature extractors

and different classiﬁers to classiﬁy health condition of

corn leaves. Augmentation of image dataset is done

in order to incorporate the variations in the images

that would be captured in uncontrolled conditions.

The top layers of deep CNNs - VGG-16, Inception-

v2 and MobileNet-v1 pre-trained on ImageNet data

have been re-trained with images from each class cor-

responding to Healthy, Common Rust, Late Blight,

Leaf Spot conditions of corn leaves. As the con-

sidered CNN models have been trained on a large

and varied database, they are seen to generalize well

on other datasets too for classiﬁcation using trans-

fer learning (Zeiler and Fergus, 2014)). This helps

us to utilize the optimal weights of deep architec-

tures learned through large visual data. While train-

ing these CNN models, the softmax layer in each of

these is replaced by a 4-neuron softmax layer. Then

the weights of lower layers are ﬁxed and the top lay-

ers of the network are ﬁne-tuned by the corn leaf im-

ages. In case of VGG-16, for example, output of the

pooling layer on top of the other networks is taken

as the feature vector because higher levels of network

learn generalized features. The topmost convolutional

block before the max pooling and the three FC lay-

ers in VGG-16 were retrained, and output of topmost

FC layer with 1000 neurons is taken as feature vec-

tor. For Inception-v2, ResNet-50 and MobileNet-v1,

the last convolutional block and the FC layer are re-

trained and output of average pooling layer before the

last FC layer is taken as the feature vector.

4.2 Classiﬁcation

The standard classiﬁcation method used with a CNN

is using a dense layer of Fully Connected (FC) neu-

rons and a softmax layer in order to get the probabil-

ity that the given image belongs to a particular class.

Adding FC layer helps the network to learn the non

linear combination of features computed from convo-

lutional layers followed by pooling for classiﬁcation.

The softmax layer at output of FC layer ensures that

sum of output probabilities is 1. The softmax function

takes arbitrary-sized real-valued vector and outputs

ICPRAM 2019 - 8th International Conference on Pattern Recognition Applications and Methods

896

(a) (b) (c) (d)

Figure 3: Confusion matrices for CNN based classiﬁers without boosting: (a) VGG-16 (b) Inception-v2 (c) ResNet-50 (d)

MobileNet-v1.

a probability vector of size [1x number of classes].

Number of neurons of softmax layer is equal to num-

ber of classes C. However, features from the trained

CNN can also be fed into other classiﬁcation algo-

rithms like Support Vector Machine (SVM) (Hearst,

1998) or Random Forest (RF) (Breiman, 2001) in

place of using a softmax classiﬁer. We evaluated three

methods for classiﬁcation on the features extracted

from the neural network: Softmax, Random Forest,

and Support Vector Machine.

Random Forest classiﬁer builds multiple decision

trees and merges them to get a more accurate and

stable prediction. While training, it sets a stopping

criteria for node splits resulting in utilization of the

entire feature space with a control in correlation be-

tween the trees. This helps in managing the trade-off

between bias and variance. While RF is based on de-

cision trees, SVM is a linear classiﬁer, based on the

idea of getting a best hyperplane to divide the data in

two classes. The hyperplane with the greatest possible

margin between itself and any point within the train-

ing set is considered to be best as it has a higher proba-

bility of new data being classiﬁed correctly. Different

kernels like Radial Basis Function (RBF) can be used

to map a higher dimensional data in a space where

a linear separation is possible. The best performing

classifer is then followed by boosting for increasing

the overall accuracy when used as an ensemble.

4.3 Boosting

Boosting helps the base classiﬁer to form a strong

rule for separation between the classes. We have used

Adaptive booosing (AdaBoost) (Freund and Schapire,

1997) on top of the base classiﬁcation algorithm i.e.

softmax, RF or SVM in this case to increase the accu-

racy between the two weakly classiﬁed classes. Ad-

aboost is best used to boost the performance of de-

cision trees as it is a sequential ensemble that aims

to convert a set of weak classiﬁers or learners into a

strong one. Each learner is added sequentially while

training and trained using adaptively weighted train-

ing data. Every learner is assigned a weight and

a more accurate one is given a higher weight. It-

eratively, the learner(s) are added till the limit is

reached or the accuracy stops increasing. Initially,

equal weight is given to each image feature and if the

prediction is incorrect in the ﬁrst stage then a higher

weight is given to such an image in the next iterations.

So the idea is that the weights of classifers as well as

the data points are set in such a way that the weightage

of the classiﬁers is more on the points that are difﬁcult

to classiﬁy. If none of the output probability using the

base classiﬁer (CNN model with RF) exceeds the con-

ﬁdence level of 50%, i.e. if max(p) < 0.5, we use the

next level of adaptive boosting to increase the conﬁ-

dence of classiﬁcation and assign a probable health

condition to the input image.

5 RESULTS AND EVALUATION

For our experiments, we retrain the models using

transfer learning as mentioned in Sec. 3. The dataset

has been split into 3 sets viz. training, cross vali-

dation and test. Using test images which the neural

network has never seen, we get a more generalized

measure of classiﬁcation accuracy whereas cross val-

idation data is used to tune the network parameters to

prevent over-ﬁtting or bias while training. The CNN

models have been ﬁne-tuned with the corn leaf im-

ages in a batch of 32 for every iteration using SGD

optimizer with the learning rate of 0.001 with Ne-

strov momentum. We evaluated these re-trained CNN

models for classiﬁcation accuracy obtained on same

test image set for all 4 classes with corn leaf health

conditions. A total of 2000 images were taken from

PlantVillage dataset where 1600 were used for train-

ing and the rest for validation. The 244 test images

were taken randomly and not used for training and

validation of the classiﬁcation models. The test data

Identiﬁcation of Diseases in Corn Leaves using Convolutional Neural Networks and Boosting

897

Figure 4: Confusion matrix: SVM on Inception-v2 features.

Table 1: Classiﬁcation accuracy (%) for different methods.

Architecture Softmax Random Forest SVM

VGG-16 85 87 85

Inception-v2 86 90 87

ResNet-50 73 72 80

MobileNet-v1 80 82 82

had 55 images of Healthy and Common Rust condi-

tion each, 67 images affected by Late Blight and Leaf

Spot each. Equivalent scores for such test data also

show that the models do not suffer from bias or over-

ﬁtting.

Apart from Softmax, we evaluated RF and SVM

that take features extracted from CNNs as input and

classify them into 4 classes. Table 1 shows the

average accuracy obtained on same test data when

we used VGG-16, Inception-v2, MobileNet-v1 and

ResNet-50 for classiﬁcation using Softmax, RF and

SVM classiﬁers. It is seen that for images taken in

different conditions like size, resolution, angle and

brightness the solution that uses Inception-v2 for fea-

ture extraction and Random Forest with 100 decision

trees for classiﬁcation gives the average classﬁcation

accuracy of 90% which is maximum of all. Once

we observed that RF performs better than Softmax,

we also experimented to classiﬁy CNN features with

SVM classiﬁer. We used RBF kernel SVM with pa-

rameters ‘C = 1.0’ and ‘gamma = 0.1’ values and

obtained about 87% accuracy with Inception-v2 fea-

tures, which is lower than that of RF. The confusion

matrix for SVM classifer over Inception-v2 features

for comparison is shown in Figure 4 while that for RF

over same features is shown in Figure 3(b). Hence

we selected a model based on features from Inception

and RF to develop an ensemble for classiﬁcation.

It was observed from confusion matrix for every

classiﬁer as seen in Figure 3 as well as Figure 4 that

for all of the classiﬁcation methods, there is most

Table 2: Classiﬁcation scores for corn crop health using Ad-

aboost.

Leaf state Precision Recall f1-score

Healthy 1 1 1

Common Rust 0.98 0.98 0.98

Leaf Spot 0.96 0.95 0.94

Late Blight 0.97 0.98 0.96

Total 0.97 0.98 0.97

confusion in differentiating between Late Blight and

Leaf Spot. We have evaluated accuracy over differ-

ent CNN architectures as well as classiﬁcation meth-

ods and then selected the one with highest accuracy

to ensemble with AdaBoost. After adding Adaboost

on the next level after Inception-v2 features classiﬁed

by Random Forest, the accuracy increased to 98% be-

cause the classiﬁcation accuracy between Leaf Spot

and Late Blight increased through boosting as seen in

Figure 5. We used adaptive boosting with 100 deci-

sion tree based estimators with learning rate of 1.0.

Table 2 shows the precision, recall and F1-score of

the proposed ensemble method.

Figure 5: Confusion matrix: Classiﬁcation with the pro-

posed ensemble with CNN, RF and AdaBoost.

6 CONCLUSION AND FUTURE

WORK

Through the proposed system utilizing transferability

of CNN features along with boosting, we achieved

a test accuracy of 98% with classiﬁcation score of

{precision, recall, f1-score} = {0.97, 0.98, 0.97} in

automated crop state diagnosis of corn leaves. Data

augmentation to increase the variety in the training

image set also helped in extraction of robust features,

thus resulting in better classiﬁcation accuracy on dif-

ferent images. Hence, along with the basic classiﬁca-

ICPRAM 2019 - 8th International Conference on Pattern Recognition Applications and Methods

898

tion techniques, using the features from CNN trained

on augmented data, and ensembling with AdaBoost

on similar looking classes seems to be a promising

solution to automate the crop health diagnosis. This

would help farmers and agriculture experts to take

faster actions. Appropriate models can be selected

based on the accuracy and computational efﬁciency.

REFERENCES

Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z.,

Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin,

M., Ghemawat, S., Goodfellow, I., Harp, A., Irving,

G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kud-

lur, M., Levenberg, J., Man

e, D., Monga, R., Moore,

S., Murray, D., Olah, C., Schuster, M., Shlens, J.,

Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Van-

houcke, V., Vasudevan, V., Vi

egas, F., Vinyals, O.,

Warden, P., Wattenberg, M., Wicke, M., Yu, Y., and

Zheng, X. (2015). TensorFlow: Large-scale machine

learning on heterogeneous systems. Software avail-

able from tensorﬂow.org.

Breiman, L. (2001). Random forests. Mach. Learn.,

45(1):5–32.

Chollet, F. (2016). Xception: Deep Learning with

Depthwise Separable Convolutions. arXiv preprint

arXiv:1610.02357.

Cortes, C. and Vapnik, V. (1995). Support-vector networks.

Machine learning, 20(3):273–297.

Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei,

L. (2009). Imagenet: A large-scale hierarchical image

database. In IEEE Conference on Computer Vision

and Pattern Recognition (CVPR), pages 248–255.

Freund, Y. and Schapire, R. E. (1997). A decision-theoretic

generalization of on-line learning and an application

to boosting. Journal of Computer and System Sci-

ences, 55(1):119 – 139.

Fujita, E., Kawasaki, Y., Uga, H., Kagiwada, S., and Iy-

atomi, H. (2016). Basic investigation on a robust and

practical plant diagnostic system. In 15th IEEE Inter-

national Conference on Machine Learning and Appli-

cations (ICMLA), pages 989–992.

He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep resid-

ual learning for image recognition. In Proceedings of

the IEEE Conference on Computer Vision and Pattern

Recognition, pages 770–778.

Hearst, M. A. (1998). Support vector machines. IEEE In-

telligent Systems, 13(4):18–28.

Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D.,

Wang, W., Weyand, T., Andreetto, M., and Adam,

H. (2017). Mobilenets: Efﬁcient convolutional neu-

ral networks for mobile vision applications. CoRR,

abs/1704.04861.

Hughes, D. P. and Salath

e, M. (2015). An open ac-

cess repository of images on plant health to en-

able the development of mobile disease diagnostics

through machine learning and crowdsourcing. CoRR,

abs/1511.08060.

Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). Im-

ageNet Classiﬁcation with Deep Convolutional Neu-

ral Networks. In Advances in Neural Information Pro-

cessing Systems 25, pages 1097–1105.

Mohanty, S. P., Hughes, D. P., and Salath

e, M. (2016). Us-

ing Deep Learning for Image-Based Plant Disease De-

tection. Frontiers in Plant Science, 7:1419.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V.,

Thirion, B., Grisel, O., Blondel, M., Prettenhofer,

P., Weiss, R., Dubourg, V., Vanderplas, J., Passos,

A., Cournapeau, D., Brucher, M., Perrot, M., and

Duchesnay, E. (2011). Scikit-learn: Machine learning

in Python. Journal of Machine Learning Research,

12:2825–2830.

Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S.,

Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bern-

stein, M., et al. (2015). Imagenet large scale visual

recognition challenge. International Journal of Com-

puter Vision, 115(3):211–252.

Sharif Razavian, A., Azizpour, H., Sullivan, J., and Carls-

son, S. (2014). Cnn features off-the-shelf: an as-

tounding baseline for recognition. In Proceedings of

the IEEE conference on computer vision and pattern

recognition workshops, pages 806–813.

Simonyan, K. and Zisserman, A. (2014). Very deep con-

volutional networks for large-scale image recognition.

arXiv preprint arXiv:1409.1556.

Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wo-

jna, Z. (2016). Rethinking the inception architecture

for computer vision. In Proceedings of the IEEE Con-

ference on Computer Vision and Pattern Recognition,

pages 2818–2826.

Yosinski, J., Clune, J., Bengio, Y., and Lipson, H. (2014).

How transferable are features in deep neural net-

works? In Advances in neural information processing

systems, pages 3320–3328.

Zeiler, M. D. and Fergus, R. (2014). Visualizing and under-

standing convolutional networks. In European confer-

ence on computer vision, pages 818–833. Springer.

Identiﬁcation of Diseases in Corn Leaves using Convolutional Neural Networks and Boosting

899