As can be seen from Figure 11, SVM+PCA performs
best on three types of fruit, namely Cherry 1, Lemon,
and Orange, and worst on Avocado. Many fruits are
incorrectly recognized as Cherry 1 or Cocos; however,
this problem is unlikely to have a significant impact
on real-world applications because, in practice,
Cherry 1 differs markedly in size from the other
fruits and can therefore be easily distinguished.
Figure 12: Confusion matrix of CNN (Picture credit:
Original).
As can be seen from Figure 12, the CNN model
performs very well on the fruit classification problem.
All fruits in the validation set are correctly
distinguished except for Lemon, a small portion of
which is classified as Apple Golden 1. This may be
because Apple Golden 1 looks too similar to Lemon from
some viewing angles, which prevents the model from
separating the two classes reliably.
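The reading of a confusion matrix described above (per-class accuracy on the diagonal, confusions such as Lemon predicted as Apple Golden 1 off the diagonal) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' evaluation code: the helper names and the toy labels are invented for the example and are not the paper's data.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count (true label, predicted label) pairs."""
    return Counter(zip(y_true, y_pred))


def per_class_recall(counts):
    """Recall per true class: correct predictions / all samples of that class."""
    totals = Counter()
    correct = Counter()
    for (true_label, pred_label), n in counts.items():
        totals[true_label] += n
        if true_label == pred_label:
            correct[true_label] += n
    return {label: correct[label] / totals[label] for label in totals}


def most_confused_pairs(counts, top=3):
    """Off-diagonal cells sorted by count, largest first."""
    errors = [(pair, n) for pair, n in counts.items() if pair[0] != pair[1]]
    return sorted(errors, key=lambda item: item[1], reverse=True)[:top]


# Toy labels, invented for illustration only (NOT the paper's data).
y_true = ["Lemon"] * 5 + ["Orange"] * 4 + ["Apple Golden 1"] * 4
y_pred = (["Lemon"] * 4 + ["Apple Golden 1"]
          + ["Orange"] * 4 + ["Apple Golden 1"] * 4)

counts = confusion_counts(y_true, y_pred)
recall = per_class_recall(counts)
worst = most_confused_pairs(counts, top=1)

print(recall)
print(worst)
```

Sorting the off-diagonal cells is a quick way to surface the dominant confusion in a matrix with many classes, where individual cells are hard to read by eye.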
3 DISCUSSION
Both the SVM+PCA model and the CNN model
obtained relatively good results on the same dataset.
Because the CNN model captures local spatial features
effectively and shares its weights across spatial
positions, it reached an accuracy of up to 97% and
classified the test set more accurately than the SVM
model. The SVM model is less accurate than the CNN
model, but its accuracy of 90% is still a good result.
In addition, because the SVM model has fewer
parameters, relies only on support vectors, and is
trained by convex optimization, the SVM+PCA model
needed only 2.73 s to train even on a large dataset,
which is 1/44 of the time used by the CNN model. In
terms of training time, SVM is therefore the more
efficient model. If the dataset is further expanded,
the advantage of the SVM model in training time is
expected to become more obvious.
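The trade-off stated above can be made concrete with simple arithmetic: a training time of 2.73 s that is 1/44 of the CNN's implies a CNN training time of roughly 120 s, bought for a gain of 7 percentage points in accuracy. The sketch below uses only the figures reported in the text. The `time_call` helper is a hypothetical illustration of how such wall-clock measurements could be taken, not the authors' procedure.

```python
import time

# Figures reported in the text above.
SVM_TRAIN_SECONDS = 2.73   # SVM+PCA training time
TIME_RATIO = 44            # SVM+PCA time is 1/44 of the CNN time
SVM_ACCURACY = 0.90
CNN_ACCURACY = 0.97


def implied_cnn_seconds(svm_seconds, ratio):
    """CNN training time implied by the SVM time and the reported ratio."""
    return svm_seconds * ratio


def time_call(fn, *args, **kwargs):
    """Hypothetical helper: run fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


cnn_seconds = implied_cnn_seconds(SVM_TRAIN_SECONDS, TIME_RATIO)
accuracy_gap_points = round((CNN_ACCURACY - SVM_ACCURACY) * 100, 2)

print(f"Implied CNN training time: about {cnn_seconds:.0f} s")
print(f"Accuracy gap: {accuracy_gap_points} percentage points")

# Placeholder workload standing in for a real training call.
_, elapsed = time_call(sum, range(1_000_000))
print(f"Placeholder workload took {elapsed:.4f} s")
```

In practice, `time_call` would wrap the actual training function of each model, with both models timed on the same machine and the same training split.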
In the future, in favorable environments (e.g.,
supermarkets, where the lighting is bright and the
fruits are not obscured), SVM models can help people
quickly classify and recognize fruit items. In poorer
environments (e.g., field picking, where fruits may be
obscured by leaves), the CNN model, with its higher
classification accuracy, can better help people classify
and recognize fruits, and even assist machine picking.
This study still has some shortcomings. The first is
that the dataset is neither large nor varied enough,
which narrows the range of applicability of the trained
model. If pictures of fruits in different scenarios are
introduced, such as apples in shadow or cherries
obscured by leaves, a better model can be obtained.
The second is that the available hardware was limited.
Better hardware would make it possible both to handle
larger datasets and to build more complex models.
4 CONCLUSION
This study applies an SVM model and a CNN model to
the fruit classification problem. Extensive experiments
were carried out to determine the best performance of
each model on the current dataset. For fruit
classification under favorable conditions, the SVM
model is more appropriate: although its classification
accuracy is slightly lower than that of the CNN model,
its training time is much shorter, and the accuracy of
SVM+PCA is still acceptable.
In the future, these models can be used in robots to
help people sort fruits. The same technology can also
be used on mobile phones and other smart terminal
devices to help people identify unknown fruits.
REFERENCES
C. Y. Liu, L. M. Wang, X. X. Gao, Z. J. Huang, X. Zhang,
Z. P. Zhao, …, and M. Zhang (2022). Study on
DAML 2023 - International Conference on Data Analysis and Machine Learning