Figure 5: Training metrics (accuracy and F1-score) of the three models. (Picture credit: Original).
Figure 6: Test metrics (accuracy and F1-score) of the three models. (Picture credit: Original).
As shown in Fig. 5 and Fig. 6, although Model 1 achieves the best training performance, its test performance is not the highest. Model 3 shows the worst training accuracy but reaches the highest accuracy on the test set.
4 DISCUSSION
Although Model 1 has a relatively simple and shallow structure, it still achieves fairly good recognition performance. However, although the author designed Model 2 with more layers to improve its performance, it did not produce better recognition results. Once the depth is increased further, raising the accuracy becomes more difficult: the author tried adding still more layers to Model 2, but the gains in accuracy remained small and hard to obtain. This training difficulty may be caused by vanishing gradients, since gradients backpropagated through many stacked nonlinear layers shrink toward zero, leaving the early layers almost untrained. To confirm this conjecture, the author used a residual neural network, because a residual network can eliminate the training difficulty caused by excessive depth by introducing a linear (identity) shortcut alongside the nonlinear transformation. Although Model 3 had lower accuracy and F1-score and higher loss on the training set, it achieved the best accuracy on the test set. For recognition situations that may actually occur in practice, rather than the fixed environment of a training set, Model 3 therefore performs better than the previous two models. This result shows that using residual connections is more effective than simply adding convolutional layers.
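To illustrate this mechanism, below is a minimal sketch of a residual block in PyTorch. The class name ResidualBlock, the layer widths, and the 100x100 input size are illustrative assumptions rather than the exact architecture of Model 3; the point is that the block adds a linear shortcut of the input to the nonlinear convolutional transformation, so gradients retain a direct path backward even through a deep stack of blocks.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Basic residual block: output = ReLU(F(x) + shortcut(x))."""

    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        # Nonlinear path: two 3x3 convolutions with batch normalization.
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3,
                               stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        # Linear path: identity, or a 1x1 convolution when shapes differ.
        self.shortcut = nn.Identity()
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1,
                          stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # The linear shortcut gives gradients a direct route around the
        # nonlinear transformation, mitigating vanishing gradients.
        return F.relu(out + self.shortcut(x))

# Example: one block applied to a batch of 100x100 RGB fruit images
# (the input resolution here is an assumption for illustration).
block = ResidualBlock(in_channels=3, out_channels=16)
y = block(torch.randn(8, 3, 100, 100))  # -> shape (8, 16, 100, 100)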
5 CONCLUSION
The residual neural network achieves better recognition results as layers are added, showing that it is more effective than a typical convolutional network. Model 2 tries to improve performance by increasing the number of convolutional layers, but this does not work. Model 3, built with residual blocks, achieves the best test accuracy and F1-score in this research. Combining the performance of the three models on datasets of small-resolution images, using a residual network is more effective than increasing the number of layers of a plain convolutional neural network. An accuracy of approximately 99.5% can satisfy the needs of most fruit classification scenarios. However, this result only demonstrates theoretical feasibility. For example, fruit images shot by cameras differ from the dataset used in this article, so more complex data preprocessing steps are required; a sketch of such a pipeline follows.
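As a minimal sketch of what such preprocessing might look like, the following torchvision pipeline is a hypothetical example: the 100x100 target resolution, the normalization statistics (ImageNet means and standard deviations as a stand-in), and the file name are all assumptions, and a real deployment would also need to match the lighting and background conditions of the training data.

from PIL import Image
from torchvision import transforms

# Hypothetical preprocessing for camera photos; the target resolution and
# normalization statistics are assumptions and must match the training data.
camera_preprocess = transforms.Compose([
    transforms.Resize(128),       # scale the photo down, keeping aspect ratio
    transforms.CenterCrop(100),   # crop to the assumed 100x100 training size
    transforms.ToTensor(),        # uint8 HWC image -> float CHW tensor in [0, 1]
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Example usage on a single photo (file name is illustrative only).
tensor = camera_preprocess(Image.open("fruit_photo.jpg").convert("RGB"))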
Fruit classification and identification technologies also still need supporting physical equipment and further experimental work to prove their practicability in agricultural production. At the same time, limited by the resolution and quantity of the dataset images, the models may not meet expectations when processing real inputs. To achieve better recognition results, larger and more comprehensive datasets are needed.