Table 4: Distance error estimation of recognised pears in different situations.

Pear condition       Separating pears            Aggregating pears
Light condition   Low     High    Overall     Low     High    Overall
x̄ₙ (m)           0.013   0.020   0.017       0.012   0.023   0.018
σₙ (m)            0.016   0.013   0.015       0.011   0.028   0.021
R²                0.834   0.884   0.896       0.848   0.812   0.832
4 CONCLUSIONS
In this paper, we proposed a method for accurate recognition and position estimation of pears in complex orchard environments, reducing the grasping errors caused by problems such as branch occlusion and pear aggregation and thereby improving the robustness of robots working in complex orchards. We also compared the performance of different deep learning algorithms for recognising separating and aggregating pears under different light intensities. The results showed that Mask R-CNN outperformed Faster R-CNN and YOLACT in recognition accuracy for both separating and aggregating pears under both high and low light conditions. In further experiments, we therefore chose Mask R-CNN as the recognition algorithm for pear position estimation and compared the error mean x̄ₙ, the error standard deviation σₙ, and the goodness of fit R² for separating and aggregating pears at distances of 0.1-0.5 m. The results showed that x̄ₙ and σₙ were significantly higher for aggregating pears than for separating pears under the same conditions, and that R² exceeded 0.8 in all cases. The proposed method therefore achieves accurate recognition and position estimation of pears within the range of 0.1-0.5 m, which substantially supports precise pear picking by agricultural fruit-picking robots.
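For readers reproducing the evaluation, the following minimal Python sketch shows one plausible way to compute x̄ₙ, σₙ, and R² from ground-truth and estimated pear distances. The sample arrays, the use of absolute errors, and the direct computation of R² between estimated and ground-truth distances are illustrative assumptions; the paper does not restate the exact formulas.

```python
import numpy as np

# Hypothetical measurements: ground-truth pear distances (m) and the
# distances estimated from the stereo camera after recognition. These
# sample values are illustrative only, not data from the paper.
true_d = np.array([0.10, 0.20, 0.30, 0.40, 0.50])
est_d = np.array([0.11, 0.21, 0.32, 0.41, 0.53])

# Error mean x̄ₙ and sample standard deviation σₙ of the absolute
# distance errors (assumption: errors are taken as absolute values).
errors = np.abs(est_d - true_d)
x_bar_n = errors.mean()
sigma_n = errors.std(ddof=1)

# Goodness of fit R² = 1 - SS_res / SS_tot, computed here directly
# between estimated and ground-truth distances (assumption: the paper
# may instead fit a regression line to the measurements).
ss_res = np.sum((est_d - true_d) ** 2)
ss_tot = np.sum((true_d - true_d.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(f"x̄ₙ = {x_bar_n:.3f} m, σₙ = {sigma_n:.3f} m, R² = {r_squared:.3f}")
```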
ACKNOWLEDGEMENTS
The authors would like to thank the Tsukuba Plant
Innovation Research Center (T-PIRC), University of
Tsukuba, for providing facilities for conducting this
research in its orchards. This work was also supported
by JST SPRING, Grant Number JPMJS2124.
REFERENCES
Bargoti, S. and Underwood, J. (2017). Deep fruit detection in orchards. In 2017 IEEE International Conference on Robotics and Automation (ICRA), pages 3626–3633. IEEE.
Bechar, A. and Vigneault, C. (2016). Agricultural robots for field operations: Concepts and components. Biosystems Engineering, 149:94–111.
Bolya, D., Zhou, C., Xiao, F., and Lee, Y. J. (2019). YOLACT: Real-time instance segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9157–9166.
Condotta, I. C., Brown-Brandl, T. M., Pitla, S. K., Stinn, J. P., and Silva-Miranda, K. O. (2020). Evaluation of low-cost depth cameras for agricultural applications. Computers and Electronics in Agriculture, 173:105394.
Girshick, R. (2015). Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, pages 1440–1448.
Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2015). Region-based convolutional networks for accurate object detection and segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(1):142–158.
He, K., Gkioxari, G., Dollár, P., and Girshick, R. (2017). Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, pages 2961–2969.
Kirkland, E. J. (2010). Bilinear interpolation. Advanced Computing in Electron Microscopy, pages 261–263.
Koirala, A., Walsh, K. B., Wang, Z., and McCarthy, C. (2019). Deep learning – method overview and review of use for fruit detection and yield estimation. Computers and Electronics in Agriculture, 162:219–234.
Ortiz, L. E., Cabrera, E. V., and Gonçalves, L. M. (2018). Depth data error modeling of the ZED 3D vision sensor from Stereolabs. ELCVIA: Electronic Letters on Computer Vision and Image Analysis, 17(1):0001–15.
Sa, I., Ge, Z., Dayoub, F., Upcroft, B., Perez, T., and McCool, C. (2016). DeepFruits: A fruit detection system using deep neural networks. Sensors, 16(8):1222.
Saito, T. (2016). Advances in Japanese pear breeding in Japan. Breeding Science, 66(1):46–59.
Tran, T. M., Ta, K. D., Hoang, M., Nguyen, T. V., Nguyen, N. D., and Pham, G. N. (2020). A study on determination of simple objects volume using ZED stereo camera based on 3D-points and segmentation images. International Journal, 8(5).
Zhang, Y.-D., Dong, Z., Chen, X., Jia, W., Du, S., Muhammad, K., and Wang, S.-H. (2019). Image based fruit category classification by 13-layer deep convolutional neural network and data augmentation. Multimedia Tools and Applications, 78:3613–3632.