Visual Servoing for Vine Pruning Based on Point Cloud Alignment
Fadi Gebrayel, Martin Mujica and Patrick Danès
LAAS-CNRS, Université de Toulouse, CNRS, UPS, Toulouse, France
{fadi.gebrayel, martin.mujica, patrick.danes}@laas.fr
Keywords: Visual Servoing, Computer Vision, ICP, Vine Pruning, Agriculture Robotics.
Abstract: This paper addresses the challenge of vine pruning, a crucial and laborious task in agriculture, using robotic technologies and vision-based feedback control. The complex structure of vines makes visual servoing difficult due to challenges in 3D pose estimation and feature extraction. A novel approach to vision-based vine pruning is proposed, based on the combination of Iterative Closest Point (ICP) point-cloud alignment and position-based visual servoing (PBVS). Four ICP variants are compared within PBVS in vine pruning scenarios: standard ICP, Levenberg–Marquardt ICP, Point-to-Plane ICP, and Symmetric ICP. The methodology includes a dedicated ICP initial guess to improve alignment speed and accuracy, as well as a procedure for generating reference point clouds at pruning locations. Live experiments were conducted on a Franka Emika manipulator equipped with a stereo camera, involving three real vines under laboratory conditions.
1 INTRODUCTION
Pruning is a critical task in vineyard management,
which determines the quality and yield of the harvest.
Traditionally, this task is laborious, time-consuming,
and uncomfortable for workers, as it must be per-
formed in cold weather and requires awkward pos-
tures. While robotic solutions have been proposed for pruning, fully autonomous systems have yet to replace manual pruning. Indeed, the complexity and variability of the structure of a vine hinder its perception as well as the planning and control of the motion of the cutting tool. To address this challenge,
robots must integrate control systems with feedback
from exteroceptive sensors, so as to adapt to environ-
mental changes such as unexpected relative motion of
the vine (due to vibrations, collisions, etc.).
The typical workflow of an autonomous vine
pruning robot consists of the following cycle: au-
tonomous movement to the next vine stock; percep-
tion of the plant, geometric and semantic modeling;
determination of pruning poses; collision-free plan-
ning and control of the cutting device(s) motion to-
wards these poses; cutting operation. However, in
real-life conditions, vibrations or collisions with un-
perceived vine elements are likely, requiring sensor-
based feedback (e.g., visual or force) to ensure suc-
cessful cutting.
In research laboratories, several initiatives have
emerged towards vine pruning robots. The proto-
type (Botterill et al., 2017) entails advanced computer
vision, machine learning and motion, but includes no
local vision-effort control of the end-effector cutting
tool. According to the authors, tests show that the
long chain of interdependent components limits fea-
sibility. The Bumblebee robot (Silwal et al., 2022)
also features such advanced techniques, together with
a robust mechanical design. A significant effort is put
on autonomous navigation, but no local exteroceptive
control of the pruning shears is envisaged. In (Yan-
dun et al., 2021), the robot’s reference movement is
generated by combining reinforcement learning and
inverse kinematics, yet the authors explicitly mention
that its execution is done in open-loop. Though target-
ing a less integrated solution, (Katyara et al., 2021)
presents perception, cutting points learning, motion
planning, but also addresses vine rods dynamics, ad-
mittance control and passivity maintenance.
As pruning is a complex task from a motion plan-
ning viewpoint, (You et al., 2020) analyzes ways
of reducing planning time and sequencing the cut-
ting points. Among common control laws, vision-based or vision-force feedback control strategies have very recently been considered in isolation, for fruit pruning (Zahid et al., 2021) or harvest-
ing (You et al., 2022; Li et al., 2022). The harvesting
technique (Gursoy et al., 2023) uses dual robotic arms
guided by visual data to detect fruits and tree trunk.
Joint velocity control is implemented through hier-
archical quadratic programming, facilitating collision
avoidance with previously identified trunks. However, visual information is not used there for feedback control. Besides, a visual servo-
ing approach is proposed in (Mehta et al., 2014) for
harvesting tasks, which features a nonlinear controller
with fruit motion compensation.
The recent work (Zhang et al., 2020) incorporates
iterative closest point (ICP) based point cloud reg-
istration into visual servoing. Though some robust-
ness to illumination changes is obtained, the consid-
ered environments are large and dense, in contrast to a
pruning scene. Therefore, the present work evaluates
ICP based visual servoing strategies for vine prun-
ing. The contributions are the following: endow ICP
and position based visual servoing (ICP-PBVS) with
an adaptive initial guess and real time capabilities to
overcome the challenges of 3D pose estimation in the pruning context; provide a thorough comparison of
the influence of some ICP techniques on servoing per-
formance (Besl and McKay, 1992; Fitzgibbon, 2003;
Chen and Medioni, 1992; Rusinkiewicz, 2019); pro-
pose an approach to the generation of a synthetic ref-
erence point-cloud, as well as an experimental eval-
uation on complex vine pruning tasks. Importantly,
our approach is well suited to a broader class of agri-
cultural tasks, where mobile robotic manipulators (on
tractors, etc.) must accurately reach cutting/gripping
points.
In the sequel, an accurate 3D model of the vine is available a priori (through point-cloud registration (Choi et al., 2015) or along the lines of Section 4.4.3). It in-
cludes the reference (desired) pruning poses (e.g.,
through Simonit & Sirch vine pruning rules). The
aim is to successfully position the robot pruner end-
of-arm tooling (EOT) in spite of likely disturbances:
vine motion due to past small collisions, platform vi-
brations, among others. Nonrestrictively, local vine
rigidity around pruning poses is assumed.
Sections 2 and 3 next provide theoretical background on ICP-PBVS and detail the algorithm. Section 4 outlines and analyzes experimental results on a Franka Emika Panda 7-DOF manipulator robot equipped with an eye-in-hand stereo camera. Section 5 concludes the paper.
2 FUNDAMENTALS
2.1 Visual Servoing
Visual servoing can be stated as the regulation to zero of the error (Chaumette and Hutchinson, 2006)

$$e(t) = s(m(t), a) - s^*, \tag{1}$$

with: $s$ the vector of features used for feedback; $s^*$ its reference value; $m(t)$ a vector of image measurements (e.g., from an eye-in-hand camera); $a$ a vector of additional known parameters (camera, objects...).
When $s$ is the camera-to-target relative pose, the control strategy is termed "position based visual servoing" (PBVS). It requires a pose estimation algorithm. Conversely, visual servoing can be specified without resorting to pose estimation, by building $s$ and $s^*$ with the current and reference values of 2D features extracted from the image. The consequent "image based visual servoing" (IBVS) is known to be computationally cheaper and to offer higher robustness to calibration errors. Yet, its design may be difficult, e.g., when feature extraction/matching is troublesome.
Consider the positioning of a robot pruner EOT using visual feedback from an eye-in-hand camera. Dynamics of the camera (or EOT) are often neglected, so the control input is set to the camera velocity screw

$$v = [v_c, \omega_c]^T, \tag{2}$$

i.e., its translation and rotation velocities (expressed in its frame). The standard controller synthesis
method is as follows: (i) set the open-loop model

$$\dot{s} = L v, \tag{3}$$

where the so-called interaction matrix $L$ is generally parameterized by $s$ and some 3D pose variables; (ii) select the ideal closed-loop dynamics of $e$, e.g., the stable, first-order linear and decoupled equation

$$\dot{e} = -\lambda e, \quad \lambda \in \mathbb{R}^+; \tag{4}$$

(iii) deduce the feedback control law

$$v = -\lambda \hat{L}^+ e, \tag{5}$$

with $\hat{L}$ a matrix gain that approximates $L$, and $\hat{L}^+$ its pseudo-inverse. Potential pitfalls of such approaches are well-known: for PBVS, the induced linear path in the space of pose variables may make the targets entailed in pose estimation leave the sensor field-of-view; in IBVS, the exponential decoupled convergence of 2D features to their reference values may imply unacceptable 3D motion of the camera.
The reference camera frame $F_{c^*}$, the current camera frame $F_c$, and the object frame $F_o$ are central to classical PBVS. In the sequel, ${}^{i}t_{j}$ stands for the expression in $F_i$ of the translation vector from the origin of $F_i$ to the origin of $F_j$, with $i, j \in \{c^*, c, o\}$. Similarly, ${}^{i}R_{j}$ and ${}^{i}H_{j} = \begin{bmatrix} {}^{i}R_{j} & {}^{i}t_{j} \\ 0 & 1 \end{bmatrix}$ stand for the rotation and homogeneous matrices associated to the rigid transform that turns $F_i$ into $F_j$. Defining $(\theta, u)$ as the (angle, unit vector) pair equivalent to ${}^{c^*}R_{c}$, one sets

$$s = ({}^{c^*}t_{c}, \theta u), \quad s^* = 0, \quad e = s. \tag{6}$$
The poses depicted by ${}^{c}H_{o}$ and ${}^{c^*}H_{o}$ must be computed from visual data, in order to deduce ${}^{c^*}H_{c}$. For the above selection of $s$, the interaction matrix $L$ comes as (Chaumette and Hutchinson, 2006)

$$L = \begin{bmatrix} {}^{c^*}R_{c} & 0 \\ 0 & L_{\theta u} \end{bmatrix}; \quad L_{\theta u} = I_3 - \frac{\theta}{2}[u]_{\times} + \left(1 - \frac{\operatorname{sinc}\theta}{\operatorname{sinc}^2\frac{\theta}{2}}\right)[u]_{\times}^2, \tag{7}$$

which leads to the conventional PBVS feedback

$$v_c = -\lambda\, {}^{c^*}R_{c}^{T}\, {}^{c^*}t_{c}, \quad \omega_c = -\lambda\, \theta u, \tag{8}$$

with $[u]_{\times}$ the cross-product tensor associated to vector $u$.
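As an illustration of (8), the following minimal Python sketch (an assumption: NumPy and SciPy are available; this is not the ViSP-based implementation used in the experiments) computes the camera velocity screw from an estimated homogeneous transform ${}^{c^*}H_{c}$:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def pbvs_control(H_cstar_c, lam=0.5):
    """Conventional PBVS feedback (8), from the transform c*H_c.

    H_cstar_c -- 4x4 homogeneous matrix from F_c* to F_c (e.g., the
                 ICP output); lam -- gain of the ideal dynamics (4).
    """
    R = H_cstar_c[:3, :3]                 # c*R_c
    t = H_cstar_c[:3, 3]                  # c*t_c
    # SciPy's rotation vector is exactly theta * u for c*R_c.
    theta_u = Rotation.from_matrix(R).as_rotvec()
    v_c = -lam * R.T @ t                  # translational part of (8)
    omega_c = -lam * theta_u              # rotational part of (8)
    return np.hstack((v_c, omega_c))      # camera velocity screw (2)

# Example: the command for a residual 5 cm / 10 cm translation.
H = np.eye(4)
H[:3, 3] = [0.05, 0.0, 0.10]
print(pbvs_control(H))
```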
2.2 3D Point Cloud Alignment
Getting the rigid transformation from the reference camera frame $F_{c^*}$ to the actual camera frame $F_c$ can be viewed as finding the relative pose that aligns the point cloud observed at $F_c$ to that observed at $F_{c^*}$. This relative pose can be obtained through iterative point cloud alignment methods.
Iterative Closest Point (ICP). The ICP algorithm
iteratively selects corresponding points from two
point clouds and computes the relative homogeneous
transform which minimizes the sum of squared dis-
tances between pairs of matching points. A threshold on the maximum matching distance discards points with no reliable correspondent. This process is detailed in Algorithm 1.
Inputs: point cloud A = {a_j}
        point cloud B = {b_j}
        initial guess homogeneous matrix H_0
Output: homogeneous matrix H that aligns point clouds A and B
Parameters: point cloud size N, matching threshold

H ← H_0
while not converged do
    for j ← 1 to N do
        m_j ← FindClosestPointInA(H b_j)
        if ‖m_j − H b_j‖ ≤ threshold then w_j ← 1
        else w_j ← 0
    end
    H ← argmin_{H̆} J(H̆) := Σ_j w_j ‖H̆ b_j − m_j‖²
end
Algorithm 1: Standard ICP algorithm.
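For concreteness, a minimal NumPy/SciPy version of Algorithm 1 is sketched below (an illustrative re-implementation, not the PCL code used in the experiments): nearest neighbours are found with a KD-tree, and the argmin step is solved in closed form by the classical SVD-based (Kabsch) solution of the weighted point-to-point problem.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(A, B, H0=np.eye(4), threshold=0.02, iters=30):
    """Standard ICP (Algorithm 1): returns H aligning cloud B onto A.

    A, B -- (N, 3) arrays of points a_j, b_j; H0 -- initial guess;
    threshold -- maximum matching distance for the weights w_j.
    """
    tree = cKDTree(A)                       # FindClosestPointInA
    H = H0.copy()
    for _ in range(iters):
        Bh = B @ H[:3, :3].T + H[:3, 3]     # H b_j for all j
        d, idx = tree.query(Bh)             # matches m_j and distances
        w = d <= threshold                  # binary weights w_j
        if w.sum() < 3:
            break                           # too few pairs to align
        P, M = Bh[w], A[idx[w]]
        # Closed-form argmin of sum_j ||R p_j + t - m_j||^2 (Kabsch).
        Pc, Mc = P - P.mean(0), M - M.mean(0)
        U, _, Vt = np.linalg.svd(Pc.T @ Mc)
        if np.linalg.det(Vt.T @ U.T) < 0:   # guard against reflections
            Vt[-1] *= -1
        R = Vt.T @ U.T
        t = M.mean(0) - R @ P.mean(0)
        dH = np.eye(4)
        dH[:3, :3], dH[:3, 3] = R, t
        H = dH @ H                          # accumulate the update
    return H
```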
Levenberg–Marquardt ICP. The LM-ICP (Fitzgibbon, 2003) employs a general-purpose nonlinear optimization solver, the Levenberg–Marquardt algorithm, to minimize the registration error $J(\breve{H})$ supporting Algorithm 1. This improves the convergence speed of ICP without increasing the computation time.
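As a hedged sketch of this idea (assuming the correspondences $m_j$ and weights $w_j$ of a matching step are already available, e.g., from the block above), the registration error $J$ can be handed directly to SciPy's Levenberg–Marquardt solver, with $\breve{H}$ parameterized by a rotation vector and a translation:

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def lm_icp_step(B, M, w, x0=np.zeros(6)):
    """LM minimization of J = sum_j w_j ||H b_j - m_j||^2.

    B, M -- (N, 3) matched points b_j, m_j; w -- (N,) 0/1 weights;
    x = (rotation vector, translation) parameterizes H.
    """
    sw = np.sqrt(w)

    def residuals(x):
        R = Rotation.from_rotvec(x[:3]).as_matrix()
        # stacked weighted residuals, so LM sees 3N scalar terms
        return (sw[:, None] * (B @ R.T + x[3:] - M)).ravel()

    sol = least_squares(residuals, x0, method='lm')
    H = np.eye(4)
    H[:3, :3] = Rotation.from_rotvec(sol.x[:3]).as_matrix()
    H[:3, 3] = sol.x[3:]
    return H
```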
Point-to-Plane ICP. The PP-ICP (Chen and Medioni, 1992) entails the distance from any point to the plane tangent to the other point cloud at its matching point. The registration error to be minimized writes as

$$J_{\text{PP-ICP}}(\breve{H}) := \sum_j w_j \big( \eta_j \cdot (\breve{H} b_j - m_j) \big)^2 \tag{9}$$

where $\eta_j$ stands for the surface normal at $m_j$. This results in improved accuracy and faster convergence.
ICP with Symmetric Objective Function. The SYMM-ICP (Rusinkiewicz, 2019) incorporates a symmetrized version of the point-to-plane ICP registration error. In other words, each plane entailed in the objective function is defined from the tangent planes (or, equivalently, the surface normals) at both points within a corresponding pair. The registration error writes as

$$J_{\text{SYMM-ICP}}(\breve{H}) := \sum_j w_j \big( (\eta_{a,j} + \eta_{b,j}) \cdot (\breve{R} b_j - \breve{R}^{-1} a_j + \breve{t}) \big)^2 \tag{10}$$

where $\breve{t}$, $\breve{R}$ term the translation vector and rotation matrix of the homogeneous matrix decision variable $\breve{H}$, and $\eta_{a,j}$, $\eta_{b,j}$ respectively stand for the surface normals at points $a_j$, $b_j$.
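Under the same rotation-vector parameterization, only the residual changes from one variant to the next. The following sketch (assuming the normals $\eta$ have been estimated beforehand, e.g., by local plane fitting) writes the per-pair residuals of (9) and (10); either can replace the point-to-point residuals in the LM solver above:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def pp_residuals(x, B, M, eta, w):
    """Point-to-plane residuals of (9): sqrt(w_j) eta_j . (H b_j - m_j)."""
    R = Rotation.from_rotvec(x[:3]).as_matrix()
    return np.sqrt(w) * np.einsum('ij,ij->i', eta, B @ R.T + x[3:] - M)

def symm_residuals(x, A, B, eta_a, eta_b, w):
    """Symmetric residuals of (10):
    sqrt(w_j) (eta_aj + eta_bj) . (R b_j - R^{-1} a_j + t)."""
    R = Rotation.from_rotvec(x[:3]).as_matrix()
    # row-wise R b_j - R^T a_j + t (R^{-1} = R^T for a rotation)
    diff = B @ R.T - A @ R + x[3:]
    return np.sqrt(w) * np.einsum('ij,ij->i', eta_a + eta_b, diff)
```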
3 ICP BASED PBVS
Accurately estimating vine shoot poses is undeniably
challenging. Irregular shoot shapes, potential occlu-
sions, and varying lighting conditions constitute sig-
nificant hurdles. Moreover, the lack of specialized
branch models and the critical real-time processing
requirement intensify the challenge.
In the vein of (Zhang et al., 2020), a point cloud
based PBVS is introduced, where the ICP algorithm
estimates the pose that aligns the current and refer-
ence point clouds. Its schematic diagram is shown
on Figure 1. The ICP block compares a predefined
reference point cloud with the current point cloud ob-
tained through the vision system. It outputs the re-
sulting homogeneous transformation between the ref-
erence and current camera frame, possibly using an
initial guess (detailed further) to launch the compu-
tation. The PBVS block takes this matrix and, fol-
lowing the developments in Section 2.1, computes the camera
velocity screw. Finally, this control signal delivered
by the outer visual servo loop is turned into the robot
end-effector velocity screw, which constitutes the set-
point of the inner robot controllers.
[Block diagram: the vision sensor provides the current point cloud to the ICP block, which outputs c*H_ck to the PBVS block; the resulting reference velocity v_ref feeds the robot velocity controller; the base-to-camera transform B_H_ck and the one-step-delayed (z⁻¹) estimates form the ICP initial guess.]

Figure 1: Schematic diagram of the proposed ICP based PBVS.
As aforementioned, the ICP algorithm requires an
initial guess to initiate the point cloud alignment. The
default implementation sets it to the identity matrix,
which is clearly nonoptimal. An online initial guess
selection method is described below, based on an ap-
proximation of the camera pose and ICP output.
Let ${}^{c^*}H_{c_{k-1}}$ and ${}^{c^*}H_{c_k}$ stand for the homogeneous matrices from $F_{c^*}$ to the camera frames $F_{c_{k-1}}$, $F_{c_k}$ at respective visual servoing iterations $k-1$, $k$. Let ${}^{B}H_{c_{k-1}}$ and ${}^{B}H_{c_k}$ be the homogeneous matrices from the robot base frame $F_B$ to $F_{c_{k-1}}$, $F_{c_k}$. At iteration $k-1$, ICP outputs an estimate of ${}^{c^*}H_{c_{k-1}}$. Similarly, robot kinematics enables approximations to ${}^{B}H_{c_{k-1}}$ and ${}^{B}H_{c_k}$ at iterations $k-1$ and $k$. Using the same notations for approximations and genuine values of homogeneous transforms, an initial guess for ICP at iteration $k$ comes as

$${}^{c^*}H_{c_k} = {}^{c^*}H_{c_{k-1}} \left({}^{B}H_{c_{k-1}}\right)^{-1} {}^{B}H_{c_k}. \tag{11}$$
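In code, (11) reduces to a composition of 4x4 matrices; in the sketch below (variable names are hypothetical), the first argument would come from the previous ICP output and the two others from the robot's forward kinematics:

```python
import numpy as np

def icp_initial_guess(H_cstar_ck1, H_B_ck1, H_B_ck):
    """Adaptive ICP initial guess of (11).

    H_cstar_ck1     -- ICP estimate of c*H_c at iteration k-1
    H_B_ck1, H_B_ck -- base-to-camera transforms at k-1 and k
    Returns the predicted c*H_c at iteration k.
    """
    return H_cstar_ck1 @ np.linalg.inv(H_B_ck1) @ H_B_ck
```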
4 EXPERIMENTS ON REAL VINES
4.1 Experimental Setup and Procedure
This section presents the experimental analysis of
point-cloud based PBVS techniques for vine prun-
ing. The setup is composed of a Franka Emika Panda robot endowed with the ROS middleware and an eye-in-hand Realsense D405 camera. This camera relies on stereovision and is supported by ROS. The
inner Franka controllers are implemented on a real-
time kernel at 1kHz. The outer visual feedback is
implemented in free-running mode, so its rate may
vary, e.g., depending on the ICP behavior. Real vine
stocks, with increasing complexity, are used for ex-
perimental tests. As the focus is to analyze and as-
sess the proposed visual servoing framework, the en-
vironment is controlled (stable light conditions and
incorporation of a textured background, to enhance
the point cloud quality). The system runs on an In-
tel i7 processor under Ubuntu 20.04 with ROS noetic,
PCL (Holz et al., 2015) and ViSP (Marchand et al.,
2005) libraries (Figure 2).
Figure 2: Experimental setup, composed of the Franka Emika robot, the Realsense D405 camera and one vine branch.
All point clouds are built from rectified RGB im-
ages. Their complexity is reduced by downsam-
pling. Points beyond a specified depth threshold are
removed. Irregular points are also removed by a sta-
tistical outlier filter.
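A minimal version of this preprocessing chain is sketched below with the Open3D library (the experiments themselves rely on PCL; the voxel size, depth threshold and outlier-filter parameters are illustrative assumptions):

```python
import numpy as np
import open3d as o3d

def preprocess(pcd, voxel=0.005, max_depth=0.6):
    """Downsample, crop by depth, and remove statistical outliers."""
    # 1. Reduce complexity by voxel-grid downsampling.
    pcd = pcd.voxel_down_sample(voxel_size=voxel)
    # 2. Remove points beyond the depth threshold (z in camera frame).
    z = np.asarray(pcd.points)[:, 2]
    pcd = pcd.select_by_index(np.where(z <= max_depth)[0])
    # 3. Statistical outlier removal on the remaining points.
    pcd, _ = pcd.remove_statistical_outlier(nb_neighbors=20,
                                            std_ratio=2.0)
    return pcd
```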
In real-world scenarios, the reference (desired)
point cloud around each cutting pose would typically
be obtained as a fragment of the prior 3D model of the
vine. To simplify comparisons and focus on servoing, it is hereafter obtained by first positioning the camera at the desired pose.
Figure 3 illustrates an overview of the process. In
particular, Figures 3a and 3b display the starting and
reference RGB images. Importantly, these are not
used for feedback, but just illustrate the experimen-
tal conditions. In Figure 3c, the reference (obtained
in the condition of Fig. 3b) and starting (obtained in
the condition of Fig. 3a) point clouds are respectively
shown in white and in color. Finally, Figure 3d shows
how the reference and current point clouds overlap
each other after convergence.
The first experiments (Sections 4.2, 4.3) have been
run on non-dense branches (Figure 3). They share a
common point cloud and reference position. The aim
is to set a fair comparison of the ICP-PBVS meth-
ods for distinct starting camera poses. A second set
of experiments addresses denser, more complex, vine
structures (Section 4.4) as well as moving vines so as
to get closer to real-life scenarios subject to external
disturbances.
4.2 Influence of ICP Initial Guess
This section discusses the incorporation into ICP of
the initial guess suggested in Section 3. First, it has
been observed that this ICP initialization enables a
much faster outer vision-based feedback loop.

Figure 3: Overview of the experimental conditions, with the initial and reference images and their point clouds. (a) RGB image at the initial pose. (b) RGB image at the reference pose. (c) Reference and initial point clouds. (d) Reference and current point clouds at the end of the test.

The resulting increase, around 15 Hz for all proposed ICP
variations, is significant, especially considering that
the camera frequency (i.e., the maximum admissible
outer-loop frequency) is 60 Hz. Benefits include not
only real time performance, but also feedback system
accuracy or even stability. Indeed, lengthy ICP com-
putations may increase the visual servoing period,
potentially causing the robot to drift from the ideal
continuous-time closed-loop dynamics (4), especially
for dense vines. A limiting behavior may even be
reached, where the vine leaves the camera field of
view. A comparison of a servoing task in a simple
vine with and without our proposed initial guess is
shown in Figure 4, illustrating the mentioned behav-
ior. Consequently, this adaptive ICP initial guess is
henceforth incorporated in all ICP-PBVS variants.
Figure 4: ICP-PBVS with and without initial guess. (a) Translation error vector norm (m) vs. time (s). (b) Rotation error (rad) vs. time (s).
4.3 Comparison of Different ICP Methods
Table 1 summarizes the quantitative evaluation of a
four-test experiment on sparsely structured branches.
Each test includes significant translation and/or rota-
tion differences between starting and reference poses,
potentially making visual servoing fail due to differ-
ing viewpoints of the vine.
SYMM-ICP-PBVS outperforms other methods
regarding settling time in the first test. However,
PP-ICP-PBVS demonstrates faster convergence in the
second and fourth tests. So does LM-ICP-PBVS
in the third test. Convergence failures in the third
and fourth tests show that both PP-ICP-PBVS and
SYMM-ICP-PBVS encounter difficulties due to sig-
nificant initial rotation errors, while LM-ICP seems
sensitive to significant translation errors. Neverthe-
less, ICP-PBVS converges in all cases.
As for accuracy, concerning the translation and ro-
tation Final Static Error (FSE), no definitive conclu-
sion can be drawn globally, as no servoing method is
uniformly better. Translation and rotation errors dur-
ing servoing are depicted in Figure 5 for all the ICP
variants in the second test case, along with the 3D tra-
jectories. The smoothest errors are exhibited by PP-
ICP-PBVS.
The drift of the induced 3D trajectories (shown on
Figure 5c) w.r.t. the theoretical straight line can be at-
tributed to unexpected abrupt errors between the ho-
mogeneous transforms estimated by ICP (or its vari-
ants) and their genuine values, which in turn lead to
abrupt changes in the camera control input. It can also
stem from a change of rate of the outer visual feed-
back: the longer the computation of the homogeneous
matrix by the ICP, the longer the zero-order-hold of
the camera velocity screw produced by the outer con-
troller, and the less suited the resulting setpoint to the
inner robot controller. The proposed ICP initial guess
reduces this problem.
Figure 5: Second test, non-dense branch structure: comparison of different ICP-PBVS methods. (a) Translation error. (b) Rotation error. (c) Camera 3D trajectory for each ICP method.
Table 1: Comparative experiment campaign, involving a non-dense branch structure. For each test, the best results for each criterion are highlighted in green (X: no convergence).

Test 1 - initial translation error: 0.184749 (m), initial rotation error: 0.351915 (rad)
                       ICP         LM-ICP      PP-ICP      SYMM-ICP
Translation FSE (m)    0.003072    0.003333    X           0.002912
Rotation FSE (rad)     0.002162    0.003612    X           0.004765
Convergence time (s)   24.894065   33.757214   X           14.266417

Test 2 - initial translation error: 0.166275 (m), initial rotation error: 0.567889 (rad)
Translation FSE (m)    0.002969    0.002691    0.002524    0.002115
Rotation FSE (rad)     0.022047    0.021488    0.022144    0.021275
Convergence time (s)   18.307688   26.528462   17.730364   18.238572

Test 3 - initial translation error: 0.034708 (m), initial rotation error: 0.550662 (rad)
Translation FSE (m)    0.002027    0.001722    X           X
Rotation FSE (rad)     0.02138     0.022581    X           X
Convergence time (s)   16.85218    16.624503   X           X

Test 4 - initial translation error: 0.316327 (m), initial rotation error: 0.12695 (rad)
Translation FSE (m)    0.002007    X           0.003296    0.004129
Rotation FSE (rad)     0.019513    X           0.01839     0.016704
Convergence time (s)   8.834689    X           6.744938    7.37105
While not the fastest method, the standard ICP-PBVS strategy has been selected for all subsequent experimental evaluations, in view of its consistent convergence performance.
4.4 Complex Cases
4.4.1 Moving Vine Branches
As aforementioned, the interest of the approach lies
in its ability to handle unexpected variations that pro-
duce a discrepancy w.r.t. prior knowledge. So, exper-
iments have been carried out to study how ICP-PBVS
can cope with vine vibrations and unexpected movements (due for instance to collisions between the robot and the vine). To this end, after reaching convergence, the vine is abruptly moved. Nevertheless, the
camera is successfully driven to the pose where the
current and reference point clouds are aligned. This
can be seen in the error curves shown in Figure 6: sud-
den error increases (induced by sudden vine move-
ments) are followed by an exponential convergence
decay to zero. A more detailed analysis of this ex-
perimental scenario can be found in the companion
video.¹
4.4.2 Dense Vine Branches
After initial tests on real vines, experiments were ex-
tended to a more intricate environment characterized
by densely packed branches, as illustrated in Figure 7.
¹ Video link
Three tests were conducted with two setups, consid-
ering the same reference point cloud (i.e., the same
final pose), but with different initial conditions. The vine in test 2 (Figures 7b, 7d) presents an increased complexity due to the presence of numerous branches and
residual dried grapes, thereby intensifying perception
challenges. Table 2 displays the results, including
errors and convergence time, obtained from the con-
ducted tests. Moreover, Figure 8 presents the 3D tra-
jectories of the three tests in each campaign, high-
lighting the successful convergence of ICP-PBVS in
all tests from various starting positions towards the
reference point. The findings reveal acceptable errors,
notably in translation, albeit with slightly larger rota-
tion errors, particularly evident in the second dense
setup. Moreover, the observed prolonged conver-
gence time may be attributed to the fact that the con-
troller’s gain was reduced to handle the higher com-
putational demand arising from larger point clouds as-
sociated with densely branched vines. These findings
highlight the effectiveness of ICP-PBVS in challeng-
ing vine environments and its potential for pruning tasks, while providing insight on possible improvements for fast, real-world outdoor agricultural tasks.

Figure 6: Evolution of translation and rotation errors when unexpected vine movement is introduced. (a) Translation error. (b) Rotation error.
Figure 7: Overview of the dense-branch tests, with the initial and reference images and their point clouds. (a) Test 1: RGB image at the initial pose. (b) Test 2: RGB image at the initial pose. (c) Test 1: reference and initial point clouds. (d) Test 2: reference and initial point clouds.
Table 2: First and second experiment campaign results, involving a dense branch structure (one column per test).

Test with dense vine 1
Init. error (m, rad)   (0.193, 0.35)   (0.167, 0.566)   (0.034, 0.55)
Trans. FSE (m)         0.002914        0.001775         0.003168
Rot. FSE (rad)         0.00734         0.00563          0.045401
Conv. time (s)         73.682          59.895           42.364

Test with dense vine 2
Init. error (m, rad)   (0.185, 0.22)   (0.20, 0.168)    (0.037, 0.47)
Trans. FSE (m)         0.001577        0.001477         0.003454
Rot. FSE (rad)         0.011115        0.011791         0.015248
Conv. time (s)         99.234          89.485           125.401
Figure 8: 3D trajectories with dense branches. (a) First test campaign. (b) Second test campaign.
4.4.3 Sim-to-Real Reference Point Cloud
As previously stated, in real-world applications the
reference point cloud for each pruning pose must be
derived from a 3D model of the vine. Thus we built
a fine 3D model of the vine before the first pruning
process, using the same camera. Subsequently, this
3D model was included in a simulated environment
enabling the virtual selection of any pruning pose, so
that all the reference point clouds could be recovered. These data were then transferred to the real-
world servoing process, to apply the same visual feed-
back control as above. Figure 9 delineates the com-
plete framework. The simulation was done in Gazebo.
The pre-built 3D vine model was integrated as a mesh.
The pruning pose was defined at a reference pose 15
cm from the cutting point, where the simulated Re-
alsense D405 camera was positioned. In Figure 9, it
can be seen how the synthetic reference point cloud
is given to the online controller, and how the robot
moves to align them using ICP-PBVS, resulting in the
convergence to the cutting pose, as in the previous ex-
periments.
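One simple way to emulate this reference-cloud extraction offline is sketched below (a rough approximation with Open3D, ignoring occlusions; the actual pipeline uses a simulated Realsense D405 in Gazebo, and the file names and camera pose here are placeholders): the pre-built mesh is sampled, expressed in the virtual camera frame, and cropped to the points the sensor could plausibly observe.

```python
import numpy as np
import open3d as o3d

# Placeholder inputs: the 3D vine model built before pruning, and a
# virtual camera pose 15 cm in front of the selected cutting point.
mesh = o3d.io.read_triangle_mesh("vine_model.ply")
cloud = mesh.sample_points_uniformly(number_of_points=50000)

H_world_cam = np.eye(4)          # hypothetical virtual camera pose
H_world_cam[:3, 3] = [0.0, 0.0, 0.15]

# Express the sampled points in the virtual camera frame ...
cloud.transform(np.linalg.inv(H_world_cam))
# ... and keep points in front of the camera and within range
# (a simplification: self-occlusions are not handled here).
z = np.asarray(cloud.points)[:, 2]
reference = cloud.select_by_index(np.where((z > 0.0) & (z < 0.6))[0])
o3d.io.write_point_cloud("reference_cloud.pcd", reference)
```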
5 CONCLUSIONS
This work presented a working real-time ICP based
PBVS (ICP-PBVS) for vine pruning, along with its
analysis for several ICP variants. Contrary to the pro-
prioception based execution of a planned trajectory,
visual servoing can adjust the cutting tool position-
ing in real time when facing unexpected environment
changes. As the 3D trajectory followed by the prun-
ing tool is of interest in the considered agriculture
context, PBVS is preferred over IBVS. The proposed
incorporation of a relevant initial guess reduces the
ICP computation time. A side effect is the increase of the feedback control rate, which in turn positively impacts the closed-loop system stability thanks to a shorter zero-order hold of the control signal.
Experiments were conducted, combining four ICP
variants with PBVS. Standard ICP showed the best
performance in tests on simple vines, with further suc-
cessful experiments on moving and complex vines.
On the whole, ICP-PBVS adapts well to disturbances, making it suitable for pruning tasks and broader agricultural applications. However, the processing of big
point clouds and the computational cost may remain
an issue. Finally, an approach to generate the ref-
erence point cloud through simulation at the desired
pose was presented and evaluated.
Future work will focus on alternative alignment
methods to reduce computing time, improved control
laws keeping the point cloud in the camera view, and
tests in real-world outdoor conditions with more com-
plex factors like lighting and wind.
Figure 9: Sim-to-real reference point cloud extraction scheme.
ACKNOWLEDGEMENTS
This work was supported by the “Défi Clé Robotique centrée sur l'humain” funded by La Région Occitanie, France.
REFERENCES
Besl, P. and McKay, N. (1992). Method for registration of 3-D shapes. In Sensor Fusion IV: Control Paradigms and Data Structures, volume 1611. SPIE.
Botterill, T., Paulin, S., Green, R., Williams, S., Lin, J.,
Saxton, V., Mills, S., Chen, X., and Corbett-Davies,
S. (2017). A robot system for pruning grape vines.
Journal of Field Robotics, 34(6).
Chaumette, F. and Hutchinson, S. (2006). Visual servo con-
trol. I. basic approaches. IEEE Robotics & Automation
Magazine, 13(4):82–90.
Chen, Y. and Medioni, G. (1992). Object modelling by reg-
istration of multiple range images. Image and Vision
Computing, 10(3).
Choi, S., Zhou, Q.-Y., and Koltun, V. (2015). Robust recon-
struction of indoor scenes. In Proceedings of the IEEE
conference on computer vision and pattern recogni-
tion, pages 5556–5565.
Fitzgibbon, A. (2003). Robust registration of 2D and 3D
point sets. Image and Vision Computing, 21(13-14).
Gursoy, E., Navarro, B., Cosgun, A., Kulić, D., and Cherubini, A. (2023). Towards vision-based dual arm robotic fruit harvesting. In 2023 IEEE 19th International Conference on Automation Science and Engineering (CASE), pages 1–6.
Holz, D., Ichim, A., Tombari, F., Rusu, R., and Behnke,
S. (2015). Registration with the point cloud library:
A modular framework for aligning in 3-D. IEEE
Robotics & Automation Magazine, 22(4).
Katyara, S., Ficuciello, F., Caldwell, D., Chen, F., and Siciliano, B. (2021). Reproducible pruning system on dynamic natural plants for field agricultural robots. In Saveriano, M., Renaudo, E., Rodríguez-Sánchez, A., and Piater, J., editors, Int. Workshop on Human-Friendly Robotics 2020. Springer.
Li, T., Yu, J., Qiu, Q., and Zhao, C. (2022). Hybrid un-
calibrated visual servoing control of harvesting robots
with RGB-D cameras. IEEE Trans. on Industrial
Electronics.
Marchand, E., Spindler, F., and Chaumette, F. (2005). ViSP
for visual servoing: A generic software platform with
a wide class of robot control skills. IEEE Robotics &
Automation Magazine, 12(4).
Mehta, S., MacKunis, W., and Burks, T. (2014). Nonlinear
robust visual servo control for robotic citrus harvest-
ing. IFAC Proceedings Volumes, 47(3):8110–8115.
Rusinkiewicz, S. (2019). A symmetric objective function
for ICP. ACM Trans. on Graphics, 38(4).
Silwal, A., Yandun, F., Nellithimaru, A., Bates, T., and Kan-
tor, G. (2022). Bumblebee: A path towards fully au-
tonomous robotic vine pruning. Field Robotics, (2).
Yandun, F., Parhar, T., Silwal, A., Clifford, D., Yuan, Z.,
Levine, G., Yaroshenko, S., and Kantor, G. (2021).
Reaching pruning locations in a vine using a deep re-
inforcement learning policy. In IEEE Int. Conf. on
Robotics and Automation (ICRA’2021), Xi’an, China.
You, A., Kolano, H., Parayil, N., Grimm, C., and David-
son, J. (2022). Precision fruit tree pruning using a
learned hybrid vision/interaction controller. In IEEE
Int. Conf. on Robotics and Automation (ICRA’2022),
Philadelphia, PA.
You, A., Sukkar, F., Fitch, R., Karkee, M., and Davidson,
J. R. (2020). An efficient planning and control frame-
work for pruning fruit trees. In 2020 IEEE interna-
tional conference on robotics and automation (ICRA),
pages 3930–3936.
Zahid, A., Mahmud, M. S., He, L., Heinemann, P., Choi, D.,
and Schupp, J. (2021). Technological advancements
towards developing a robotic pruner for apple trees:
A review. Computers and Electronics in Agriculture,
189:106383.
Zhang, S., Gong, Z., Tao, B., and Ding, H. (2020). A
visual servoing method based on point cloud. In
IEEE Int. Conf. on Real-time Computing and Robotics
(RCAR’2020).