Moving Horizon Planning and Control for Autonomous Vehicles with
Active Exploration and Fallback Strategies
Mohamed Soliman (1) and Rolf Findeisen (2) (https://orcid.org/0000-0002-9112-5946)
(1) Laboratory for System Theory and Automatic Control, Otto-von-Guericke Universität Magdeburg, Germany
(2) Control and Cyber-physical Systems Laboratory, TU Darmstadt, Darmstadt, Germany
Keywords:
Active Exploration, Dual Control, Model Predictive Control, Obstacle Avoidance, Autonomous Vehicles.
Abstract:
Navigating autonomous vehicles within a partially known environment to achieve a specific goal is an impor-
tant yet challenging problem. It necessitates ensuring the safety of the vehicle along its trajectory, accounting
for potentially unknown obstacles while maintaining the vehicle’s ability to navigate the path at all times. Con-
ventionally, a safe path is devised based on the available offline information. This does not exploit additional
environmental information that can be obtained during movement. In a hierarchical moving horizon planning
and control framework, we recast the lower-level vehicle control problem as a dual control problem, where
the objective extends beyond merely following a given path, to include active exploration. This exploration
involves acquiring additional information to reduce the uncertainty about obstacles encountered, potentially
improving overall performance. Recognizing that active exploration can incur additional costs or lead the
vehicle into situations where obstacles impede the traveled path, we propose a fallback strategy that involves
returning to a known, possibly suboptimal, path. The approach is illustrated through simulations.
1 INTRODUCTION
The deployment of autonomous vehicles has a broad
spectrum of applications, ranging from self-driving
cars to unmanned aerial vehicles used, for example,
in search and rescue missions (de Alcantara Andrade
et al., 2019; Ibrahim et al., 2019; Nawaz et al., 2019).
To perform their tasks, these systems include path
planning and motion control components, which in
all circumstances need to ensure collision avoidance
and safety (Aggarwal and Kumar, 2020). Path plan-
ning and control are challenging in the presence of
unknown or only partially known environments. Typ-
ically, planning is based on maps and complemented
by onboard sensor data, such as camera systems or LI-
DAR. However, environmental information and maps
are often incomplete due to limited sensing capabili-
ties, such as range and field of view. This results in the
challenge of control and planning under conditions of
environmental uncertainty.
Missing environmental information often leads to conservative behavior to ensure obstacle avoidance under all conditions. This raises the question of whether planning and control can be information/perception-aware to improve performance. In perception-aware planning, also denoted as information-aware re-planning, the objective is to derive a trajectory that facilitates exploration of the environment to acquire additional information regarding detected obstacles (Popović et al., 2024). Doing so potentially improves performance, enabling a more efficient trajectory. (Palazzolo and Stachniss,
2018) introduced an online exploration-aware algo-
rithm that determines the next point to visit, where
the expected information about the uncharted region
is maximized. The work (Julian et al., 2014) strives
to optimize a mutual information reward function to
motivate the robot to explore new areas. Based on
an information-theoretic framework, Folsom et al. (2021) employ rapidly exploring random tree algo-
rithms for the Mars helicopter to collect information
about the surface. The autonomous exploration of a
mobile robot within an environment, delineated by
an occupancy grid map, is described in (Wang et al.,
2019).
From a control point of view, the challenge of ac-
tive exploration is closely related to the concept of
dual control (Mesbah, 2018; Feldbaum, 1960). Dual
control considers that control actions impact infor-
mation about the state of the system and vice versa.
Soliman, M. and Findeisen, R.
Moving Horizon Planning and Control for Autonomous Vehicles with Active Exploration and Fallback Strategies.
DOI: 10.5220/0012996200003822
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 21st International Conference on Informatics in Control, Automation and Robotics (ICINCO 2024) - Volume 1, pages 359-367
ISBN: 978-989-758-717-7; ISSN: 2184-2809
Proceedings Copyright © 2024 by SCITEPRESS Science and Technology Publications, Lda.
Figure 1: Top: the active exploration approach is activated, which may lead to a shorter path from A to B. The vehicle starts along an offline planned safe path (solid blue line) that treats the unseen area, depicted in grey, as unsafe, leading to a long but safe path to reach the goal B. Moving along the safe path, the vehicle acquires additional environmental information in the sensor's field of view Z_t, depicted as a green section in front of the vehicle. Active exploration is activated by augmenting the control layer, not the planning layer, with an exploration term that allows departing from the preplanned path to explore and possibly find a shorter path to the goal, depicted by the solid green line. Bottom: the case where the fallback strategy is activated, as the exploration steers the vehicle into a situation where it is obstructed by obstacles.
Achieving a balance between enhancing system infor-
mation/environmental information and fulfilling the
overarching control objective becomes important. Al-
though exploration can potentially contribute to im-
proving the overall objective, for example, if the au-
tonomous vehicle reaches its destination in a shorter
time, it is essential to recognize the potential increase
in costs by prolonged exploration. In addition, there
exists the risk that obstacles obstruct the movement
of the autonomous vehicle. Therefore, it is crucial
to integrate a fallback strategy designed to monitor
exploration cost and environmental information gain
and serve as an emergency mechanism, directing the
autonomous vehicle along a safe path once the explo-
ration becomes expensive or if obstacles obstruct the
vehicle. A series of works have tackled the task of
active exploration and control, and fallback options.
In (Xue et al., 2018), an approach is introduced to
address sensor failure, i.e. loss of environmental in-
formation, for unmanned vehicles, ensuring vehicle
safety. (Genin et al., 2023) implemented a defen-
sive fallback controller to improve the overall safety
of autonomous vehicles in cases where the primary
controller misjudges the potential risk of pedestrian
collisions. In (Sinha et al., 2023), a fallback safety
controller is introduced to protect autonomous vehi-
cles during a perception model failure, thus improv-
ing overall vehicle safety. (Soliman et al., 2022) intro-
duced an active planning and control framework for-
mulated as a dual optimal control problem to improve
overall planning and control.
We consider autonomous ground vehicles equipped with onboard sensors with a limited sensor field of view, denoted Z_t; that is, the system has limited sensing capabilities; see Figure 1.
We assume that an offline planned path is avail-
able based on the beforehand available environmen-
tal information. We outline a hierarchical moving
horizon planning and control strategy, where the ex-
ploration is performed in the low-level vehicle con-
troller, not the planning level. Although not perform-
ing a global perception-aware planning might limit
the achievable performance, doing so simplifies com-
putations and allows us to consider the information
gain during control from a dual-control perspective.
We propose that the autonomous vehicle follows
the preplanned path passively (see Figure 1) without
replanning or exploring until the sensors reach an un-
known region or detect an obstacle. Upon detection
of an obstacle, the lower-level controller explores for a possibly shorter way to reach the goal, while
reducing the uncertainty about the obstacle. This is
achieved by integrating an exploration objective into
the moving horizon optimization task for control. If
exploration becomes too expensive or the vehicle be-
comes stuck, a fallback mechanism is used, return-
ing the autonomous vehicle to the safe path to reach
the final destination; see Figure 1. The remainder of
the paper is structured as follows. In Section 2, we
present the problem formulation. Section 3 outlines
the hierarchical planning and control scheme, includ-
ing the lower-level exploration controller. Section 4
introduces the fallback strategy. The effectiveness of our approach is demonstrated in Section 5, before
summarizing the findings in Section 6.
2 PROBLEM FORMULATION
We consider an autonomous vehicle equipped with
onboard sensors that moves in a partially known envi-
ronment with static obstacles; see Figure 1. The pos-
sibly nonlinear vehicle dynamics are given in discrete
time:

\[
x_{k+1} = f(x_k, u_k), \tag{1}
\]

where $x \in \mathbb{R}^{n_x}$ and $u \in \mathbb{R}^{n_u}$ represent the vehicle states and control inputs, and $f : \mathbb{R}^{n_x} \times \mathbb{R}^{n_u} \to \mathbb{R}^{n_x}$. The state vector $x_k = [p_k^T, \cdots]^T$ contains the center-of-mass coordinates $p_k \in \mathbb{R}^{n}$, orientation, velocities, pitch, etc. The autonomous vehicle should move from an initial position A to a final position B. We assume that a safe path from A to B, based on the offline available environment information and avoiding unknown regions, is available; see Figure 1.
2.1 Planning in the Sensor Field of View
We assume that the autonomous vehicle is equipped
with an onboard sensor that has a limited field of view, Z_t; see Figure 1. For simplicity, we assume that
this field of view is shaped as an ellipsoidal segment,
which is common for many sensor systems such as
LIDAR, radar, and cameras. At each sampling time,
t, the sensors onboard capture new information which
is used for control or replanning of the path. Often,
a hierarchical control and planning scheme is used,
intertwining the planning and control problem. We
propose that, starting at point A, the preplanned safe
path based on the offline map data is used by a model
predictive path following controller (Matschek et al.,
2019) to ensure the vehicle follows the path until the
sensor's field of view, Z_t, encounters unknown regions or an unknown obstacle. At that point, a low-
level exploratory controller is activated to explore the
unknown region, potentially finding a more optimal
path while considering the sensor field of view and
the available map information, see Figure 2. Note
that in principle, ‘global’ path planning could be per-
formed whenever new environmental information is
encountered to find a new optimal path. However, this
approach is computationally expensive and often can-
not be performed on the vehicle itself due to hardware
limitations. Therefore, we propose online or sensor-
based path planning when new information becomes
available, with the aim of locally planning a safe path
within the sensor's limited field of view. Upon detecting an obstacle O_i, where i ∈ {1, ..., N_o} and N_o is the number of obstacles, the local planner devises a safe path within the field of view.
Furthermore, note that we do not focus on uncertainties in the system dynamics or external distur-
bances acting on the vehicle. To ensure safety in
these cases, one could perform an additional con-
straint backoff, that is, add a safety region around the
obstacle (Soliman et al., 2022) and/or use tube-based
predictive control techniques.
Performing path planning over the field of view
is still generally computationally challenging for au-
tonomous systems. Thus, we propose to use a hi-
erarchical approach, where the low-level controller
"hides" the system's nonlinearities and uncertainties
and performs the exploration, while the high-level
planner uses a linear system model and is formulated
as a mixed-integer optimization problem looking over
a prediction horizon N_p covering the field of view (Soliman et al., 2022). The mixed-integer formulation using the linear system model is formulated as:

\[
\begin{aligned}
\min_{u^p,\, d} \quad & \sum_{k=t}^{t+N_p} \left\| p^p_k - p^p_B \right\| + \left\| u^p_k \right\| && \text{(2a)} \\
\text{s.t.} \quad & x^p_{k+1} = A x^p_k + B u^p_k, \quad x^p_t = \hat{x}(t), && \text{(2b)} \\
& E_{i,t}\, p^p_k \le e_{i,t} + M_{\mathrm{big}} (1 - d_{i,k}), && \text{(2c)} \\
& Z_t\, p^p_k \le z_t, && \text{(2d)} \\
& A_{\mathrm{in}}\, x^p_k \le b_{\mathrm{in}}, && \text{(2e)} \\
& \sum_{k=1}^{N_e} d_{i,k} = 1. && \text{(2f)}
\end{aligned}
\]
Here, x^p_k denotes the predicted states of the linear model. x̂(t) is the measured system state at time t
Figure 2: The global/offline planner plans a safe path based on the available environment information. The vehicle follows the preplanned path until an unexplored area is detected within the onboard sensor's field of view. The online/sensor-based planner then plans a safe path within the current field of view, and the dual explorative controller is activated to explore the detected region/obstacle. When the vehicle is obstructed by other obstacles or the exploration cost is exceeded, the proposed fallback controller is activated to return the vehicle to the preplanned path until it reaches the goal.
and u^p_k are the system inputs. The safety of the vehicle is guaranteed by the obstacle avoidance constraints (2c) with time-varying matrices E of appropriate dimensions, using the so-called big-M formulation; see, e.g., (Williams, 2013). e_{i,t} represents the detected obstacle i at time t. The predicted trajectory must lie within the current field of view, enforced by (2d). In (2e), system constraints are considered, and the binary variable constraint is given in (2f). Solving (2) leads to an optimal reference x*_t = [x^p_t^T, ..., x^p_{t+N_p}^T]^T and a sequence of binary variables d*, related to the active obstacle constraints. Both are exploited in the low-level exploratory controller. To incentivize the autonomous vehicle to explore the environment/obstacle, the detected obstacle is expressed by a virtual linear system subject to uncertainty. The initial condition of the virtual system, ŵ_t, is sent from the planner layer to the exploratory controller. To obtain this information, first, all intersection points of Z_t and the detected obstacle G_{i,t} are determined; then the intersection point closest to the target B is selected; see Figure 3.
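The big-M obstacle avoidance constraints (2c) and (2f) can be illustrated with a small, self-contained sketch. The function names, the box-shaped obstacle, and the value of the big-M constant below are illustrative assumptions, not part of the paper's formulation; a MILP solver performs the same search over the binaries jointly across the horizon.

```python
import itertools

M_BIG = 1e4  # illustrative big-M constant (assumption)

def bigM_feasible(p, box, d):
    """Check big-M constraints in the spirit of (2c)/(2f) for one box obstacle.
    Each edge e contributes E_e . p <= e_e + M_BIG * (1 - d_e); the binary d_e
    'activates' the edge, and at least one edge must be active."""
    px, py = p
    xmin, ymin, xmax, ymax = box
    # Rows (E_e, e_e): E_e . p <= e_e keeps p on the OUTER side of edge e.
    E = [((-1.0, 0.0), -xmax),  # -px <= -xmax  <=>  px >= xmax (right of box)
         (( 1.0, 0.0),  xmin),  #  px <=  xmin               (left of box)
         (( 0.0,-1.0), -ymax),  # -py <= -ymax  <=>  py >= ymax (above box)
         (( 0.0, 1.0),  ymin)]  #  py <=  ymin               (below box)
    if sum(d) < 1:              # at least one edge active, cf. (2f)
        return False
    return all(a[0] * px + a[1] * py <= b + M_BIG * (1 - de)
               for (a, b), de in zip(E, d))

def collision_free(p, box):
    """p avoids the obstacle iff SOME binary assignment is feasible."""
    return any(bigM_feasible(p, box, d)
               for d in itertools.product((0, 1), repeat=4))
```

For a unit box, a point beside the obstacle admits a feasible binary assignment, while a point inside the box admits none, which is exactly the disjunction "outside at least one edge" that the big-M encoding expresses.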
3 EXPLORATIVE LOW-LEVEL
DUAL CONTROLLER
Encouraging the autonomous vehicle to collect infor-
mation proactively about obstacles and explore its en-
vironment can improve control performance and over-
all objectives, such as reducing travel time to its des-
tination (Soliman et al., 2022). Therefore, the low-
level controller is tasked with incentivizing the au-
tonomous vehicle to explore the environment in the
safe field of view. To do so, a heuristic function is in-
corporated into the cost that encourages the controller
to increase the information content or reduce the un-
certainty, i.e., gain more information about the part of
the obstacle behind the field of view. Specifically, the controller uses the reference trajectory x*_t, as well as a time-varying convex set L_t that describes a convex obstacle-free region in the field of view; see Figure 3. If the vehicle model is perfect and the measurement uncertainty is negligible, the low-level controller can move freely within the set L_t, as this avoids collisions. A possible safe convex set L_t can be obtained from the
Figure 3: Two examples of the field of view and a convex subset of the safe, obstacle-free convex set L_t. (a) shows the resulting set L_t for a single obstacle in the case of a field of view resulting from a camera sensor. (b) shows the resulting set L_t for a single obstacle in the case of a field of view resulting from a LIDAR sensor. Upon detection of obstacles, the intersection points between the field of view and the known obstacle G_t are determined, and the point near the destination point B, denoted w^1_t or w^2_t, is sent to the controller. The safe convex set L_t (in dark red) is the intersection of the current field of view with the constraints activated by the binary variables d* sent by the planner.
solution of (2) as:

\[
L_t := \left\{ p \in \mathbb{R}^3 \;\middle|\;
\begin{aligned}
& E_{i,t}\, p \le e_{i,t} + M_{\mathrm{big}} (1 - d^*_{i,t}), \\
& Z_t\, p \le z_t, \\
& \forall i \in [1, \ldots, N_o]
\end{aligned}
\right\}. \tag{3}
\]
Since d*_{i,k} can change over time k ∈ {t, ..., t+N_p}, there might exist several obstacle-free sets. Since the current position of the vehicle lies within the convex set, we take the convex set found at time t.
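As a minimal illustration of working with a polyhedral set like L_t in (3), the set can be stored as a list of half-spaces (a, b) meaning a·p ≤ b, collecting the field-of-view constraints and the obstacle edges activated by the planner's binaries. The concrete half-spaces and names below are assumptions for illustration only:

```python
def in_safe_set(p, halfspaces):
    """Membership test for a polyhedral set such as L_t in (3):
    p is inside iff every half-space constraint a . p <= b holds."""
    return all(sum(ai * pi for ai, pi in zip(a, p)) <= b
               for a, b in halfspaces)

# Example (assumed geometry): a triangular field of view cut by one
# activated obstacle edge (d* = 1 for that edge).
L_t = [((-1.0, 0.0), 0.0),   # p_x >= 0
       ((0.0, -1.0), 0.0),   # p_y >= 0
       ((1.0, 1.0), 2.0),    # p_x + p_y <= 2 (far edge of the field of view)
       ((1.0, 0.0), 1.5)]    # p_x <= 1.5 (activated obstacle edge)
```

The low-level controller of Section 3 only needs such membership constraints, keeping the nonconvex obstacle avoidance problem in the planner and a convex feasible region in the controller.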
We are interested in improving the information, i.e., reducing the uncertainty, of the obstacle edge w_t closest to the destination point B (see Figure 3), as this represents the point where the uncertainty of the unseen obstacles "starts". We will exploit this information in the low-level controller during the exploration task to decrease the uncertainty and gain more information about the obstacle. In our scenario, unseen parts of the obstacles are characterized by uncertainty, represented by the edges of the detected obstacle w_t, which follow a virtual linear system influenced by Gaussian noise such that:
\[
w_{k+1} = A_w w_k + B_w \nu_k, \quad \nu_k \sim \mathcal{N}(0, Q_k). \tag{4}
\]
Here, w_{k+1} ∈ R^3 is the predicted obstacle edge with initial condition w_t = ŵ_t received from the high-level planner. The uncertainty associated with the edge of the obstacle is characterized by the covariance matrix Q_k. It is inherently related to the vehicle dynamics and thus influences the sensor information. The propagation of ŵ is calculated using its mean and variance as:
\[
\begin{aligned}
\mu_k &= \hat{w}_t, && \text{(5a)} \\
\sigma_{k+1} &= g(\sigma_k, Q(x,u)). && \text{(5b)}
\end{aligned}
\]
The function g in (5b) serves as a general estimator, capturing the influence of the predicted control signal on the propagation of uncertainty. It is important to note that we assume that the mean value of the obstacle dynamics remains constant at ŵ_t, while its uncertainty fluctuates with changes in the state of the system. The evolution of the autonomous vehicle states directly impacts the evolution of the environmental uncertainty, denoted Q(·). The propagation function of the environmental uncertainty can be expressed by angle- or distance-based uncertainty; details can be found in (Soliman et al., 2022). Through
the inclusion of an excitation term in the objective function, control signals facilitate not only control actions but also probing actions. Therefore, the overall dual optimal control problem in the low-level controller can be expressed as follows:

\[
J(x_k, u_k, x^s, \sigma_k) := \sum_{k=t}^{t+N_p} \Big( W_1 F_1(x^c_k, u^c_k, x^{p,*}_k) + W_2 F_2(\sigma_k) \Big) + W_3 F_3(x^c_{t+N_p}, u^c_{t+N_p}). \tag{6}
\]
Here, F_1 penalizes the deviation of the states from the trajectory x*_t given by the planner, F_2 is an exploration function that can be expressed as the trace of the covariance matrix (tr(σ_{k+1})), and F_3 is a terminal penalty function. Furthermore, the weighting matrices W_1 and W_2 represent the trade-off between the control task objective and the exploration/uncertainty-learning objective, while W_3 is the terminal penalty weighting matrix.
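To make the structure of (6) concrete, the following sketch evaluates a scalar version of the dual cost. The quadratic stage costs, scalar states, input penalty, and weight values are illustrative assumptions, not the paper's specific choices:

```python
def dual_cost(xs, us, ref, sigmas, W1=1.0, W2=5.0, W3=1.0):
    """Scalar sketch of the dual objective (6): W1*F1 tracks the planner
    reference, W2*F2 penalizes the remaining uncertainty (for a scalar,
    tr(sigma) is just sigma), and W3*F3 is a terminal penalty."""
    stage = sum(W1 * ((x - r) ** 2 + 0.01 * u ** 2) + W2 * s
                for x, r, u, s in zip(xs, ref, us, sigmas))
    terminal = W3 * (xs[-1] - ref[-1]) ** 2
    return stage + terminal
```

With a large W2, an input sequence that detours from the reference but shrinks the predicted variances can achieve a lower total cost than pure tracking, which is precisely the probing incentive of the dual formulation.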
Notably, the hierarchical controller optimally guides the autonomous vehicle, ensuring vehicle safety, while exploration is carried out within the safe convex set L_t defined by the high-level planner. The resulting exploratory low-level controller is formulated on a moving horizon:
\[
\begin{aligned}
\min_{u^c} \quad & J(x_k, u_k, x^s, \sigma_k) && \text{(7a)} \\
\text{s.t.} \quad & x^c_{k+1} = f(x^c_k, u^c_k), \quad x^c_t = \hat{x}(t), && \text{(7b)} \\
& w_{k+1} \sim \mathcal{N}(\mu_k, \sigma_k), && \text{(7c)} \\
& \sigma_{k+1} = g(\sigma_k, Q_k), \quad \sigma_t \ \text{given}, && \text{(7d)} \\
& Q_k = e(\psi_k, \theta_t), && \text{(7e)} \\
& \mu_k = \hat{w}_t, && \text{(7f)} \\
& x^c_{k+1} \in L_t, \quad u^c_k \in U. && \text{(7g)}
\end{aligned}
\]
Here, u^c = {u^c_t, ..., u^c_{t+N_p}} is the sequence of control actions. Only the first piece of the optimal control sequence is applied to the system, and the optimization is repeated. Equation (7b) represents the dynamics of the vehicle used in the low-level control layer using a nonlinear function f : R^{n_x} × R^{n_u} → R^{n_x}. In particular, the predicted control trajectory affects the propagation of uncertainty through the constraints (7d) and (7e), where ψ represents the heading angle of the autonomous vehicle, while θ_t represents the angle of the detected obstacle edge with respect to the fixed ground frame. In (7g), the system states lie in the safe convex set L_t obtained from the planning. Furthermore, the control signals u^c_k are limited to the set U, which can be chosen as a trade-off between the allowed level of aggressiveness and the smoothness of the trajectory (Berntorp et al., 2018).
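One simple, hedged choice for g and e in (7d) and (7e) is angle-based: the edge variance shrinks when the vehicle's heading ψ points toward the obstacle edge (bearing θ) and grows otherwise. The field-of-view half-angle and the gain values below are assumptions for illustration, not the paper's calibration:

```python
import math

FOV_HALF_ANGLE = math.pi / 6  # assumed 60-degree field of view

def Q_angle(psi, theta, grow=0.02):
    """Sketch of (7e): process noise injected only while the edge is
    NOT inside the sensor's field of view."""
    return 0.0 if abs(psi - theta) < FOV_HALF_ANGLE else grow

def propagate_sigma(sigma, psi, theta, shrink=0.5):
    """Sketch of (7d): a measurement update shrinks the edge variance when
    the heading observes the edge; otherwise Q_angle inflates it."""
    if abs(psi - theta) < FOV_HALF_ANGLE:
        return sigma * (1.0 - shrink)
    return sigma + Q_angle(psi, theta)
```

Under this model, a predicted heading sequence that sweeps the sensor across the obstacle edge lowers the trace term in (6), which is how the control inputs couple to the information gain.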
Frequently, exploring increases the overall cost,
mainly because the autonomous vehicle deviates from
the optimal path. Therefore, there is a trade-off be-
tween strictly adhering to the safe trajectory provided
by the high-level planner and engaging in exploration.
In scenarios where sensing capabilities are lim-
ited, e.g., a limited field of view and/or range, sensors
may only provide partial information about obstacles.
In such instances, the overall cost may become high in comparison to that of the available offline planned path.
Furthermore, the new path could be obstructed during
exploration execution; see Figure 1. Therefore, we
propose a fallback strategy that leads the vehicle back
to the safe path to ensure the completion of the task,
as outlined in the next section.
4 FALLBACK STRATEGY
If the cost for exploration becomes too high, or the
system reaches a locked position, the fallback mode
is activated; see Figure 2.
As a fallback strategy, we propose using an MPC trajectory tracking formulation that takes advantage of the offline planned path and/or the safely explored route. To do so, the vehicle should be able to follow the previously traversed path. The controller is formulated as follows:
\[
\begin{aligned}
\min_{u} \quad & J_t(r_t(\cdot), u_t(k), x(k), u(k)) && \text{(8a)} \\
\text{s.t.} \quad & x^c_{k+1} = f(x^c_k, u^c_k), \quad x^c_t = \hat{x}(t), && \text{(8b)} \\
& y_k = r_t(k) - h(x_{k+1}), && \text{(8c)} \\
& \tilde{u}_k = u_t(k) - u_k, && \text{(8d)} \\
& x_{k+1} \in X, \quad u_k \in U, \quad h(x_{k+1}) \in Y. && \text{(8e)}
\end{aligned}
\]
Here, the path to follow enters via the output and input
error dynamics (8c) and (8d). Furthermore, the reference r_t(k) depends on time; that is, a new reference is available to the controller at each time step. This
implies that the controller should drive the system to be in a specific state at specific times while respecting the state and control input constraints in (8e) and the nonlinear vehicle dynamics expressed in (8b) (Matschek et al., 2019). The reference trajectory can be designed prior to the execution of the autonomous vehicle motion and/or include additional waypoints explored by the vehicle during the exploration process facilitated by the dual controller. Note that the preplanned path or trajectory incorporates both the offline path and the path explored in the exploration phase of the low-level controller.
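The time-indexed tracking in (8) can be sketched on a toy system. The following brute-force shooting controller for a 1-D double integrator is a hedged illustration only: the discrete acceleration set, horizon, and weights are assumptions, standing in for the nonlinear MPC solver the paper uses:

```python
import itertools

def track_mpc(x0, v0, ref, horizon=4, dt=0.1, accels=(-1.0, 0.0, 1.0)):
    """Sketch of the fallback tracking MPC (8): enumerate short input
    sequences for a 1-D double integrator, penalize the time-indexed
    reference error r_t(k) - x_k plus a small input cost, and return
    only the first input (receding horizon)."""
    best_u, best_cost = None, float("inf")
    for seq in itertools.product(accels, repeat=horizon):
        x, v, cost = x0, v0, 0.0
        for k, a in enumerate(seq):
            v += a * dt                      # input update
            x += v * dt                      # position update
            cost += (ref[k] - x) ** 2 + 1e-3 * a ** 2
        if cost < best_cost:
            best_cost, best_u = cost, seq[0]
    return best_u
```

Unlike path following, the reference here is a function of time, so the controller accelerates when it lags the schedule and brakes when it runs ahead, which is the behavior needed to rejoin the preplanned trajectory.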
5 CASE STUDY
We investigate a mobile robotic system that navi-
gates a densely populated office environment. The
robot has partial offline map information (see Figure 4), where regions without information are gray-shaded. Based on the map, an offline path planning
algorithm leads to a safe path, incorporating obstacle
data from the known areas, e.g., obstacles’ positions
and geometries, while circumventing the unexplored
gray region. The robot has limited sensing capabili-
ties, e.g., limited field of view and range. The mobile
robot can be mathematically represented by a kine-
Figure 4: Employed planning and control strategy with the low-level exploration controller when an unexplored area is encountered. The exploratory dual-control low-level controller is activated once the field of view enters the uncharted area. While the robot explores the area, it cannot find a shorter path, as the robot is blocked by other obstacles. The fallback strategy is activated, leading the robot back to the safe path to reach the goal.
Figure 5: Resulting velocity and steering angle for the ve-
hicle. Red areas represent the vehicle’s mechanical con-
straints. The explorative low-level dual controller fully
exploits the vehicle’s capacity while respecting the au-
tonomous vehicle’s mechanical limitations.
matic bicycle model as follows (Jazar, 2017):

\[
\begin{aligned}
\dot{p}_x &= v \cos(\psi), && \text{(9a)} \\
\dot{p}_y &= v \sin(\psi), && \text{(9b)} \\
\dot{\psi} &= v \tan(\delta)/L, && \text{(9c)} \\
\dot{v} &= u_1, && \text{(9d)} \\
\dot{\delta} &= u_2. && \text{(9e)}
\end{aligned}
\]
Equations (9a) and (9b) represent the dynamics of the center of mass of the vehicle, while the heading angle dynamics are given by (9c), where L denotes the wheelbase. The control inputs u_1 and u_2 are the acceleration and the steering angle rate, respectively. As shown, the autonomous vehicle follows the preplanned offline path until the onboard sensor detects the presence of the gray region. Subsequently, an online path planning and control approach is adopted. Upon obstacle detection, an active exploration scheme is initiated, in which the vehicle accelerates to its maximum capabilities within the safe convex exploration set provided by the planner. Due to limited sensing capabilities, the autonomous vehicle could be obstructed by other obstacles, or the exploration cost may remain constant, indicating that no new information on the obstacle is acquired; see Fig-
ures 4 and 6. The fallback trajectory tracking con-
troller is then activated, utilizing the safe explored
path and the preplanned path to guide the autonomous
vehicle safely to the goal point while adhering to the
vehicle’s dynamic constraints (see Figure 5).
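For reference, the kinematic bicycle model (9) can be simulated with a simple forward Euler step. The step size, the wheelbase value, and the function name below are illustrative assumptions:

```python
import math

def bicycle_step(state, u, dt=0.05, L=0.5):
    """Forward-Euler integration of the kinematic bicycle model (9).
    state = (px, py, psi, v, delta), u = (u1, u2) = (acceleration,
    steering-angle rate); L is an assumed wheelbase."""
    px, py, psi, v, delta = state
    u1, u2 = u
    px += v * math.cos(psi) * dt            # (9a)
    py += v * math.sin(psi) * dt            # (9b)
    psi += v * math.tan(delta) / L * dt     # (9c)
    v += u1 * dt                            # (9d)
    delta += u2 * dt                        # (9e)
    return (px, py, psi, v, delta)
```

Iterating this step under the input bounds of Figure 5 reproduces the kind of velocity and steering profiles shown there.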
6 CONCLUSIONS
Navigating autonomous vehicles in partially or en-
tirely unknown environments presents a significant
challenge, requiring the controller to ensure the ve-
hicle’s safety while completing the designated task.
Active exploration has been recognized as a method
to enhance performance. We propose utilizing on-
board sensor information within the vehicle’s field of
view to explore regions for which no information is
available offline. To achieve this, we have integrated
active exploration within a hierarchical moving hori-
Figure 6: Employed planning and control strategy with the low-level exploration controller when an unexplored area is encountered. The exploratory dual-control low-level controller is activated once the field of view enters the uncharted area. While the robot explores the area, the exploration cost increases, as the robot cannot find a free path to the goal point. The fallback controller is then activated, leading the robot back to the safe path to reach the goal.
Figure 7: Resulting velocity and steering angle for the ve-
hicle. Red areas represent the vehicle’s mechanical con-
straints. The explorative low-level dual controller fully
exploits the vehicle’s capacity while respecting the au-
tonomous vehicle's mechanical limitations.
zon planning and control framework. For safe opera-
tion, active exploration is executed by the low-level
controller, which can deviate from the preplanned
path upon receiving new information from the on-
board sensors about unknown areas. When obstacles
are encountered, the additional information obtained
through active exploration is used to reduce object un-
certainties. We have also introduced a fallback strat-
egy that activates if exploration becomes prohibitively
expensive or fails. Through simulation, we demon-
strate the application of the proposed fallback con-
troller to a mobile robot navigating a cluttered envi-
ronment, with results highlighting its efficacy.
REFERENCES
Aggarwal, S. and Kumar, N. (2020). Path planning tech-
niques for unmanned aerial vehicles: A review, so-
lutions, and challenges. Computer Communications,
149:270–299.
Berntorp, K., Hoang, T., Quirynen, R., and Di Cairano, S.
(2018). Control architecture design for autonomous
vehicles. In 2018 IEEE Conference on Control Tech-
nology and Applications (CCTA), pages 404–411.
IEEE.
de Alcantara Andrade, F. A., Reinier Hovenburg, A.,
Netto de Lima, L., Dahlin Rodin, C., Johansen,
T. A., Storvold, R., Moraes Correia, C. A., and Bar-
reto Haddad, D. (2019). Autonomous unmanned
aerial vehicles in search and rescue missions using
real-time cooperative model predictive control. Sen-
sors, 19(19):4067.
Feldbaum, A. A. (1960). Dual control theory. I. Avtomatika i Telemekhanika, 21(9):1240–1249.
Folsom, L., Ono, M., Otsu, K., and Park, H. (2021).
Scalable information-theoretic path planning for a
rover-helicopter team in uncertain environments. In-
ternational Journal of Advanced Robotic Systems,
18(2):1729881421999587.
Genin, D., Dietrich, E., Kouskoulas, Y., Schmidt, A., Ko-
bilarov, M., Katyal, K., Sefati, S., Mishra, S., and
Papusha, I. (2023). A safety fallback controller for
improved collision avoidance. In 2023 IEEE Inter-
national Conference on Assured Autonomy (ICAA),
pages 129–136.
Ibrahim, M., Matschek, J., Morabito, B., and Find-
eisen, R. (2019). Hierarchical model predictive con-
trol for autonomous vehicle area coverage. IFAC-
PapersOnLine, 52(12):79–84.
Jazar, R. N. (2017). Vehicle dynamics: theory and applica-
tion. Springer.
Julian, B. J., Karaman, S., and Rus, D. (2014). On mutual
information-based control of range sensing robots for
mapping applications. The International Journal of
Robotics Research, 33(10):1375–1392.
Matschek, J., Bäthge, T., Faulwasser, T., and Findeisen, R. (2019). Nonlinear predictive control for trajectory tracking and path following: An introduction and perspective. In Handbook of Model Predictive Control, pages 169–198. Springer.
Mesbah, A. (2018). Stochastic model predictive control
with active uncertainty learning: A survey on dual
control. Annual Reviews in Control, 45:107–117.
Nawaz, H., Ali, H. M., and Massan, S. (2019). Applications of unmanned aerial vehicles: a review. Tecnol. Glosas Innovación Apl. Pyme. Spec, (2019):85–105.
Palazzolo, E. and Stachniss, C. (2018). Effective explo-
ration for mavs based on the expected information
gain. Drones, 2(1):9.
Popović, M., Ott, J., Rückin, J., and Kochenderfer, M. J. (2024). Learning-based methods for adaptive informative path planning. Robot. Auton. Syst.
Sinha, R., Schmerling, E., and Pavone, M. (2023). Closing
the loop on runtime monitors with fallback-safe MPC.
In 2023 62nd IEEE Conference on Decision and Con-
trol (CDC), pages 6533–6540.
Soliman, M., Morabito, B., and Findeisen, R. (2022). To-
wards safe exploration for autonomous vehicles using
dual model predictive control. IFAC-PapersOnLine,
55(27):387–392.
Wang, C., Chi, W., Sun, Y., and Meng, M. Q.-H. (2019).
Autonomous robotic exploration by incremental road
map construction. IEEE Transactions on Automation
Science and Engineering, 16(4):1720–1731.
Williams, H. P. (2013). Model building in mathematical
programming. John Wiley & Sons.
Xue, W., Yang, B., Kaizuka, T., and Nakano, K. (2018). A
fallback approach for an automated vehicle encoun-
tering sensor failure in monitoring environment. In
2018 IEEE Intelligent Vehicles Symposium (IV), pages
1807–1812.