Proof: The theorem follows as an immediate application of a more general result
developed in [6]. More specifically, in [6], a similar result is shown for cost functionals
of the form
$$J^{x_0,w_0,u} = E_{x_0,w_0}\left[\sum_{t=0}^{\tau^{x_0,w_0,u}(G)-1} g(x(t), v(t), w(t))\right],$$
with g ≥ ε > 0, of which (2) is a special case with g = 1. □
The following procedure for estimating the expected time to violate constraints for
a fixed control law is obtained as an immediate consequence of Theorem 1:
Corollary 1: Given a fixed control law $\bar{u}(x, w)$, suppose there exists a continuous,
non-negative function $\bar{V}(x, w)$ such that
$$\begin{aligned}
L^{\bar{u}} \bar{V}(x, w) + 1 &= 0, && \text{if } (x, w) \in G, \\
\bar{V}(x, w) &= 0, && \text{if } (x, w) \notin G.
\end{aligned} \tag{5}$$
Then $E[\tau^{x_0,w_0,\bar{u}}(G)] = \bar{V}(x_0, w_0)$.
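To make Corollary 1 concrete, the following is a minimal sketch for a finite Markov chain, not the authors' procedure. It assumes that the operator in (5) is the one-step expected change, so that (5) reads V = 1 + E[V(next state)] inside G and V = 0 outside G. The names `expected_exit_time` and `random_walk` are hypothetical, and the pair (x, w) is collapsed into a single state index for brevity.

```python
def expected_exit_time(P, in_G, tol=1e-12, max_sweeps=100_000):
    """Solve V = 1 + P V on G and V = 0 off G by repeated in-place sweeps.

    P[s][t] is the one-step transition probability under the fixed control
    law (assumed given); in_G[s] is True when state s belongs to G.
    """
    n = len(P)
    V = [0.0] * n
    for _ in range(max_sweeps):
        delta = 0.0
        for s in range(n):
            if not in_G[s]:
                continue  # V stays 0 outside G
            new = 1.0 + sum(P[s][t] * V[t] for t in range(n))
            delta = max(delta, abs(new - V[s]))
            V[s] = new
        if delta <= tol:
            break
    return V


def random_walk(N):
    """Toy chain (an assumption): symmetric +/-1 walk on {0, ..., N}, G = {1, ..., N-1}."""
    P = [[0.0] * (N + 1) for _ in range(N + 1)]
    for s in range(1, N):
        P[s][s - 1] = 0.5
        P[s][s + 1] = 0.5
    in_G = [0 < s < N for s in range(N + 1)]
    return P, in_G


if __name__ == "__main__":
    P, in_G = random_walk(4)
    print(expected_exit_time(P, in_G))
```

For this toy walk the expected exit time from state k is known in closed form, k(N − k), which gives a simple check on the computed values.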
We next consider the application of the value iteration approach to (4), assuming,
for simplicity of exposition, that f is continuous in x and that U is compact. The
proofs of the subsequent results are similar to those in [5,6] and are not reproduced here. We define
a sequence of value functions using the following iterative process:
$$\begin{aligned}
V_0 &\equiv 0, \\
V_n(x, w_i) &= \max_{v \in U} \Big\{ \sum_{j \in J} V_{n-1}(f(x, v, w_i), w_j)\, P(w_j \mid w_i, x) + 1 \Big\},
\quad \text{if } (x, w_i) \in G, \quad n > 0.
\end{aligned} \tag{6}$$
The sequence of functions $\{V_n\}$ has the following properties:
Theorem 2: Suppose the assumptions of Theorem 1 hold. Then the sequence of
functions $\{V_n\}$ defined in (6) is monotonically non-decreasing, and
$V_n(x, w_i) \le J^{x,w_i,u^*}$ for all $n$, $x$ and $w_i$. Furthermore, $\{V_n\}$ converges
pointwise to $V^*(x, w_i) = J^{x,w_i,u^*}$, and this convergence is uniform if $J^{x,w_i,u^*}$
is continuous.
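The monotonicity stated in Theorem 2 can be observed on a small finite example. The sketch below applies iteration (6) to a toy problem that is entirely an assumption: integer states, f(x, v, w) = x + v + w, G = {-3, ..., 3}, two disturbance values with a transition matrix independent of x, and three control values. The disturbance always dominates the control, so the exit time is finite.

```python
W = (-2, 2)                      # disturbance values w_i (assumed)
U = (-1, 0, 1)                   # admissible controls (assumed)
P = ((0.7, 0.3), (0.3, 0.7))     # P[i][j] = P(w_j | w_i), independent of x here
X = range(-3, 4)                 # integer states that lie in G


def value_iteration(tol=1e-9, max_iter=10_000):
    """Run iteration (6) and return the list [V_0, V_1, ..., V_n]."""
    V = {(x, i): 0.0 for x in X for i in range(len(W))}   # V_0 = 0 on G
    history = [V]
    for _ in range(max_iter):
        V_new = {}
        for (x, i) in V:
            # Successor states outside G are missing from V and contribute 0.
            best = max(
                sum(P[i][j] * V.get((x + v + W[i], j), 0.0)
                    for j in range(len(W)))
                for v in U
            )
            V_new[(x, i)] = best + 1.0
        history.append(V_new)
        gap = max(abs(V_new[k] - V[k]) for k in V)
        V = V_new
        if gap <= tol:
            break
    return history


if __name__ == "__main__":
    hist = value_iteration()
    print(len(hist) - 1, "iterations; V(0, w_0) =", hist[-1][(0, 0)])
```

Comparing consecutive entries of the returned list pointwise shows each V_n dominating V_{n-1}, in line with the theorem.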
On the computational side, either value iterations or Linear Programming may be
used to numerically approximate the solution to (4).
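For orientation, one standard way to pose such a problem as a linear program is sketched below. It assumes that (4) is the fixed-point form of iteration (6), and it may differ from the formulation the authors have in mind. The unknowns are the values $V(x_k, w_i)$ at grid points in $G$; $F$ denotes a linear interpolant of these values inside $G$ and zero outside $G$, so each constraint is linear in the unknowns.

```latex
\begin{aligned}
\min_{V} \quad & \sum_{k \in K} \sum_{i \in J} V(x_k, w_i) \\
\text{subject to} \quad & V(x_k, w_i) \ge \sum_{j \in J} F(f(x_k, v_m, w_i), w_j)\, P(w_j \mid w_i, x_k) + 1
\quad \text{for all } k \in K,\ i \in J,\ m \in M.
\end{aligned}
```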
The value iterations (6) produce a sequence of value function approximations, $V_n$,
at specified grid points $x \in \{x_k, k \in K\}$, and a stopping criterion is
$|V_n(x, w_i) - V_{n-1}(x, w_i)| \le \epsilon$ for all $x \in \{x_k, k \in K\}$ and $i \in J$,
where $\epsilon > 0$ is sufficiently small. In each iteration, once the values of $V_{n-1}$
at the grid points have been determined, linear or cubic interpolation may be employed to
approximate $V_{n-1}(f(x_k, v_m, w_i), w_j)$ on the right-hand side of (6), where
$v \in \{v_m, m \in M\}$ is a specified grid for $v$. Formally, the approximate value
iterations can be represented as follows:
$$\begin{aligned}
V_0(x_k, w_i) &\equiv 0, \\
V_n(x_k, w_i) &= \max_{v_m,\, m \in M} \Big\{ \sum_{j \in J} F_{n-1}(f(x_k, v_m, w_i), w_j) \cdot P(w_j \mid w_i, x_k) + 1 \Big\},
\end{aligned}$$
where
$$F_{n-1}(x, w_i) = \begin{cases}
\mathrm{Interpolant}[V_{n-1}](x, w_i), & \text{if } (x, w_i) \in G, \\
0, & \text{if } (x, w_i) \notin G.
\end{cases} \tag{7}$$
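The approximate iteration (7) can be sketched for a scalar state as follows. Everything specific here is an assumption made for illustration: the dynamics f(x, v, w) = x + v + w, the set G = [-1, 1], the grids, the transition matrix, and the choice of linear interpolation (the text also allows cubic). The helper names `interp`, `F` and `approximate_value_iteration` are hypothetical.

```python
from bisect import bisect_right

X_GRID = [-1.0 + 0.25 * k for k in range(9)]   # grid points x_k covering G = [-1, 1]
V_GRID = (-0.1, 0.0, 0.1)                      # control grid v_m (assumed)
W = (-0.3, 0.3)                                # disturbance values w_i (assumed)
P = ((0.7, 0.3), (0.3, 0.7))                   # P[i][j] = P(w_j | w_i), independent of x


def interp(xs, ys, x):
    """Piecewise-linear interpolant of the points (xs, ys) at xs[0] <= x <= xs[-1]."""
    k = max(1, min(bisect_right(xs, x), len(xs) - 1))
    t = (x - xs[k - 1]) / (xs[k] - xs[k - 1])
    return (1.0 - t) * ys[k - 1] + t * ys[k]


def F(V_prev, x, j):
    """F_{n-1}(x, w_j) from (7): interpolated value inside G, zero outside."""
    if not (X_GRID[0] <= x <= X_GRID[-1]):
        return 0.0
    return interp(X_GRID, V_prev[j], x)


def approximate_value_iteration(eps=1e-8, max_iter=10_000):
    """Iterate (7) until successive grid values differ by at most eps."""
    V = [[0.0] * len(X_GRID) for _ in W]       # V_0 = 0 at every grid point
    for n in range(1, max_iter + 1):
        V_new = [
            [1.0 + max(sum(P[i][j] * F(V, xk + vm + W[i], j)
                           for j in range(len(W)))
                       for vm in V_GRID)
             for xk in X_GRID]
            for i in range(len(W))
        ]
        gap = max(abs(a - b) for new_row, old_row in zip(V_new, V)
                  for a, b in zip(new_row, old_row))
        V = V_new
        if gap <= eps:                         # stopping criterion from the text
            return V, n
    return V, max_iter


if __name__ == "__main__":
    V, n = approximate_value_iteration()
    print("stopped after", n, "iterations; V at x = 0:", V[0][4], V[1][4])
```

Because the successor state x_k + v_m + w_i generally falls between grid points in this toy setup, the interpolation step is exercised on almost every evaluation.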