Quantitative Estimates of Stability in Controlled GI|D|1|∞ Queueing Systems and in Control of Water Release in a Simple Dam Model
Evgueni Gordienko and Juan Ruiz de Chávez
Department of Mathematics, Autonomous Metropolitan University - Iztapalapa,
San Rafael Atlixco 186, col. Vicentina, C.P. 09340, Mexico City, Mexico
Keywords:
Queueing, Inventory and Water Release Models, Markov Control Process, Stability Index, Prokhorov Metric.
Abstract:
We consider two applied discrete-time Markov control models: the waiting-time process in GI|D|1|∞ queues with a controlled service rate, and water release control in a simple dam model with independent water inflows. The stochastic dynamics of both models is determined by a sequence of independent and identically distributed random variables with a distribution function $F$. In the situation when an available approximation $\tilde{F}$ is used in place of the unknown $F$, we estimate the deterioration of performance of control policies optimal with respect to the total discounted cost and the average cost per unit of time. For this purpose we introduce a stability index and find upper bounds for this index expressed in terms of the Prokhorov distance between the distribution functions $F$ and $\tilde{F}$. When $\tilde{F} = \tilde{F}_m$ is the empirical distribution function obtained from a sample of size $m$, the stability index is on average less than a constant times $m^{-1/3}$.
1 INTRODUCTION: THE PROBLEM OF CONTINUITY (STABILITY) ESTIMATION IN CONTROL MODELS
Optimization of control policies in queueing systems, inventory and dam models is nowadays an important theoretical and engineering issue, because of the great development of telecommunications and computer networks and the increasing difficulties with water supply. In this communication we investigate the continuity (or "stability") of optimal dynamic control in the following applied discrete-time Markov control processes (MCP's):
A. Consecutive waiting times in GI|D|1|∞ queues with controlled service rates $a_n \in [\gamma, \bar{\gamma}] \subset (0, \infty)$:
$$X_{n+1} = \max\{0,\ X_n + a_n - \xi_n\}, \qquad n = 0, 1, 2, \dots \quad (1)$$
where $X_n$ is the waiting time of the $n$-th job (client), while $\xi_n$ is the interarrival time between the $n$-th and $(n+1)$-th jobs. (Equations (1) also describe the stock level in some inventory models; see e.g. (Anderson et al., 2012), and (Kitaev and Rykov, 1995) for control problem settings in queues.)
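The recursion (1) is straightforward to simulate for a fixed stationary policy. The following Python sketch is only an illustration: the exponential interarrival law, the constant service-time policy and all numeric values are our assumptions, not part of the model.

```python
import numpy as np

def simulate_waiting_times(policy, sample_xi, x0=0.0, n_steps=1000, seed=0):
    """Simulate X_{n+1} = max(0, X_n + a_n - xi_n), the recursion (1).

    policy    -- stationary rule a_n = policy(X_n), valued in [gamma, gamma_bar]
    sample_xi -- draws one interarrival time xi_n with d.f. F
    """
    rng = np.random.default_rng(seed)
    x, path = x0, [x0]
    for _ in range(n_steps):
        a = policy(x)                    # controlled (deterministic) service time
        xi = sample_xi(rng)              # interarrival time xi_n ~ F
        x = max(0.0, x + a - xi)         # Lindley-type recursion (1)
        path.append(x)
    return np.array(path)

# Illustrative run: constant service time a = 0.8, exponential F with mean 1,
# so that on average service is faster than arrivals and the queue is stable.
path = simulate_waiting_times(lambda x: 0.8, lambda rng: rng.exponential(1.0))
print(path[:10])
```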
B. Water stocks $\{X_n,\ n = 0, 1, 2, \dots\}$ in the following simple model of water release control (see, e.g., (Asmussen, 1987), (Bae et al., 2003), (Hernández-Lerma, 1989)):
$$X_{n+1} = \min\{X_n - a_n + \xi_n,\ M\}, \qquad n = 0, 1, 2, \dots \quad (2)$$
(with $n$ indexing days, weeks or months). In (2), $M$ is the capacity of the reservoir, $\xi_n$ is the water inflow, and $a_n \in [0, X_n]$ is the controlled water consumption in the $n$-th period.
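A matching sketch for the dam recursion (2); again, the proportional release rule and the gamma inflow law are illustrative assumptions only.

```python
import numpy as np

def simulate_dam(release, sample_inflow, M=10.0, x0=5.0, n_steps=1000, seed=0):
    """Simulate X_{n+1} = min(X_n - a_n + xi_n, M), the recursion (2)."""
    rng = np.random.default_rng(seed)
    x, path = x0, [x0]
    for _ in range(n_steps):
        a = release(x)                       # admissibility: 0 <= a_n <= X_n
        assert 0.0 <= a <= x, "inadmissible release"
        xi = sample_inflow(rng)              # water inflow xi_n with d.f. F
        x = min(x - a + xi, M)               # recursion (2); overflow is spilled
        path.append(x)
    return np.array(path)

# Illustrative stationary policy: release half of the current stock each period.
path = simulate_dam(lambda x: 0.5 * x, lambda rng: rng.gamma(2.0, 1.0))
print(path[:10])
```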
Remark 1. For the sake of brevity, in what follows we allow the same letters to denote the states of distinct processes.
In equations (1) and (2), $\xi_0, \xi_1, \dots$ are assumed to be independent and identically distributed (i.i.d.) nonnegative random variables with a common distribution function (d.f.) $F$.
Let $Q(\pi)$ be a chosen criterion of optimization of control policies $\pi$ (or an expected cost of a policy $\pi$, as specified below in Sections 2 and 3).
A control policy $\pi = \{a_0, a_1, \dots\}$ is a sequence of control actions (see e.g. (Dynkin and Yushkevich, 1979), (Hernández-Lerma and Lasserre, 1996) for definitions). The "original task" of a controller is to find (or to approximate) an optimal policy $\pi^*$ (if it exists):
$$Q(\pi^*) = \inf_{\pi} Q(\pi). \quad (3)$$
However, in our stability (continuity) estimation setting this goal cannot be accomplished (at least directly), since we suppose that the d.f. $F$ is unknown (at least partly) to the controller, and it is replaced by an available approximation $\tilde{F}$ obtained either from some theoretical considerations or from statistical estimators of unknown parameters (or of the whole $F$). One possible way around this is to consider, instead of (1) and
(2), the corresponding "approximating" Markov control processes:
$$\tilde{X}_{n+1} = \max\{0,\ \tilde{X}_n + a_n - \tilde{\xi}_n\}, \qquad n = 1, 2, \dots \quad (4)$$
$$\tilde{X}_{n+1} = \min\{\tilde{X}_n - a_n + \tilde{\xi}_n,\ M\}, \qquad n = 1, 2, \dots \quad (5)$$
where $\tilde{\xi}_0, \tilde{\xi}_1, \dots$ are i.i.d. random variables with the d.f. $\tilde{F}$.
Suppose that, using the same criterion $Q$, one can find (at least "theoretically") an optimal control policy $\tilde{\pi}^*$ for process (4) (or for process (5)). Nevertheless, we insist that the original process is given by equations (1) (or by (2)), and therefore the controller aims to find $\tilde{\pi}^*$ only in order to apply it to control process (1) (or (2), respectively). This means that $\tilde{\pi}^*$ is used as an available approximation to the unavailable policy $\pi^*$.
The stability (continuity) index below (see e.g. (Gordienko and Salem, 2000), (Gordienko et al., 2009)) gives a quantitative measure of the accuracy of such an approximation:
$$\Delta := Q(\tilde{\pi}^*) - Q(\pi^*) \ge 0. \quad (6)$$
Here $\pi^*$ is the optimal policy satisfying (3). In other words, $\Delta$ expresses the extra cost paid for using $\tilde{\pi}^*$ instead of $\pi^*$.
There are many examples of unstable problems of control optimization in MCP's (see e.g. (Gordienko and Salem, 2000), (Gordienko et al., 2009)). In these examples the values of $\Delta$ in (6) are greater than a positive constant (sometimes a large one), in spite of the fact that $\tilde{F}$ approaches $F$ in a certain strong sense (for example, in the total variation metric).
For the above-mentioned applied MCP's we establish two stability inequalities (one for each of two optimization criteria) of the form:
$$\Delta \le K\,\pi(F, \tilde{F}), \quad (7)$$
where $K$ is an explicitly calculated constant, and $\pi$ is the Prokhorov (sometimes called "Lévy-Prokhorov") metric on the space of d.f.'s. Convergence in $\pi$ is equivalent to weak convergence (convergence in distribution). Inequality (7) asserts that if $\tilde{F}$ is a "good" approximation to $F$ (in the sense of closeness in $\pi$), then using the policy $\tilde{\pi}^*$ costs "almost the same amount" as applying the optimal policy $\pi^*$. Note that, in spite of the lack of complete information on $F$, it is sometimes possible to find an upper bound for $\pi(F, \tilde{F})$. When $\tilde{F} = \tilde{F}_m$ is the empirical distribution function obtained from a sample $X_1, X_2, \dots, X_m$ of size $m$, then
$$E\,\pi(F, \tilde{F}_m) \le \mathrm{const} \cdot m^{-1/3},$$
provided that $\int_0^{\infty} x^{3/2}\, F(dx) < \infty$.
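Computing the Prokhorov distance exactly is difficult, but on the real line it dominates the Lévy metric, which is easy to approximate on a grid. The sketch below (our illustration, not the construction used in the proofs) computes this lower-bound proxy between an assumed exponential $F$ and the empirical $\tilde{F}_m$, so one can watch the discrepancy shrink as $m$ grows.

```python
import numpy as np

def levy_distance(F, sample, grid, eps_grid):
    """Approximate the Levy distance between a d.f. F and the empirical d.f.
    of `sample`:  L = inf{eps > 0 : F(x-eps)-eps <= F_m(x) <= F(x+eps)+eps}.
    On the real line L is dominated by the Prokhorov metric, so this is a
    computable lower-bound proxy for pi(F, F_m), checked on a finite grid."""
    xs = np.sort(sample)
    F_m = np.searchsorted(xs, grid, side="right") / len(xs)  # empirical d.f.
    for eps in eps_grid:                 # smallest grid eps satisfying both bands
        if np.all(F(grid - eps) - eps <= F_m) and np.all(F_m <= F(grid + eps) + eps):
            return float(eps)
    return float(eps_grid[-1])

# Illustrative F: exponential with mean 1 (its 3/2-moment is finite, as required).
F = lambda t: np.where(t > 0.0, 1.0 - np.exp(-np.maximum(t, 0.0)), 0.0)

rng = np.random.default_rng(1)
grid = np.linspace(0.0, 12.0, 2401)
eps_grid = np.linspace(1e-4, 1.0, 2000)
for m in (100, 1_000, 10_000):
    sample = rng.exponential(1.0, size=m)
    print(m, levy_distance(F, sample, grid, eps_grid))
```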
It is worth noting that in the theory of stochastic processes (controlled or not) the term "stability" has had a variety of meanings (see, for instance, (MacPhee and Müller, 2006), (Meyn and Tweedie, 2009)). We use this term since we have not found anything better. The quantitative estimation of continuity in noncontrolled queues (under perturbation of the "governing" d.f.'s) is a rather well-developed topic (see, e.g., (Zolotarev, 1976), (Kalashnikov, 1983), (Abramov, 2008), (Gordienko and Ruiz de Chávez, 1998)). For controlled Markov processes, "stability" was studied (in another framework), for instance, in (Van Dijk, 1988), (Van Dijk and Sladky, 1999). "Stability inequalities" for MCP's (using probability metrics distinct from the Prokhorov one) were found in several relevant papers (see, for instance, (Gordienko and Salem, 2000), (Gordienko et al., 2009)). Certainly, "stability" of control policies as considered here is in some way connected with sensitivity theory and with the profound theory of robust control (which, in particular, is successfully used in dam models; see, e.g., (Barbu and Sritharan, 1998), (Litrico and Georges, 2001)).
The findings in this communication are new theoretical results on the estimation of "stability" of applied control processes (in terms of the Prokhorov metric). Stability estimation in queueing models may play a role in the design of "reasonable" control policies under uncertainty about probability distributions. The water release model considered here is quite elementary, and so has restricted applications. However, our approach can be extended to more realistic dam models.
2 STABILITY ESTIMATION WITH RESPECT TO THE EXPECTED TOTAL DISCOUNTED COST
Equations (1) represent an MCP on the state space $X = [0, \infty)$ with the sets of admissible control actions $A(x) = [\gamma, \bar{\gamma}]$, $x \in X$ (with $\gamma < \bar{\gamma}$ being the assigned extreme values of the service rate). In turn, equations (2) define an MCP on the state space $X = [0, M]$ with the sets of admissible actions $A(x) = [0, x]$, $x \in X$.
Suppose that for each of the above-mentioned processes a measurable real-valued one-step cost function $c(x, a)$, $x \in X$, $a \in A(x)$, is specified. Thus, at stage $t$ the controller "pays" $c(x_t, a_t)$ if the process occupies the state $x_t$ and the control action $a_t$ is selected. For example, for the controlled queue (1), $c(x, a)$ can be increasing in the variable $a$ representing the service rate, and for process (2), $c(x, a)$ can be decreasing as a function of the water consumption $a$. In fact, we do not need such a detailed specification. We will only suppose throughout the rest of the paper that the function
ICINCO2012-9thInternationalConferenceonInformaticsinControl,AutomationandRobotics
310
$c$ is bounded and satisfies the Lipschitz condition in both arguments.
Let $\alpha \in (0, 1)$ be a given discount factor. For any fixed initial state $x \in X$ of process (1) or (2), and for any chosen control policy $\pi$, the total expected discounted cost is defined as follows:
$$V(x, \pi) := E_x^{\pi} \sum_{n=0}^{\infty} \alpha^n c(X_n, a_n), \quad (8)$$
where $E_x^{\pi}$ stands for the expectation corresponding to the probability measure (on the space of trajectories) generated by the application of the policy $\pi$ with the initial state $x$ of the process. Similarly, denoting by $\tilde{E}_x^{\pi}$ the respective expectation for process (4) (or (5)), we define by an expression similar to (8) the expected discounted cost $\tilde{V}(x, \pi)$.
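For a fixed stationary policy, the series (8) can be estimated by Monte Carlo with geometric truncation: since $c$ is bounded, cutting the sum at $N$ steps introduces an error of at most $\alpha^N \sup|c|/(1-\alpha)$. The sketch below uses the dam dynamics (2) with a cost and inflow law of our own choosing, purely for illustration.

```python
import numpy as np

def discounted_cost(policy, step, c, x0, alpha=0.9, n_steps=200, n_paths=2000, seed=0):
    """Monte Carlo estimate of V(x0, policy) from (8) for a stationary policy.

    step -- one-step transition x_{n+1} = step(x_n, a_n, xi_n).
    Truncating the series at n_steps adds at most alpha**n_steps * sup|c|/(1-alpha).
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_paths):
        x, disc, v = x0, 1.0, 0.0
        for _ in range(n_steps):
            a = policy(x)
            v += disc * c(x, a)
            x = step(x, a, rng.exponential(1.0))   # illustrative xi_n ~ Exp(1)
            disc *= alpha
        total += v
    return total / n_paths

# Dam dynamics (2) with M = 10 and an illustrative bounded Lipschitz cost that
# penalizes releasing less than one unit of water per period.
M = 10.0
step = lambda x, a, xi: min(x - a + xi, M)
c = lambda x, a: 1.0 - min(a, 1.0)
print(discounted_cost(lambda x: 0.5 * x, step, c, x0=5.0))
```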
The control policy $f \equiv (f_1, f_2, \dots)$ is called stationary if at each stage $t = 0, 1, 2, \dots$ this policy prescribes choosing the control action $a_t = f(X_t)$, where
$$f: X \to A := \bigcup_{x \in X} A(x)$$
is a measurable function such that $f(x) \in A(x)$, $x \in X$. Along with the aforementioned restrictions on the one-step cost function, we assume that the d.f.'s $F$ and $\tilde{F}$ are continuous.
Fixing any pair of MCP's, either (1)-(4) or (2)-(5), we can prove the following assertion.
Proposition 1. There exist stationary optimal policies $f^*$ and $\tilde{f}^*$. That is:
$$V(x, f^*) = \inf_{\pi} V(x, \pi), \quad x \in X; \qquad \tilde{V}(x, \tilde{f}^*) = \inf_{\pi} \tilde{V}(x, \pi), \quad x \in X.$$
In view of this proposition, the stability index in (6) (now with $Q = V$) can be rewritten as follows:
$$\Delta(x) = V(x, \tilde{f}^*) - V(x, f^*) \ge 0, \quad x \in X.$$
Theorem 1. For both pairs of MCP's, (1)-(4) and (2)-(5), we have
$$\sup_{x \in X} \Delta(x) \le K\,\pi(F, \tilde{F}), \quad (9)$$
where $\pi$ is the Prokhorov metric, and $K$ is an explicitly calculated constant (depending only on $\alpha$ and on the characteristics of the one-step cost function $c$).
This theorem is proved by applying the technique of contractive operators.
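While $K$ in (9) is obtained analytically, the stability index itself can be probed numerically: discretize the dam model (2), run value iteration (an instance of the contractive operator behind the proof) under both $F$ and $\tilde{F}$, and evaluate the resulting policies in the true model. All grids, the cost and the inflow laws below are illustrative assumptions.

```python
import numpy as np

# Discretized dam model (2): illustrative grids, cost, and inflow laws.
M, alpha = 10.0, 0.9
states = np.linspace(0.0, M, 41)
actions = np.linspace(0.0, M, 41)

def proj(x):                                   # nearest grid point in [0, M]
    return int(np.argmin(np.abs(states - min(max(x, 0.0), M))))

xi = np.linspace(0.0, 6.0, 13)                 # discretized inflow support
idx = np.array([[[proj(x - a + z) for z in xi]
                 for a in actions] for x in states])        # transition table
cost = np.array([[1.0 - min(a, 1.0) if a <= x else np.inf
                  for a in actions] for x in states])       # inf = inadmissible

def optimal_policy(P, iters=150):
    """Value iteration; returns a greedy stationary policy as action indices."""
    V = np.zeros(len(states))
    for _ in range(iters):
        Q = cost + alpha * (V[idx] @ P)        # Bellman backup over all (x, a)
        V = Q.min(axis=1)
    return Q.argmin(axis=1)

def evaluate(policy, P, iters=150):
    """Expected discounted cost of a stationary policy under inflow law P."""
    V = np.zeros(len(states))
    rows = np.arange(len(states))
    for _ in range(iters):
        V = cost[rows, policy] + alpha * (V[idx[rows, policy]] @ P)
    return V

P_true = np.exp(-xi); P_true /= P_true.sum()     # "true" F (truncated exponential)
P_apx = np.exp(-1.2 * xi); P_apx /= P_apx.sum()  # misspecified approximation F~

f_star, f_tilde = optimal_policy(P_true), optimal_policy(P_apx)
delta = evaluate(f_tilde, P_true) - evaluate(f_star, P_true)
print("sup_x Delta(x) =", delta.max())           # numerical stability index, cf. (9)
```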
3 STABILITY ESTIMATION WITH RESPECT TO THE AVERAGE EXPECTED COST
In this section we consider another classical optimization criterion, the long-run average expected cost per unit of time. For $x \in X$ and a policy $\pi$ it is defined as follows:
$$J(x, \pi) := \limsup_{n \to \infty} n^{-1} E_x^{\pi} \sum_{t=0}^{n-1} c(X_t, a_t). \quad (10)$$
Using $Q = J$ in (6), we offer an upper bound for this stability index only for the MCP's defined by equations (2) and (5), since the ergodicity conditions needed to establish such a bound are not easy to guarantee for processes (1) and (4); in any case, this requires further investigation.
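For a fixed stationary policy the limit (10) can be approximated by the empirical average over one long trajectory of (2), discarding a burn-in transient. The sketch below is illustrative, with the policy, cost and inflow law assumed by us.

```python
import numpy as np

def average_cost(policy, c, M=10.0, x0=5.0, n_steps=200_000, burn_in=10_000, seed=0):
    """Cesaro-average estimate of J(x0, policy) from (10) along one run of (2)."""
    rng = np.random.default_rng(seed)
    x, total = x0, 0.0
    for n in range(n_steps):
        a = policy(x)                            # stationary rule, a in [0, x]
        if n >= burn_in:                         # discard the transient phase
            total += c(x, a)
        x = min(x - a + rng.gamma(2.0, 1.0), M)  # illustrative inflow law
    return total / (n_steps - burn_in)

c = lambda x, a: 1.0 - min(a, 1.0)               # same illustrative cost as above
print(average_cost(lambda x: 0.5 * x, c))
```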
Apart from the conditions on the one-step cost function $c$ stated in Section 2, in the present section we make much more restrictive assumptions on the properties of the distribution functions $F$ and $\tilde{F}$. We suppose that both d.f.'s $F$ and $\tilde{F}$ have continuously differentiable densities (denoted by $g$ and $\tilde{g}$) with bounded supports. Moreover, it is assumed that $g$ and $\tilde{g}$ are strictly positive on some open interval $(0, \Gamma) \subset (0, M]$.
Remark 2. We use the above conditions only to outline the idea. In fact, these assumptions can be significantly relaxed (for instance, by removing the assumption of finiteness of the supports).
Similarly to (10) (defined for process (2)), we write the average cost for the approximating control process (5) as follows:
$$\tilde{J}(x, \pi) := \limsup_{n \to \infty} n^{-1} \tilde{E}_x^{\pi} \sum_{t=0}^{n-1} c(\tilde{X}_t, a_t).$$
Proposition 2. Under the assumptions made we obtain the following:
(i) The minimal average costs
$$\inf_{\pi} J(x, \pi) \equiv J^* \quad \text{and} \quad \inf_{\pi} \tilde{J}(x, \pi) \equiv \tilde{J}^*$$
do not depend on the initial state $x$ of the processes.
(ii) There exist stationary optimal control policies $f^*$ and $\tilde{f}^*$ (for (2) and (5), respectively):
$$J(x, f^*) \equiv J(f^*) = J^*, \qquad \tilde{J}(x, \tilde{f}^*) \equiv \tilde{J}(\tilde{f}^*) = \tilde{J}^*.$$
Thus, for $Q = J$ the stability index in (6) is expressed in the following way:
$$\Delta = J(\tilde{f}^*) - J(f^*) \ge 0.$$
We are ready to state our second result.
Theorem 2. Under the assumptions made, for the MCP's (2) and (5) we have:
$$\Delta \le K^*\,\pi(F, \tilde{F}), \quad (11)$$
where, again, $\pi$ is the Prokhorov distance between the d.f.'s $F$ and $\tilde{F}$, and $K^*$ is an explicitly calculated constant (depending on the characteristics of the densities $g$, $\tilde{g}$ and of the function $c$).
QuantitativeEstimatesofStabilityinControlledGI|D|1|PQueueingSystemsandinControlofWaterReleaseina
SimpleDamModel
311
Remark 3. The proof of inequality (11) also uses methods of contractive operators, but in a much more sophisticated way than the proof of (9). The needed contractive properties of the dynamic programming operator are demonstrated using suitable ergodic features of the processes. (For this we need the above-mentioned restrictions on the densities.)
The last stability inequality we offer is rougher than (11), but it is expressed in terms of a more transparent distance (the total variation metric).
Corollary 1. Under the assumptions made,
$$\Delta \le K^* \int_0^{\infty} |g(s) - \tilde{g}(s)|\, ds.$$
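The right-hand side of Corollary 1 is simply the $L_1$ distance between the densities, which is immediate to evaluate numerically once $g$ and $\tilde{g}$ are specified; the two beta-shaped densities on $[0, M]$ below are illustrative stand-ins.

```python
import numpy as np

# Bound of Corollary 1: the integral of |g - g~|. The beta-shaped inflow
# densities on [0, M] below are illustrative choices, not from the paper.
M = 10.0
s = np.linspace(0.0, M, 100_001)
dx = s[1] - s[0]

def density(p, q):
    """Beta(p, q)-shaped density rescaled to [0, M], normalized on the grid."""
    t = s / M
    d = t ** (p - 1.0) * (1.0 - t) ** (q - 1.0)
    return d / (d.sum() * dx)

g = density(2.0, 3.0)        # "true" inflow density g
g_tilde = density(2.2, 3.0)  # perturbed density g~
print("integral |g - g~| ds =", np.abs(g - g_tilde).sum() * dx)
```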
4 CONCLUSIONS
In this communication we study two important applied control models which involve a not completely known distribution function $F$ of independent and identically distributed random variables. Supposing that $F$ is approximated by an available distribution function $\tilde{F}$ (obtained from statistical estimation or theoretical simplifications), we evaluate quantitatively the worsening of performance when an approximating control policy is used in place of the unavailable optimal policy. For such an evaluation we introduce the so-called stability index, and prove inequalities bounding this index in terms of the Prokhorov distance between $F$ and $\tilde{F}$. These bounds can be useful for measuring the quality of service rate control in computer networks and of control procedures in certain elements of water supply systems.
REFERENCES
Abramov, V. (2008). Continuity theorems for the M/M/1/n
queueing system. Queueing Syst., 59(1):68–86.
Anderson, D., Sweeney, D., Williams, T., Camm, J., and Martin, K. (2012). Introduction to Management Science: Quantitative Approaches to Decision Making, Revised. South-Western Cengage Learning, USA, 13th edition.
Asmussen, S. (1987). Applied Probability and Queues. John Wiley, Chichester.
Bae, J., Kim, S., and Lee, E. (2003). Average cost under the $P_{\lambda,\tau}^{M}$ policy in a finite dam with compound Poisson inputs. J. Appl. Probab., 40:519–526.
Barbu, V. and Sritharan, S. (1998). H-infinity control theory of fluid dynamics. Proceedings of the Royal Society of London, Ser. A, 454:3009–3033.
Dynkin, E. and Yushkevich, A. (1979). Controlled Markov
Processes. Springer, New York.
Gordienko, E., Lemus-Rodriguez, E., and Montes-de Oca,
R. (2009). Average cost Markov control processes:
stability with respect to the Kantorovich metric. Math.
Meth. Oper. Res., 70:13–33.
Gordienko, E. and Ruiz de Chávez, J. (1998). New estimates of continuity in M/GI/1/∞ queues. Queueing Syst., 29:175–188.
Gordienko, E. and Salem, F. (2000). Estimates of stabil-
ity of Markov control processes with unbounded cost.
Kybernetika, 36:195–210.
Hernández-Lerma, O. (1989). Adaptive Markov Control Processes. Springer, New York.
Hernández-Lerma, O. and Lasserre, J. (1996). Discrete-time Markov Control Processes, Basic Optimization Criteria. Springer, New York.
Kalashnikov, V. (1983). The analysis of continuity of queueing systems. In Probability Theory and Mathematical Statistics (Tbilisi, 1982), Lecture Notes in Math. 1021. Springer.
Kitaev, M. and Rykov, V. (1995). Controlled Queueing Sys-
tems. CRC Press, Boca Raton FL.
Litrico, X. and Georges, D. (2001). Robust LQG control of
single input multiple outputs dam-river systems. In-
ternational Journal of Systems Science, 32:795–805.
MacPhee, I. and Müller, L. (2006). Stability criteria for controlled queueing systems. Queueing Syst., 52(3):215–229.
Meyn, S. and Tweedie, R. (2009). Markov Chains and
Stochastic Stability. Cambridge University Press,
Cambridge, 2nd edition.
Van Dijk, M. (1988). Perturbation theory for unbounded Markov reward processes with applications to queueing. Adv. Appl. Probab., 20:91–111.
Van Dijk, M. and Sladky, K. (1999). Error bounds for non-
negative dynamic models. J. Optim. Theory Appl.,
101:449–474.
Zolotarev, V. (1976). On stochastic continuity of queueing systems of type G|G|1. Theor. Probab. Appl., 21:250–269.
ICINCO2012-9thInternationalConferenceonInformaticsinControl,AutomationandRobotics
312