Optimal Policies for Payment of Dividends through a Fixed Barrier at Discrete Time

Raúl Montes-de-Oca, Patricia Saavedra, Gabriel Zacarías-Espinoza and Daniel Cruz-Suárez

Departamento de Matemáticas, Universidad Autónoma Metropolitana-Iztapalapa, Av. San Rafael Atlixco 186, Col. Vicentina, Cd. de México 09340, Mexico

División Académica de Ciencias Básicas, Universidad Juárez Autónoma de Tabasco, Km 1 Carr. Cunduacán-Jalpa, Cunduacán, Tabasco 86690, Mexico

Keywords:

Reserve Processes, Discounted Markov Decision Processes, Ruin Probability, Optimal Premiums, Dividends.

Abstract:

In this paper a discrete-time reserve process with a fixed barrier is presented and modelled as a discounted Markov Decision Process. The non-payment of dividends is penalized. The minimization of this penalty results in an optimal control problem. This work focuses on determining the sequence of premiums that minimizes the penalty costs, and on obtaining a rate for the probability of ruin to ensure a sustainable reserve operation.

1 INTRODUCTION

This work is related to risk theory, which describes the behavior of the reserve process of an insurance company. The classic model was introduced by Filip Lundberg in 1903 (Lundberg, 1909) and developed by Harald Cramér in 1930 (Cramér, 1930). In this model, the premiums are obtained continuously at a constant rate and the total amount of claims over a period of time t is given by a compound Poisson process. The main problem of the classical model was to determine the ruin probability of the reserve process. Currently, however, several other interesting problems are being studied: minimization of the ruin probability, the distribution of dividends to shareholders, the reinsurance problem, the collection of premiums dependent on the history of each customer, and the analysis of the reserve process when claims have sub-exponential distributions, just to mention a few (see (Azcue and Muler, 2014), (Dickson, 2005), (Dickson and Waters, 2004), (Gerber, 1981), (Gerber et al., 2006), (Rolski et al., 1999), and (Schmidli, 2009)).

In particular, the problem of interest for the authors of this article is the definition of policies for the distribution of dividends in fixed periods of time when the claims have light or heavy tails. This issue is relevant because in the classical model, if the intensity of the premiums is higher than the average total amount of claims (the security loading is positive), then with probability 1 the paths of the reserve tend to infinity as the time t increases indefinitely (see (De-Finetti, 1957)). Therefore, dividends appear as a way to control an unlimited increase of the reserves.

Dividend policies aim to attract shareholders (or investors) in order to address risks. One possible policy is to determine the dividend strategy that maximizes the discounted expected value of a utility function by means of control techniques. This approach has been studied in continuous time in works such as (Azcue and Muler, 2014), (Dickson, 2005), (Dickson and Waters, 2004), (Gerber, 1981), (Gerber et al., 2006), and (Schmidli, 2009). On the other hand, discrete-time problems of risk theory have been studied, for instance, in (Bulinskaya and Muromskaya, 2014), (Diasparra and Romera, 2009), (Martínez-Morales, 1991), (Martin-Löf, 1994), (Schäl, 2004), and (Schmidli, 2009), which apply optimal control theory to insurance companies. In particular, in (Martin-Löf, 1994) the control techniques were introduced for the first time by means of the theory of discounted Markov Decision Processes.

Discrete-time discounted Markov Decision Processes (MDPs) (see (Hernández-Lerma and Lasserre, 1996)) model systems that are observed periodically, whose state transitions are uncertain, and which can be influenced by the application of controls. A Markov Decision Process (MDP) is generally described as follows: at a particular time n, the system is observed and, depending on its current state, a control is applied; then a cost is paid and, by a predetermined transition law, the system moves to a new state. The sequence of controls is called a policy, and its quality is assessed through a performance criterion. The Optimal Control Problem (OCP) consists in determining a policy which optimizes the performance criterion. One way to solve the OCP is the technique of dynamic programming, introduced by Bellman in the middle of the last century.

Montes-de-Oca R., Saavedra P., Zacarías-Espinoza G. and Cruz-Suárez D.
Optimal Policies for Payment of Dividends through a Fixed Barrier at Discrete Time.
DOI: 10.5220/0006193701400149
In Proceedings of the 6th International Conference on Operations Research and Enterprise Systems (ICORES 2017), pages 140-149
ISBN: 978-989-758-218-9

From this perspective, the problem of dividends is modeled here by using discrete-time MDPs. It is proposed to work within MDPs since similar control problems, such as those of dams or inventories (sample storage problems), have been solved successfully, see (Finch, 1960) and (Ghosal, 1970). On the other hand, discrete time is used here as suggested in (Li et al., 2009). This type of analysis is important in itself, as it presents an approximation of the continuous problem and is also more realistic from the point of view of applications. One approach that will be followed in this work is to study the problem of dividends by fixing an objective capital (barrier) Z > 0. If the reserve exceeds Z, then the dividends are distributed. A model with a fixed barrier for the reserve of an insurance company is proposed. The reserve process is modelled as an MDP whose admissible controls belong to a compact subset. The bounds of this subset depend on two principles for premium calculation: the expectation principle and the standard deviation principle (see (Dickson, 2005)). The distribution of the total amount of claims per time interval represents a compound process which is supposed to be general, in the sense that its density is only required to be continuous almost everywhere. The proposed performance criterion is the expected total discounted cost, where the cost penalizes both the failure to pay dividends and the difference between the admissible premiums and a constant which depends on the standard deviation principle for premium calculation. In addition, the optimal solutions are determined explicitly by means of the dynamic programming technique, and a rate for the ruin probability is established, which makes it possible to determine long periods of sustainability of the company.

The paper is organized as follows: in the second section the mathematical tools that will be used throughout this work (mainly MDPs and stochastic orders) are presented. The reserve process with a fixed barrier is presented in the third section with an analysis of dividend policies. In the fourth and fifth sections the main results are given: the optimal premium and a rate for the ruin probability, with a couple of examples where the theory obtained in this work is applied. Finally, research conclusions are presented.

2 PRELIMINARIES

This section presents some results on the theory that

will be used to solve the problem stated in the paper.

2.1 Stochastic Orders

Let X be a Borel space (i.e., a Borel subset of a separable metric space) and suppose that X is complete and partially ordered. The partial order in X is denoted by ≺. Moreover, a function g : X → R is said to be increasing if x, y ∈ X, x ≺ y, imply that g(x) ≤ g(y), where ≤ is the usual order in R. Besides, the Borel σ-algebra of X is denoted by B(X).

Definition 2.1. Let X be a complete Borel space and suppose that X is partially ordered. Let $P$ and $P'$ be probability measures on (X, B(X)). It is said that $P'$ dominates $P$ stochastically if

$$\int g\,dP \le \int g\,dP'$$

for all g : X → R measurable, bounded and increasing; in that case, write $P \stackrel{st}{\le} P'$.

Remark 2.2. Let $P$ and $P'$ be probability measures on (R, B(R)). In this case, $P \stackrel{st}{\le} P'$ if $F'(x) \le F(x)$ for all x ∈ R, where $F$ and $F'$ are the distribution functions of $P$ and $P'$, respectively (see (Lindvall, 1992) p. 127).
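Remark 2.2 can be checked numerically on a simple pair of laws. The sketch below is only an illustration under assumptions that are not taken from the paper: P and P′ are exponential distributions with means 1 and 2, and the helper names (`cdf_exp`, `coupled_draws`, `g`) are hypothetical. It verifies F′(x) ≤ F(x) on a grid and, using the same uniform draw for both laws (inverse-transform coupling), compares the sample means of a bounded increasing function.

```python
import math
import random

def cdf_exp(x, mean):
    """Distribution function of an exponential law with the given mean."""
    return 1.0 - math.exp(-x / mean) if x > 0 else 0.0

def coupled_draws(n, seed=0):
    """Draw pairs (X, X') from one uniform each: X ~ P (mean 1), X' ~ P' (mean 2)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        e = -math.log(1.0 - rng.random())  # standard exponential draw
        pairs.append((1.0 * e, 2.0 * e))
    return pairs

def g(x):
    """A bounded, increasing test function (illustrative choice)."""
    return min(x, 3.0)

# Pointwise comparison of the distribution functions on a grid.
grid = [0.1 * k for k in range(100)]
cdf_ordered = all(cdf_exp(x, 2.0) <= cdf_exp(x, 1.0) for x in grid)

# Sample means of g under P and P'; the coupling gives g(X) <= g(X') pathwise.
pairs = coupled_draws(10_000)
mean_g_P = sum(g(x) for x, _ in pairs) / len(pairs)
mean_g_Pprime = sum(g(xp) for _, xp in pairs) / len(pairs)
```

Because the two samples are built from the same uniform draws, the inequality between the sample means holds path by path and not only on average, which mirrors the integral inequality in Definition 2.1.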

Lemma 2.3. ((Cruz-Suárez et al., 2004), Lemma 2.6) Let X be a complete Borel space, and suppose also that X is partially ordered. Let $P$ and $P'$ be probability measures on (X, B(X)) such that $P \stackrel{st}{\le} P'$. Then

$$\int H^{*}\,dP \le \int H^{*}\,dP',$$

for $H^{*} : X \to R$ which is measurable, nonnegative, nondecreasing, and (possibly) unbounded.

2.2 Discounted Markov Decision Processes

Let X and Y be complete Borel spaces. A stochastic kernel on X given Y is a function P(·|·) such that P(·|y) is a probability measure on X for each fixed y ∈ Y, and P(B|·) is a measurable function on Y for each fixed B ∈ B(X).

Let (X, A, {A(x) | x ∈ X}, Q, c) be a discrete-time Markov Control Model (see (Bäuerle and Rieder, 2011) or (Hernández-Lerma and Lasserre, 1996) for notation and terminology). This model consists of the state space X, the control set A, the transition law Q, and the cost-per-stage c. For each x ∈ X, there is a nonempty measurable set A(x) ⊂ A whose elements are the feasible actions when the state of the system is x. Define K := {(x,a) : x ∈ X, a ∈ A(x)}. The cost c is assumed to be a nonnegative and measurable function on K.


The transition law Q is often induced by an equation of the form

$$x_{n+1} = G(x_n, a_n, \xi_n), \tag{1}$$

n = 0, 1, ···, with $x_0 \in X$ given, where $\{x_n\}$ and $\{a_n\}$ are the sequences of states and controls, respectively, and $\{\xi_n\}$ is a sequence of independent and identically distributed (i.i.d.) random variables with values in some space S, common density function ∆, and independent of the initial state $x_0$; G : K × S → X is a measurable function.

Assumption 2.4. (a) A(x) is compact for all x ∈ X;

(b) c is lower semicontinuous and nonnegative;

(c) the transition law Q is strongly continuous, that is, the function $h'$, defined on K by

$$h'(x,a) := \int h(y)\,Q(dy|x,a), \tag{2}$$

is continuous and bounded for every measurable bounded function h on X.

Using the standard notation and definitions in (Hernández-Lerma and Lasserre, 1996), Π denotes the set of all policies and F is the subset of stationary policies. Each stationary policy f ∈ F is identified with the measurable function f : X → A such that f(x) ∈ A(x) for every x ∈ X.

Remark 2.5. Given an initial state x ∈ X and a stationary policy f ∈ F, the process determined by (1) is a homogeneous Markov process with transition kernel Q(·|x, f) (see (Hernández-Lerma and Lasserre, 1996) Proposition 2.3.5 p. 19).

Let (X, A, {A(x) | x ∈ X}, Q, c) be a discrete-time Markov Control Model. In this paper the performance criterion considered is the Expected Total Discounted Cost, defined as

$$v(\pi,x) := E_x^{\pi}\Big[\sum_{n=0}^{+\infty} \alpha^{n} c(x_n,a_n)\Big], \tag{3}$$

when using the policy π ∈ Π, given the initial state $x_0 = x \in X$. In this case, α ∈ (0,1) is a given discount factor, and $E_x^{\pi}$ denotes the expectation with respect to the probability measure $P_x^{\pi}$ induced by π and x (see (Hernández-Lerma and Lasserre, 1996)).

A policy $\pi^{*}$ is said to be optimal if

$$v(\pi^{*},x) = V^{*}(x), \tag{4}$$

for each x ∈ X, where

$$V^{*}(\cdot) := \inf_{\pi\in\Pi} v(\pi,\cdot) \tag{5}$$

is the so-called optimal value function.

Remark 2.6. Assumptions 2.4a) and 2.4b) imply that c is inf-compact on K, that is, for every x ∈ X and r ∈ R, the set

$$A_r(x) := \{a \in A(x) \mid c(x,a) \le r\} \tag{6}$$

is compact. Therefore, Assumption 2.4 implies Assumption 1a) and 1b) in (Hernández-Lerma and Lasserre, 1996). Consequently, the validity of the next lemma is guaranteed.

Lemma 2.7. ((Hernández-Lerma and Lasserre, 1996), Theorem 4.2.3 and Lemma 4.2.8) Under Assumption 2.4,

(a) The optimal value function $V^{*}$ satisfies the optimality equation

$$V^{*}(x) = \inf_{a\in A(x)} \Big\{c(x,a) + \alpha \int V^{*}(y)\,Q(dy|x,a)\Big\}, \tag{7}$$

for each x ∈ X.

(b) There exists an optimal stationary policy $f^{*} \in F$ such that

$$V^{*}(x) = c(x,f^{*}(x)) + \alpha \int V^{*}(y)\,Q(dy|x,f^{*}(x)), \tag{8}$$

for each x ∈ X.

(c) $V_n(x) \to V^{*}(x)$ when n → ∞, where $V_n$ is defined by

$$V_n(x) = \inf_{a\in A(x)} \Big\{c(x,a) + \alpha \int V_{n-1}(y)\,Q(dy|x,a)\Big\}, \tag{9}$$

for each x ∈ X, with $V_0(\cdot) = 0$.
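The value iteration scheme (9) can be illustrated on a finite model, where the integral reduces to a sum. The following is a minimal sketch under assumptions: the two-state, two-action data (`cost`, `trans`) are hypothetical toy numbers, not part of the paper, and the function names are illustrative.

```python
def bellman(v, cost, trans, alpha):
    """One application of the dynamic programming operator in (9).

    cost[x][a] is the cost-per-stage and trans[x][a][y] is the
    probability of moving from state x to state y under action a.
    """
    return [
        min(
            cost[x][a] + alpha * sum(p * v[y] for y, p in enumerate(trans[x][a]))
            for a in range(len(cost[x]))
        )
        for x in range(len(cost))
    ]

def value_iteration(cost, trans, alpha, n_iter=500):
    """Iterate (9) starting from V_0 = 0."""
    v = [0.0] * len(cost)
    for _ in range(n_iter):
        v = bellman(v, cost, trans, alpha)
    return v

# Hypothetical toy model: 2 states, 2 actions.
cost = [[1.0, 2.0], [0.0, 0.5]]
trans = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.1, 0.9]],
]
alpha = 0.9

v_star = value_iteration(cost, trans, alpha)
# Residual of the optimality equation (7) at the computed function.
residual = max(abs(a - b) for a, b in zip(v_star, bellman(v_star, cost, trans, alpha)))
```

Since the operator is a contraction with modulus α, the residual shrinks geometrically, which is the finite-state counterpart of Lemma 2.7(c).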

3 RESERVE PROCESS

A Risk Process (see (Asmussen, 2010), (Dickson, 2005), and (Schmidli, 2009)) consists of a pair $(P_t, S_t)$, t ≥ 0, which describes the premiums earned and the total amount of claims during the period of time [0,t], respectively.

The relationship between $P_t$ and $S_t$ is given as follows:

$$R_t = R_0 + P_t - S_t, \tag{10}$$

t ≥ 0, where $R_0 = u > 0$ is the initial reserve of the company. In this case, $R_t$ represents the reserve of the company at time t. The process $\{R_t\}_{t\ge 0}$ is called the Reserve Process.

The ruin of the company occurs at the instant at which $R_t$ takes a negative value. The main objective then is to determine the probability of this event, which is introduced in the following definition.


Definition 3.1. The ruin probability ψ(u), with initial reserve u > 0, is defined by

$$\psi(u) := \Pr[\tau(u) < +\infty] \tag{11}$$

where $\tau(u) := \inf\{t > 0 \mid R_t < 0\}$, with τ(u) = +∞ if $R_t > 0$ for all t ≥ 0.

In the classical model of Lundberg and Cramér, the premiums are determined continuously and deterministically, i.e., $P_t = Ct$ where C > 0 and t ≥ 0. In addition, the total amount of claims $S_t$ depends on two processes: a homogeneous Poisson process $\{N(t)\}_{t\ge 0}$, with intensity λ > 0, and a claim amounts process $\{Y_i : i = 1,2,\dots\}$, where the $Y_i$ are independent and identically distributed random variables. Thus, the total amount of claims until time t is given by

$$S_t = \sum_{i=1}^{N(t)} Y_i, \tag{12}$$

where $S_t = 0$ if t = 0.

Thus, the classical reserve process is described by

$$R_t = u + Ct - \sum_{i=1}^{N(t)} Y_i = u + Ct - S_t.$$

Observe that if $E[S_t]$ denotes the expectation of $S_t$ and $E[S_t] < +\infty$, then, taking the expectation in the last equation, it is obtained that

$$E[R_t] = u + (C - \lambda E[Y_1])\,t. \tag{13}$$

Choosing $C > \lambda E[Y_1]$, it is concluded that the average reserves of the company grow indefinitely. In other words, the reserve $R_t$ tends to infinity when t does so, with probability 1 − ψ(u). The assumption $C > \lambda E[Y_1]$ is known as the Safety Loading Condition.
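The step from (12) to (13) can be made explicit. The following is a sketch that assumes, as is standard in the classical model although not stated explicitly above, that N(t) is independent of the claim amounts $Y_i$; conditioning on N(t) then gives the expected total claims and, from it, the expected reserve.

```latex
\begin{align*}
E[S_t] &= E\Big[\,E\Big[\textstyle\sum_{i=1}^{N(t)} Y_i \,\Big|\, N(t)\Big]\Big]
        = E[N(t)]\,E[Y_1] = \lambda t\,E[Y_1],\\
E[R_t] &= u + Ct - E[S_t] = u + (C - \lambda E[Y_1])\,t .
\end{align*}
```

The sign of $C - \lambda E[Y_1]$ therefore decides whether the mean reserve drifts upward or downward.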

As mentioned above, in the classical model the safety loading condition allows the reserves of an insurance company to accumulate indefinitely, which is unrealistic. Although there seems to be a controversy about this point, it has been suggested to establish an upper limit (barrier) Z for the accumulation of earnings in order to sustain the risks (see (Azcue and Muler, 2014), (De-Finetti, 1957), (Dickson, 2005), (Dickson and Waters, 2004), and (Schmidli, 2009)). To this end, the reserves of the company must be reduced to Z from time to time, for example, by paying dividends to shareholders.

Remark 3.2. It is important to mention that, in a more general setting, some of the assumptions of the classical model may be relaxed, e.g., {N(t)} could be a non-homogeneous Poisson process or a more general renewal process. It is also possible to assume that the claim size cumulative distribution function is of a particular parametric form, e.g., gamma, Weibull, etc. (see Assumption 3.5 and Examples 1 and 2, below).

Dividends can be understood as payments made by a company to its shareholders, either in cash or in shares. The arguments about the advantages of a dividend refer to the intention of the investors to earn income in the present and to reduce uncertainty. Formally, the dividends, $d_t$, are defined as $d_t = [R_t - Z]^{+}$, where $[z]^{+} = \max\{0, z\}$.

On the other hand, in the existing literature, differ-

ent methods are proposed to determine the premium

value for the safety loading condition to hold (see

(Dickson, 2005) and (Schmidli, 2009)). In this work

the expectation principle will be used.

3.1 Discrete-time Reserve Process

Now, a discrete-time reserve model will be developed.

The discretization is reasonable because, in practice,

decisions of the company about its operations are

taken at ﬁxed points of time (see (Bulinskaya and

Muromskaya, 2014), (Diasparra and Romera, 2009),

(Li et al., 2009), and (Schmidli, 2009)).

Let $\{R_t\}$ be a reserve process with initial reserve $R_0 = u > 0$, and let $\{t_n\}$ be an increasing sequence of positive real numbers with $t_0 = 0$. Then, equation (10) implies that

$$R_{t_{n+1}} - R_{t_n} = (P_{t_{n+1}} - P_{t_n}) - (S_{t_{n+1}} - S_{t_n}), \tag{14}$$

for n = 0, 1, ···, where $(P_{t_{n+1}} - P_{t_n})$ and $(S_{t_{n+1}} - S_{t_n})$ are the premiums earned and the total amount of claims during the period $(t_n, t_{n+1}]$, respectively.

Let $x_n := R_{t_n}$, $a_n := (P_{t_{n+1}} - P_{t_n})$ and $\xi_n := (S_{t_{n+1}} - S_{t_n})$. Then, without loss of generality, it is possible to assume that $t_n = n$ for n > 0. Then, the discrete-time reserve model is as follows:

$$x_{n+1} = x_n + a_n - \xi_n, \tag{15}$$

with $x_0 = u > 0$.

In this case, $x_{n+1}$ represents the reserve at time t = n + 1. Moreover, the discrete-time ruin probability is determined by

$$\psi_d(u) := \Pr[\tau_d(u) < +\infty] \tag{16}$$

where $\tau_d(u) := \inf\{n \ge 1 \mid x_n \le 0\}$, with $\tau_d(u) = +\infty$ if $x_n > 0$ for all n > 0.

According to the ruin probability defined above, the ruin of the company is attained when $x_n + a_n - \xi_n \le 0$ for some n > 0.

If the following dynamics is considered:

$$x_{n+1} = [x_n + a_n - \xi_n]^{+}, \tag{17}$$

for n = 1, 2, ···, with $x_0 = u > 0$, then the dynamics in (17) determines the ruin when $x_n = 0$ for some n = 1, 2, ···. However, just as in the continuous-time model, if the safety loading condition holds, $E[x_n] \to +\infty$ when n → +∞.

Remark 3.3. The dynamics described in (17) is

known as the Lindley random walk (see (Asmussen,

2010)) which has various applications, for example,

in storage processes, waiting time model, queue size

models, to name a few (Asmussen, 2010). (See Re-

mark 3.4, below.)

3.2 Reserve Process with a Fixed Barrier

This subsection provides a reserve process which is modelled as a discounted Markov Decision Process at discrete time. The motivation comes from the previous subsection, that is, from the possibility of discretizing the reserve process, and from the existence of a fixed barrier which defines the payments of dividends (see (Azcue and Muler, 2014), (De-Finetti, 1957), (Dickson, 2005), and (Martínez-Morales, 1991)).

Let Z be a fixed barrier such that, if at time t the reserve satisfies $X_t > Z$, the surplus $X_t - Z$ is used to pay dividends. Thus, the study of the reserve process focuses on the reserves below the barrier Z. Mathematically, this is described by the following dynamics:

$$x_{n+1} = \min\{[x_n + a_n - \xi_n]^{+}, Z\} \tag{18}$$

with $x_0 = u > 0$.

In this case, $x_n$, $a_n$ and $\xi_n$ denote, respectively, the reserve, the premium and the total amount of claims of the company at the beginning of the period (n, n + 1].
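The dynamics (18) is easy to simulate. The sketch below is illustrative only: the function names are hypothetical, the premium is held constant, and the claims are drawn from an exponential law with mean 1, which is an assumption made here for the example and not a choice made in the paper. The `dividend` helper reads the surplus above Z in a period as the dividend paid in that period, which is also an assumption about timing.

```python
import random

def next_reserve(x, a, xi, Z):
    """Dynamics (18): x_{n+1} = min{[x_n + a_n - xi_n]^+, Z}."""
    return min(max(x + a - xi, 0.0), Z)

def dividend(x, a, xi, Z):
    """Surplus above the barrier Z, read here as the dividend of the period."""
    return max(x + a - xi - Z, 0.0)

def simulate(u, premium, Z, claims):
    """Reserve path under a constant premium for a given list of claims."""
    path = [u]
    for xi in claims:
        path.append(next_reserve(path[-1], premium, xi, Z))
    return path

rng = random.Random(1)
claims = [rng.expovariate(1.0) for _ in range(50)]  # assumed claim law, mean 1
path = simulate(u=1.0, premium=2.0, Z=4.0, claims=claims)
```

Every value of `path` stays in [0, Z]; a value equal to 0 corresponds to ruin in the sense of (17), and a value equal to Z to a period in which the barrier is reached.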

Remark 3.4. The dynamics given in equation (18)

has been used to describe storage processes with ﬁnite

capacity such as: dams, inventory, waiting time model

and queue sizes, to name a few (see (Finch, 1960) and

(Ghosal, 1970)).

Assumption 3.5. Suppose that $\{\xi_n\}$ is a sequence of i.i.d. random variables with values in [0,∞) and a common distribution F whose density ∆ is continuous almost everywhere (a.e.), with E[ξ] < +∞ (ξ is a generic element of the sequence $\{\xi_n\}$).

In the rest of this paper Assumption 3.5 will not be

mentioned in each result, but it is supposed to hold.

Remark 3.6. Observe that Assumption 3.5 considers

general distributions which, in practice, permits us to

work with distributions with light or heavy tails (see

(Azcue and Muler, 2014)).

Using the expectation principle for premium calculation, it is ensured that the safety loading condition for the process described in equation (18) holds. Define

$$K := (1+\varepsilon)E[\xi] \tag{19}$$

and

$$M := (1+\beta)E[\xi], \tag{20}$$

where 0 < ε < β. Then, by ((Dickson, 2005) and (Schmidli, 2009)), K < M; therefore, the set of admissible premiums is the compact subset [K, M]. (Note that for every premium a ∈ A(x) = [K, M], the safety loading condition is satisfied, and β is fixed in order to be competitive in the insurance market.)

Every time that the reserve is below the barrier Z, the non-payment of dividends is penalized. Therefore, the following cost function is proposed:

$$c(x,a) := [Z - x]^{+}, \tag{21}$$

for each x ∈ [0,+∞) and a ∈ [K, M].

Remark 3.7. This model defines an MDP: take X = [0,+∞) as the state space; A = [K, M] as the action space; A(x) = [K, M] as the admissible actions for each x ∈ X; the transition law Q is induced by the function $G(x,a,s) := \min\{[x + a - s]^{+}, Z\}$ for each (x,a) ∈ K and s ∈ [0,+∞) (see equation (1)). Finally, the cost function is defined in (21).

According to Remark 3.7, there is a problem (an OCP) of determining the sequence of premiums $\pi = \{a_n\}$ which optimizes

$$v(\pi,x) := E_x^{\pi}\Big[\sum_{n=0}^{+\infty} \alpha^{n} [Z - x_n]^{+}\Big], \tag{22}$$

where x ≥ 0 is the initial reserve, and α is a given discount factor.
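The criterion (22) can be approximated by simulation for a constant premium. The following sketch is only an illustration: it truncates the infinite sum at a finite horizon, assumes exponential claims with mean 1 (so E[ξ] = 1), and uses the hypothetical values ε = 0.2, β = 1, Z = 4 and α = 0.9. The same seed is used for both premiums so that they face the same claims.

```python
import random

def discounted_cost(u, premium, Z, alpha, claims):
    """Truncated version of (22) along one claim path, constant premium."""
    x, total, discount = u, 0.0, 1.0
    for xi in claims:
        total += discount * max(Z - x, 0.0)     # cost (21) at the current reserve
        x = min(max(x + premium - xi, 0.0), Z)  # dynamics (18)
        discount *= alpha
    return total

def estimate(u, premium, Z, alpha, n_paths=500, horizon=200, seed=0):
    """Monte Carlo average of the truncated discounted cost."""
    rng = random.Random(seed)  # same seed: identical claims for every premium
    total = 0.0
    for _ in range(n_paths):
        claims = [rng.expovariate(1.0) for _ in range(horizon)]
        total += discounted_cost(u, premium, Z, alpha, claims)
    return total / n_paths

mean_claim, eps, beta = 1.0, 0.2, 1.0
K = (1 + eps) * mean_claim   # lower premium bound, equation (19)
M = (1 + beta) * mean_claim  # upper premium bound, equation (20)

cost_K = estimate(u=1.0, premium=K, Z=4.0, alpha=0.9)
cost_M = estimate(u=1.0, premium=M, Z=4.0, alpha=0.9)
```

With common claims the reserve path under the larger premium is never below the path under the smaller one, so `cost_M` cannot exceed `cost_K`; this is a numerical hint, not a proof, of which end of [K, M] is favourable.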

4 OPTIMAL PREMIUMS

In this section the research results are presented using MDP theory.

From the definition of the cost function in (21), it follows that it is nonnegative and continuous. Moreover, for each x ∈ X, A(x) = [K, M] is a compact set. So it only remains to verify Assumption 2.4c), which is done in the following lemma.

Lemma 4.1. The transition law Q, induced by (18),

is strongly continuous.

Proof. Let h : X → R be a measurable function bounded by the constant γ. Using the Variable Change Theorem ((Ash and Doléans-Dade, 2000) p. 52), it follows that

$$\int h(y)\,Q(dy|x,a) = \int_{0}^{\infty} h(\min\{[x+a-s]^{+},Z\})\,\Delta(s)\,ds, \tag{23}$$

(x,a) ∈ K.

Furthermore,

$$\int_{0}^{\infty} h(\min\{[x+a-s]^{+},Z\})\,\Delta(s)\,ds = \tag{24}$$
$$h(0)(1 - F(x+a)) \tag{25}$$
$$+\; h(Z)\,F(x+a-Z) \tag{26}$$
$$+\; \int_{x+a-Z}^{x+a} h(x+a-s)\,\Delta(s)\,ds, \tag{27}$$

(x,a) ∈ K, where F is the common distribution function of ξ.

Since the density ∆ is a continuous function a.e. (see Assumption 3.5), F is also continuous (see (Ash and Doléans-Dade, 2000), p. 175).

Given the above, it suffices to prove that

$$\int_{x+a-Z}^{x+a} h(x+a-s)\,\Delta(s)\,ds \tag{28}$$

is a continuous function of (x,a) ∈ K.

For this purpose, let $\{(x_k, a_k)\}$ be a sequence in K converging to (x,a) ∈ K. By the Variable Change Theorem ((Ash and Doléans-Dade, 2000) p. 52),

$$\int_{x+a-Z}^{x+a} h(x+a-s)\,\Delta(s)\,ds = \int_{0}^{Z} h(y)\,\Delta(x+a-y)\,dy. \tag{29}$$

Consider the following functions defined by

$$h_k(y) := h(y)\,\Delta(x_k + a_k - y)\,I_{[0,Z]}(y), \tag{30}$$

$$g_k(y) := \gamma\,\Delta(x_k + a_k - y)\,I_{[0,Z]}(y), \tag{31}$$

for k = 1, 2, ···, y ∈ [0,+∞), where $I_B(\cdot)$ denotes the indicator function of the set B.

Note that $|h_k| \le g_k$ for all k ≥ 1. Furthermore, $\{g_k\}$ converges a.e. to the function g which is defined by

$$g(y) := \gamma\,\Delta(x + a - y)\,I_{[0,Z]}(y), \tag{32}$$

y ∈ [0,+∞).

Furthermore,

$$\int g_k(y)\,dy = \gamma \int_{0}^{Z} \Delta(x_k + a_k - y)\,dy = \gamma \Pr[x_k + a_k - Z \le \xi \le x_k + a_k] = \gamma\,(F(x_k + a_k) - F(x_k + a_k - Z)),$$

and, as the distribution F is continuous, then

$$\lim_{k\to\infty} \int g_k(y)\,dy = \int g(y)\,dy. \tag{33}$$

Finally, by the Dominated Convergence Theorem ((Royden, 1988) p. 92)

$$\lim_{k\to\infty} \int_{x_k+a_k-Z}^{x_k+a_k} h(x_k + a_k - s)\,\Delta(s)\,ds = \lim_{k\to\infty} \int h_k(y)\,dy = \int \lim_{k\to\infty} h_k(y)\,dy = \int_{0}^{Z} h(y)\,\Delta(x+a-y)\,dy = \int_{x+a-Z}^{x+a} h(x+a-s)\,\Delta(s)\,ds$$

and therefore the result holds.

By Lemma 4.1, Assumption 2.4 holds, and therefore Lemma 2.7 guarantees the existence of the optimal policy, $f^{*} \in F$, which, in the context of the reserve process, describes the sequence of optimal premiums that minimizes the performance index given in (22).

Lemma 4.2. a) The transition law Q, induced by (18), is stochastically ordered, i.e.,

$$Q(\cdot|x,a) \stackrel{st}{\le} Q(\cdot|w,b) \tag{34}$$

for each (x,a), (w,b) ∈ K with x ≤ w and a ≤ b.

b) The optimal value function $V^{*}(\cdot)$ and the value iteration functions $V_n(\cdot)$, defined in (9), are decreasing on X.

Proof. a) Let (x,a), (w,b) ∈ K with x ≤ w and a ≤ b. Observe that

$$[x + a - s]^{+} \le [w + b - s]^{+}, \tag{35}$$

s ∈ [0,+∞).

On the other hand, if $\min\{[w+b-s]^{+},Z\} = Z$, then $\min\{[x+a-s]^{+},Z\} \le \min\{[w+b-s]^{+},Z\}$, and if $\min\{[w+b-s]^{+},Z\} = [w+b-s]^{+}$, then by (35) $\min\{[x+a-s]^{+},Z\} \le \min\{[w+b-s]^{+},Z\}$. Therefore

$$\min\{[x+a-s]^{+},Z\} \le \min\{[w+b-s]^{+},Z\}, \tag{36}$$

s ∈ [0,+∞). Thus, by (36), if $\min\{[w+b-\xi]^{+},Z\} \le \varsigma$, then $\min\{[x+a-\xi]^{+},Z\} \le \varsigma$, and therefore

$$Q(\min\{[w+b-\xi]^{+},Z\} \le \varsigma \mid w,b) \le Q(\min\{[x+a-\xi]^{+},Z\} \le \varsigma \mid x,a). \tag{37}$$

Finally, by Remark 2.2, the result holds.

b) First it will be shown that $V_n$ is decreasing on X for each n. The proof is made by mathematical induction. Let x, w ∈ X with x ≤ w. By definition of $V_n$, for n = 1,

$$V_1(x) = \inf_{a\in A(x)} \{[Z - x]^{+}\}; \tag{38}$$

this implies that $V_1(x) = [Z - x]^{+}$, therefore $V_1$ is decreasing on X.

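Now, for n = 2,

$$\begin{aligned}
V_2(x) &= \inf_{a\in A(x)} \Big\{c(x,a) + \alpha \int_{0}^{\infty} V_1(\min\{[x+a-s]^{+},Z\})\,\Delta(s)\,ds\Big\}\\
&= \inf_{a\in A(x)} \Big\{c(x,a) + \alpha \int_{0}^{\infty} [Z - \min\{[x+a-s]^{+},Z\}]^{+}\,\Delta(s)\,ds\Big\}\\
&= \inf_{a\in A(x)} \Big\{c(x,a) + \alpha \int_{0}^{\infty} (Z - \min\{[x+a-s]^{+},Z\})\,\Delta(s)\,ds\Big\}\\
&= \inf_{a\in A(x)} \Big\{[Z-x]^{+} + \alpha Z - \alpha \int_{0}^{\infty} \min\{[x+a-s]^{+},Z\}\,\Delta(s)\,ds\Big\}\\
&= \inf_{a\in A(x)} \Big\{[Z-x]^{+} + \alpha Z - \alpha \int y\,Q(dy|x,a)\Big\}.
\end{aligned}$$

Hence, by part (a) of this lemma and using Lemma 2.3 with $H^{*}(y) = y$, y ∈ X, the function $h^{*}$, defined by

$$h^{*}(a) := -\alpha \int y\,Q(dy|x,a), \tag{39}$$

a ∈ [K, M], is decreasing, and so its minimum is attained at a = M. This implies that

$$V_2(x) = [Z-x]^{+} + \alpha Z - \alpha \int y\,Q(dy|x,M). \tag{40}$$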

Since x ≤ w and after some calculations, it is obtained that $V_2(w) \le V_2(x)$. As x and w are arbitrary, $V_2$ is a decreasing function on X. Suppose that $V_n$ is decreasing on X for some n > 2. Again, take x, w ∈ X with x ≤ w. Then

$$\begin{aligned}
V_{n+1}(x) &= \inf_{a\in A(x)} \Big\{c(x,a) + \alpha \int_{0}^{\infty} V_n(\min\{[x+a-s]^{+},Z\})\,\Delta(s)\,ds\Big\}\\
&= \inf_{a\in A(x)} \Big\{[Z-x]^{+} + \alpha \int V_n(y)\,Q(dy|x,a)\Big\}.
\end{aligned} \tag{41}$$

Let a ∈ [K, M]. By the induction hypothesis and by the stochastic order of Q, it follows that

$$[Z-w]^{+} + \alpha \int V_n(y)\,Q(dy|w,a) \le [Z-x]^{+} + \alpha \int V_n(y)\,Q(dy|x,a);$$

then, taking the minimum over a ∈ [K, M] on both sides of the inequality, it is obtained that $V_{n+1}(w) \le V_{n+1}(x)$. Therefore, $V_{n+1}$ is decreasing. By Lemma 2.7c), $V_n(x) \to V^{*}(x)$, x ∈ X, which implies that $V^{*}$ is a decreasing function on X.

Theorem 4.3. The optimal policy for the reserve process with dividends, induced by (18), is $f^{*}(\cdot) \equiv M$.

Proof. Let x ∈ X be fixed. By Lemma 2.7, $V^{*}$ satisfies the optimality equation (7), that is,

$$V^{*}(x) = \inf_{a\in A(x)} \Big\{[Z-x]^{+} + \alpha \int V^{*}(y)\,Q(dy|x,a)\Big\}.$$

Also, by Lemma 4.2, $V^{*}$ is decreasing and Q is stochastically ordered. Then, if a, b ∈ [K, M], with a ≤ b, it is obtained that

$$\int V^{*}(y)\,Q(dy|x,b) \le \int V^{*}(y)\,Q(dy|x,a). \tag{42}$$

Multiplying by α and adding $[Z-x]^{+}$ on both sides of the inequality above, it is concluded that, for a ∈ [K, M],

$$H(a) := [Z-x]^{+} + \alpha \int V^{*}(y)\,Q(dy|x,a) \tag{43}$$

is a decreasing function and its minimum is attained at a = M. Since x is arbitrary, the result follows.
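Theorem 4.3 can be illustrated on a discretized analogue of the model. The sketch below is an assumption-laden toy, not the setting of the paper: reserves, premiums and claims are integers, and the claim law `pmf` is a hypothetical discrete distribution (so it does not have a density as in Assumption 3.5). It runs the value iteration (9) for the dynamics (18) with cost (21) and checks that the largest premium M attains the minimum of the Bellman expression in every state.

```python
def q_value(x, a, v, Z, alpha, pmf):
    """[Z - x]^+ + alpha * E[V(min{[x + a - xi]^+, Z})] for integer-valued data."""
    expected = sum(p * v[min(max(x + a - s, 0), Z)] for s, p in pmf)
    return max(Z - x, 0) + alpha * expected

def solve(Z, actions, alpha, pmf, n_iter=400):
    """Value iteration (9) on the states 0, 1, ..., Z, starting from V_0 = 0."""
    v = [0.0] * (Z + 1)
    for _ in range(n_iter):
        v = [min(q_value(x, a, v, Z, alpha, pmf) for a in actions)
             for x in range(Z + 1)]
    return v

# Hypothetical integer claim law as (value, probability); its mean is 1.77 < K.
pmf = [(0, 0.30), (1, 0.25), (2, 0.20), (3, 0.10), (4, 0.08), (6, 0.05), (10, 0.02)]
Z, K, M, alpha = 10, 2, 4, 0.95
actions = range(K, M + 1)

v = solve(Z, actions, alpha, pmf)

# Does a = M attain the minimum over [K, M] in every state?
m_is_optimal = all(
    q_value(x, M, v, Z, alpha, pmf)
    <= min(q_value(x, a, v, Z, alpha, pmf) for a in actions) + 1e-12
    for x in range(Z + 1)
)
# Is the computed value function decreasing in the reserve (Lemma 4.2b)?
v_is_decreasing = all(v[x] >= v[x + 1] for x in range(Z))
```

In states close to the barrier several premiums may tie, so the check asks only that M is among the minimizers, which is what the monotonicity argument in the proof delivers.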

Finally, in this section, by Theorem 4.3 it is obtained that the optimal value function is of the form

$$V^{*}(x) = v(M,x) = E_x^{M}\Big[\sum_{n=0}^{+\infty} \alpha^{n} [Z - x_n]^{+}\Big], \tag{44}$$

for each x ∈ X. That is, the expected total discounted cost of the penalties for not reaching the barrier Z, and therefore not paying the dividends to shareholders, is brought to present value, given the discount factor α.


5 RATES FOR RUIN PROBABILITY

This section presents a rate for the ruin probability which makes it possible to determine a period of sustainability for the company under the optimal reserve process, that is, the process under the optimal policy (premium) $f^{*}(\cdot) \equiv M$,

$$x_{n+1} = \min\{[x_n + M - \xi_n]^{+}, Z\}, \tag{45}$$

with $x_0 = u > 0$.

To this end,

$$\psi_N(u) := \Pr[x_0 = u,\ x_1 \ne 0,\ \cdots,\ x_{N-1} \ne 0,\ x_N = 0] \tag{46}$$

is defined for u > 0 and N > 2.

Observe that $\psi_N(u)$ is the ruin probability when $\tau_d(u) = N$, where $\tau_d(u)$ is the stopping time for the state zero (see equation (16)).

Theorem 5.1. Let $\{x_n\}$ be the optimal reserve process generated by the optimal policy $f^{*} \equiv M$, with $x_0 = u > 0$ and N > 2. Then

$$\psi_N(u) \le (\Pr[\xi < Z + M])^{N-2} \cdot \Pr[\xi < u + M]. \tag{47}$$

Proof. The optimal process $\{x_n\}$ is a homogeneous Markov process with transition law Q (see Remark 2.5).

Consider the following sets: $B_0 = \{x_0 = u\}$, $B_N = \{x_N = 0\}$ and $B_i = \{x_i \ne 0\}$, for i = 1, 2, ···, N − 1, and observe that $B_i \in B(X)$ for i = 1, 2, ···, N.

Then, by Proposition 7.3 p. 130 in (Breiman, 1992),

$$\begin{aligned}
\psi_N(u) &= \Pr[x_0 = u,\ x_1 \ne 0,\ \cdots,\ x_{N-1} \ne 0,\ x_N = 0]\\
&= \int_{B_0}\int_{B_1}\cdots\int_{B_{N-1}} Q(B_N|w_{N-1},M)\,Q(dw_{N-1}|w_{N-2},M)\cdots Q(dw_1|w_0,M)\,\rho(dw_0),
\end{aligned}$$

where the initial distribution ρ is the Dirac measure concentrated on u.

On the other hand, observe that

$$Q(B_N|w_{N-1},M) \le 1. \tag{48}$$

Therefore

$$\psi_N(u) \le \int_{B_0}\int_{B_1}\cdots\int_{B_{N-1}} Q(dw_{N-1}|w_{N-2},M)\cdots Q(dw_1|w_0,M)\,\rho(dw_0);$$

furthermore, for each i = 1, 2, ···, N − 1, $B_i \subseteq \{\xi_{i-1} < x_{i-1} + M\} \subseteq \{\xi < Z + M\}$; this implies that

$$Q(B_i|x_{i-1},M) \le \Pr[\xi_{i-1} < x_{i-1} + M] \le \Pr[\xi < Z + M]. \tag{49}$$

Hence,

$$\psi_N(u) \le \int_{B_0}\int_{B_1}\cdots\int_{B_{N-2}} \Pr[\xi < Z + M]\,Q(dw_{N-2}|w_{N-3},M)\cdots Q(dw_1|w_0,M)\,\rho(dw_0).$$

Finally, iterating this way N − 3 times and since ρ is concentrated in $B_0$, it is obtained that

$$\psi_N(u) \le (\Pr[\xi < Z + M])^{N-2}\,Q(B_1|u,M), \tag{50}$$

where $Q(B_1|u,M) = Q(x_1 \ne 0|u,M) = \Pr[\xi < u + M]$.

The examples that follow illustrate the application of Theorem 5.1. To do this, the ruin probability $\psi_N(u) = 0.001$ and $\nu := 1 - \psi_N(u)$ are considered.

Table 1: Gamma distribution.

     κ = 1 (Z = 4.503, M = 2)    κ = 3 (Z = 6.928, M = 4.732)
u    years (≈ N)                 years (≈ N)
1    19.07                       18.70
2    19.11                       18.99
3    19.12                       19.08
4    19.17                       19.09

5.1 Example 1

Suppose that ξ has a Gamma distribution with parameters (λ,κ) whose density is of the form

$$\Delta(s) = \frac{(s/\lambda)^{\kappa-1}\,e^{-(s/\lambda)}}{\lambda\,\Gamma(\kappa)}, \quad s > 0, \tag{51}$$

where $\Gamma(k) = \int_{0}^{+\infty} s^{k-1} e^{-s}\,ds$ is the Gamma function.

It is known that the Gamma distribution is not analytically integrable, so it is necessary to resort to the tables for this distribution given in (Wilks, 2011) Appendix B Table B.2.

In this case, the optimal premium is

$$M = \kappa + \beta\sqrt{\kappa}, \tag{52}$$

where β is the loading factor.

Given λ = β = 1, the values of Z and M and, for different values of u, the respective periods of sustainability (in years) are calculated for κ = 1, 3. These values are shown in Table 1.

5.2 Example 2

Suppose that ξ has a Weibull distribution with parameters (λ,κ). It is known that the distribution function is as follows:

$$F(s) = 1 - e^{-(s/\lambda)^{\kappa}}, \quad s > 0. \tag{53}$$

Since F(M + Z) = ν, it follows that

$$Z = \lambda\,\big(\ln (1-\nu)^{-1}\big)^{1/\kappa} - M. \tag{54}$$

In this case, the optimal premium is

$$M = \lambda\Big(\Gamma(1 + 1/\kappa) + \beta\sqrt{\Gamma(1 + 2/\kappa) - \Gamma^{2}(1 + 1/\kappa)}\Big), \tag{55}$$

where β is the loading factor.

Given λ = β = 1, the values of Z and M and, for different values of u, the respective periods of sustainability are calculated for κ = 0.8, 0.6. These values are shown in Table 2.

Table 2: Weibull distribution.

     κ = 0.8 (Z = 8.64, M = 2.56)    κ = 0.6 (Z = 20.91, M = 4.14)
u    years (≈ N)                     years (≈ N)
1    19.00                           18.98
2    19.08                           19.03
3    19.12                           19.07
4    19.15                           19.10
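The quantities of Example 2 can be reproduced with a few lines of code. The sketch below is a hedged illustration: the function names are hypothetical, and `horizon` encodes one possible reading of how Theorem 5.1 is used, namely the smallest integer N for which the right-hand side of (47) drops to the target level ψ. The paper does not spell out this step or the time unit of N.

```python
import math

def weibull_cdf(s, lam, kappa):
    """Distribution function (53)."""
    return 1.0 - math.exp(-((s / lam) ** kappa)) if s > 0 else 0.0

def optimal_premium(lam, kappa, beta):
    """Premium (55): mean plus beta times the standard deviation."""
    g1 = math.gamma(1 + 1 / kappa)
    g2 = math.gamma(1 + 2 / kappa)
    return lam * (g1 + beta * math.sqrt(g2 - g1 ** 2))

def barrier(lam, kappa, nu, M):
    """Barrier (54), i.e. the solution Z of F(M + Z) = nu."""
    return lam * math.log(1 / (1 - nu)) ** (1 / kappa) - M

def horizon(u, Z, M, psi, cdf):
    """Smallest integer N with cdf(Z + M)**(N - 2) * cdf(u + M) <= psi."""
    p_barrier, p_start = cdf(Z + M), cdf(u + M)
    return 2 + math.ceil(math.log(psi / p_start) / math.log(p_barrier))

lam, kappa, beta, psi = 1.0, 0.8, 1.0, 0.001
M = optimal_premium(lam, kappa, beta)
Z = barrier(lam, kappa, 1 - psi, M)
N = horizon(1.0, Z, M, psi, lambda s: weibull_cdf(s, lam, kappa))
```

For κ = 0.8 this gives M and Z in agreement with the values 2.56 and 8.64 of Table 2, and N of roughly 6,840 periods for u = 1. Read as days in a 360-day year, that is about 19 years; this reading of the time unit is an assumption made here, not something stated in the text.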

6 CONCLUSIONS

With the theory presented in this paper, a discrete-time reserve process with a fixed barrier was studied by modelling it as a discounted Markov Decision Process. The dynamics presented in Equation (18) describes the behavior of the reserves of the company when these are below the barrier. This allows us to set a penalty that takes into account the non-payment of dividends. By controlling the process through the premiums, it is found that the optimal policy is the constant premium M.

On the other hand, the rate presented in Theorem 5.1 makes it possible to determine the periods of sustainability of the company given a ruin probability and an initial reserve. This bound depends on the distribution of the total amount of claims per time interval. It should also be noted that these random variables are only assumed to have a density that is continuous almost everywhere, with finite first and second moments. This condition is satisfied by a wide range of distributions. The examples illustrate how to apply the rate in the case of distributions with light or heavy tails.

ACKNOWLEDGEMENTS

R. Montes-de-Oca, P. Saavedra, and D. Cruz-Suárez dedicate this article to the memory of their co-worker and co-author of the present work, Gabriel Zacarías-Espinoza, whose deeply felt death occurred on November 10, 2015.

This work was partially supported by CONACYT (México) and ASCR (Czech Republic) under Grant No. 171396.

REFERENCES

Ash, R. B. and Doléans-Dade, C. (2000). Probability and Measure Theory. Elsevier, London, 2nd edition.

Asmussen, S. (2010). Ruin Probability. World Scientific, Singapore, 2nd edition.

Azcue, P. and Muler, N. (2014). Stochastic Optimization in Insurance: a Dynamic Programming Approach. Springer, London.

Breiman, L. (1992). Probability. SIAM, Berkeley.

Bulinskaya, Y. G. and Muromskaya, A. (2014). Discrete-time insurance model with capital injections and reinsurance. Methodol. Comput. Appl. Probab.

Bäuerle, N. and Rieder, U. (2011). Markov Decision Processes with Applications to Finance. Springer, Berlin.

Cramér, H. (1930). On the Mathematical Theory of Risk. Skandia Jubilee Volume, Stockholm.

Cruz-Suárez, D., Montes-de-Oca, R., and Salem-Silva, F. (2004). Conditions for the uniqueness of optimal policies of discounted Markov decision processes. Math. Methods Oper. Res., 60:415–436.

De-Finetti, B. (1957). Su un'impostazione alternativa della teoria collettiva del rischio. Trans. XV. Int. Congr. Act., 2:433–443.

Diasparra, M. A. and Romera, R. (2009). Bounds for the ruin probability of a discrete-time risk process. J. Appl. Probab., 46:99–112.

Dickson, D. C. M. (2005). Insurance Risk and Ruin. Cambridge University Press, Cambridge.

Dickson, D. C. M. and Waters, H. R. (2004). Some optimal dividend problems. ASTIN Bull., 34:49–74.

Finch, P. D. (1960). Deterministic customer impatience in the queueing system GI/M/1. Biometrika, 47:45–52.

Gerber, H. U. (1981). On the probability of ruin in the presence of a linear dividend barrier. Scand. Actuarial J., pages 105–115.

Gerber, H. U., Shiu, E. S. W., and Smith, N. (2006). Maximizing dividends without bankruptcy. ASTIN Bull., 36:5–23.

Ghosal, A. (1970). Some Aspects of Queueing and Storage System. Springer Verlag, New York.

Hernández-Lerma, O. and Lasserre, J. B. (1996). Discrete-time Markov Control Processes: Basic Optimality Criteria. Springer Verlag, New York.

Li, S., Lu, Y., and Garrido, J. A. (2009). A review of discrete-time risk models. Rev. R. Acad. Cien. Serie A. Mat., 103(2):321–337.

Lindvall, T. (1992). Lectures on the Coupling Method. Wiley, New York.

Lundberg, F. (1909). Über die Theorie der Rückversicherung. Transactions of the VIth International Congress of Actuaries, 1:877–948.

Martin-Löf, A. (1994). Lectures on the use of control theory in insurance. Scand. Actuarial J., pages 1–25.

Martínez-Morales, M. (1991). Adaptive Premium in an Insurance Risk Process, Doctoral Thesis. Texas Tech University, Texas.

Rolski, T., Schmidli, H., Schmidt, V., and Teugels, J. L. (1999). Stochastic Processes for Insurance and Finance. Wiley, Chichester.

Royden, H. L. (1988). Real Analysis. Macmillan, New York.

Schmidli, H. (2009). Stochastic Control in Insurance. Springer, London.

Schäl, M. (2004). On discrete-time dynamic programming in insurance: Exponential utility and minimizing the ruin probability. Scand. Actuarial J., pages 189–210.

Wilks, D. S. (2011). Statistical Methods in the Atmospheric Sciences. Academic Press, Burlington.
