tion to the model of stochastic games and on the other
hand some of its crucial aspects.
2.1 Definitions and Concepts
Stochastic Games (SG) (Shoham et al., 2003; Hansen et al., 2004) are defined by the tuple:
⟨ Ag, {A_i : i = 1…|Ag|}, {R_i : i = 1…|Ag|}, S, T ⟩
• Ag: is the finite set of agents.
• A_i: is the finite set of actions (or pure strategies) available to agent i (i ∈ Ag).
• R_i: is the immediate reward function of agent i, R_i(a) → ℝ, where a is the joint-action defined as a ∈ ×_{i∈Ag} A_i and is given by a = ⟨a_1, …, a_|Ag|⟩.
• S: is the finite set of environment states.
• T: is the stochastic transition function, T : S × A × S → [0, 1], indicating the probability of moving from a state s ∈ S to a state s′ ∈ S by running the joint-action a.
The particularity of stochastic games is that each
state s can be considered as a matrix game M(s). At
each step of the game, the agents observe their envi-
ronment, simultaneously choose actions and receive
rewards. The environment transitions stochastically
into a different state M(s′) with a probability P(s′ | s, a)
and the above process repeats. The goal for each
agent is to maximize the expected sum of rewards it
receives during the game.
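To make the model concrete, the sketch below encodes the tuple ⟨Ag, {A_i}, {R_i}, S, T⟩ with tabular rewards and transitions and simulates one step of play. It is only a minimal illustration: the names (StochasticGame, step) and the dictionary-based encoding are assumptions of this sketch, not notation from the paper.

```python
# Minimal sketch of the SG tuple <Ag, {A_i}, {R_i}, S, T>; all names are illustrative.
import random
from dataclasses import dataclass
from typing import Dict, List, Tuple

JointAction = Tuple[int, ...]  # a = <a_1, ..., a_|Ag|>

@dataclass
class StochasticGame:
    n_agents: int                                                  # |Ag|
    actions: List[List[int]]                                       # A_i for each agent i
    states: List[int]                                              # S
    rewards: Dict[Tuple[int, JointAction], List[float]]            # (s, a) -> [R_1(a), ..., R_|Ag|(a)]
    transitions: Dict[Tuple[int, JointAction], Dict[int, float]]   # (s, a) -> {s': P(s'|s, a)}

    def step(self, s: int, a: JointAction) -> Tuple[List[float], int]:
        """One stage of play: the agents receive their rewards and the next state is sampled."""
        r = self.rewards[(s, a)]
        dist = self.transitions[(s, a)]
        s_next = random.choices(list(dist.keys()), weights=list(dist.values()))[0]
        return r, s_next
```

In this encoding, each fixed state s induces the matrix game M(s) mentioned above, whose entries are the reward vectors rewards[(s, a)].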
2.2 Equilibrium in Stochastic Games
Stochastic games have reward functions which can differ from one agent to another. In certain cases, it may be difficult to find policies that maximize the performance criterion of every agent. In stochastic games, one therefore looks for an equilibrium in every state. This equilibrium is a situation in which no agent, taking the other agents’ actions as given, can improve its performance criterion by choosing an alternative action: this is precisely the definition of the Nash equilibrium (Nash, 1950).
Definition 1. A Nash Equilibrium is a set of strategies (actions) a* such that:

R_i(a*_i, a*_{-i}) ≥ R_i(a_i, a*_{-i})   ∀i ∈ Ag, ∀a_i ∈ A_i   (1)
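As an illustration of Equation (1), the sketch below checks whether a joint action is a pure Nash equilibrium of a matrix game M(s). The rewards dictionary, mapping each joint action to the vector of payoffs (R_1(a), …, R_|Ag|(a)), is an assumed encoding rather than the paper's notation.

```python
from typing import Dict, List, Tuple

JointAction = Tuple[int, ...]

def is_pure_nash(rewards: Dict[JointAction, List[float]],
                 actions: List[List[int]],
                 a_star: JointAction) -> bool:
    """Equation (1): no agent i can gain by deviating unilaterally from a_star."""
    for i, acts_i in enumerate(actions):
        for a_i in acts_i:
            deviation = a_star[:i] + (a_i,) + a_star[i + 1:]
            if rewards[deviation][i] > rewards[a_star][i]:
                return False
    return True
```

For instance, with prisoner's-dilemma payoffs rewards = {(0, 0): [3, 3], (0, 1): [0, 5], (1, 0): [5, 0], (1, 1): [1, 1]} and actions = [[0, 1], [0, 1]], only the joint action (1, 1) passes the test.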
2.2.1 Strategic Dominance
When the number of agents is large, it becomes difficult for every agent to consider the entire joint-action space. This may involve a high cost for constructing and solving the matrices. To reduce the joint-action set, most research in game theory has focused on studying concepts of plausible solutions. Strategic dominance (Fudenberg and Tirole, 1991; Leyton-Brown and Shoham, 2008) is one of the most widely used of these concepts, seeking to eliminate actions that are dominated by other actions.
Definition 2. A strategy a_i ∈ A_i is said to be strictly dominated if there is another strategy a′_i ∈ A_i such that:

R_i(a′_i, a_{-i}) > R_i(a_i, a_{-i})   ∀a_{-i} ∈ A_{-i}   (2)
Thus a strictly dominated strategy for a player
yields a lower expected payoff than at least one other
strategy available to the player, regardless of the
strategies chosen by everyone else. Obviously, a ra-
tional player will never use a strictly dominated strat-
egy. The process can be repeated until strategies are
no longer eliminated in this manner. This process of predicting actions is called "Iterative Elimination of Strictly Dominated Strategies" (IESDS).
Definition 3. If, for every player i, the IESDS process leaves only one strategy, then the game is said to be dominance solvable and the resulting solution is a Nash equilibrium.
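The sketch below implements the IESDS process of Definitions 2 and 3, under the same assumed joint-action-to-payoff-vector encoding as above; the function names are illustrative. The game is dominance solvable exactly when every returned list is a singleton.

```python
from itertools import product
from typing import Dict, List, Tuple

JointAction = Tuple[int, ...]

def payoff(rewards: Dict[JointAction, List[float]], i: int, a_i: int, a_minus_i: tuple) -> float:
    """R_i at the joint action obtained by inserting a_i into the other agents' actions a_{-i}."""
    joint = list(a_minus_i)
    joint.insert(i, a_i)
    return rewards[tuple(joint)][i]

def iesds(rewards: Dict[JointAction, List[float]],
          actions: List[List[int]]) -> List[List[int]]:
    """Iterative elimination of strictly dominated strategies (Definition 2, Equation (2))."""
    surviving = [list(a) for a in actions]
    changed = True
    while changed:
        changed = False
        for i in range(len(surviving)):
            others = [surviving[j] for j in range(len(surviving)) if j != i]
            for a_i in list(surviving[i]):
                # a_i is removed if some surviving b_i is strictly better against every a_{-i}
                if any(all(payoff(rewards, i, b_i, o) > payoff(rewards, i, a_i, o)
                           for o in product(*others))
                       for b_i in surviving[i] if b_i != a_i):
                    surviving[i].remove(a_i)
                    changed = True
    return surviving
```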
However, in many cases, the process ends with a large
number of remaining strategies. To further reduce
the joint-action space, we could relax the principle of
dominance and so include weakly dominated strate-
gies.
Definition 4. A strategy a_i ∈ A_i is said to be weakly dominated if there is another strategy a′_i ∈ A_i such that:

R_i(a′_i, a_{-i}) ≥ R_i(a_i, a_{-i})   ∀a_{-i} ∈ A_{-i}   (3)
Thus the elimination process would provide more
compact matrices and consequently reduce the com-
putation time of the equilibrium. However, this pro-
cedure has two major drawbacks: (1) the elimination
order may change the final outcome of the game and
(2) eliminating weakly dominated strategies can exclude some Nash equilibria present in the game.
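A weak-dominance test matching Equation (3) can be sketched in the same assumed encoding; substituting it for the strict comparison in the elimination loop above yields elimination of weakly dominated strategies, subject to the two drawbacks just mentioned.

```python
from itertools import product

def weakly_dominates(rewards, i: int, b_i: int, a_i: int, others) -> bool:
    """Equation (3): b_i does at least as well as a_i against every combination a_{-i}."""
    def payoff(x_i, a_minus_i):
        joint = list(a_minus_i)
        joint.insert(i, x_i)
        return rewards[tuple(joint)][i]
    return all(payoff(b_i, o) >= payoff(a_i, o) for o in product(*others))
```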
2.2.2 Best-response Function
As explained above, the iterative elimination of dominated strategies is a relevant approach, but it is unreliable for an exact search for an equilibrium. Indeed, discarding dominated strategies narrows the search for a solution strategy, but it does not identify a unique solution. Selecting a specific strategy requires introducing the concept of Best-Response.
Definition 5. Given the other players’ actions a_{-i}, the Best-Response (BR) of player i is:

BR_i : a_{-i} → argmax_{a_i ∈ A_i} R_i(a_i, a_{-i})   (4)
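Equation (4) translates directly into a maximisation over A_i; the sketch below again assumes the joint-action-to-payoff-vector encoding used in the previous examples, and returns one maximiser when several exist.

```python
from typing import Dict, List, Tuple

JointAction = Tuple[int, ...]

def best_response(rewards: Dict[JointAction, List[float]],
                  actions: List[List[int]], i: int, a_minus_i: tuple) -> int:
    """Equation (4): an action of agent i maximising R_i against the fixed actions a_{-i}."""
    def payoff(a_i: int) -> float:
        joint = list(a_minus_i)
        joint.insert(i, a_i)
        return rewards[tuple(joint)][i]
    return max(actions[i], key=payoff)
```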