Figure 1: The MDM process.
are specified by the sensor sensitivity and range. During signal processing, the standard normalization formula
\[
\text{normalized value} = \frac{\text{attribute value} - k}{l - k}
\tag{1}
\]
is used. The attribute value is normalized to the interval $\langle 0; 1 \rangle$, and the variables $k$ and $l$ express the lower and upper limits of the sensor scanning range.
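As a minimal sketch in Python (the function name normalize and the example sensor range are illustrative, not taken from the paper), formula (1) amounts to:

```python
def normalize(attribute_value, k, l):
    """Normalize a raw sensor reading to the interval <0; 1>
    according to formula (1); k and l are the lower and upper
    limits of the sensor scanning range."""
    return (attribute_value - k) / (l - k)

# Example: a distance sensor with range 0.0-4.0 m reading 1.0 m
print(normalize(1.0, 0.0, 4.0))  # 0.25
```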
The set of initiatory conditions is used for the recognition of a decision-making situation in phase 2. It has to provide the agent with the ability to react to unexpected changes in the environment during the pursuit of its goal. It must also allow the agent to change its goal through the decision-making process when pertinent, e.g. if the prior goal is accomplished or unattainable. In phase 3, the conditions of applicability characterize the boundary that expresses the relevance of using a variant in the decision-making process. The most important step is phase 4, where the convenience appraisal of the applicable variants is performed. This step plays the key part in the decision-making procedure. The following formula is used:
\[
conv = \sum_i w\bigl(\mathrm{inv}(\mathrm{norm}(a_i^v))\bigr), \qquad i = 1, \dots, z,
\tag{2}
\]
where $v$ stands for the total number of variants and $z$ is the total number of attribute values that are needed for the computation of the convenience value.
A convenience value is obtained for each of the assorted variants. The attributes $a_i^v$ stand for presumptive values of the universum configuration, and the norm function, defined above in (1), normalizes the value of the attribute. The function inv is important, as it represents the reversed value of the difference between the real attribute value and the ideal attribute value:
\[
\mathrm{inv}(\text{current val}) = 1 - \lvert \text{ideal val} - \text{current val} \rvert.
\tag{3}
\]
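A compact sketch of the convenience appraisal of formulas (2) and (3), assuming the per-attribute ideal values, sensor limits, and weights are given as plain lists (these data structures and function names are illustrative; the weight function $w$ is simplified here to one multiplier per attribute):

```python
def inv(current_val, ideal_val):
    """Formula (3): reversed value of the difference between the
    real and the ideal attribute value (both already normalized)."""
    return 1.0 - abs(ideal_val - current_val)

def convenience(raw_attrs, ideals, limits, weights):
    """Formula (2): weighted sum over the z attribute values
    a_i^v of one variant."""
    total = 0.0
    for a, ideal, (k, l), w in zip(raw_attrs, ideals, limits, weights):
        norm_a = (a - k) / (l - k)        # formula (1)
        total += w * inv(norm_a, ideal)   # w(inv(norm(a_i^v)))
    return total

# Two attributes, weights summing to 1: the ideal variant scores 1.0
print(convenience([1.0, 3.0], [0.25, 0.75],
                  [(0.0, 4.0), (0.0, 4.0)], [0.5, 0.5]))  # 1.0
```

With weights that sum to 1, the ideal variant yields conv = 1 and the worst case approaches 0, matching the behaviour described in the text below.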
The optimal variant remains constantly defined by an $m$-ary vector of attributes, where $m \le z$, for each of the decision-making situations, and the attributes $a_i^v$ differ for each variant other than the optimal variant. There is a finite number of activities that the agent is able to perform. As the inverse values of the difference between the real and the ideal variant are used, in the ideal case the convenience value will be equal to 1, and in the worst case it will be close to 0: $\mathrm{Im}(\mathrm{inv}) = (0; 1\rangle$. The lower open boundary of the interval is useful, because problems related to computation with zero (division operations) may be avoided.
The function $w$ assigns an importance value (weight) to each attribute. Machine learning is realized by proper modifications of the weight function. The importance of attributes differs in accordance with the actual state of the agent: e.g., an energetically economical solution would be preferred when the battery is low, a fast solution when there is little time left, etc. Precise definitions of weight functions are presented in (Ramík, 1999) and (Fiala et al., 1997). In phase 5, the variant with the highest convenience value is selected, and its realization is carried out in phase 6. During the processing of the selected solution, the agent keeps scanning the environment, and if a decision-making situation is recognized, the whole sequence is repeated. The evaluation function (e.g. the reinforcement learning function examples in (Kubík, 2000), (Pfeifer and Scheier, 1999), (Weiss, 1999)) provides the feedback and supports the selection of the best variant, as it helps the agent in its goal pursuit. Based on the scanned environmental data, modifications of the function $w$ are made during the learning process.
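One way to picture the state-dependent weighting is as a simple dispatch on the agent's current state. The sketch below is illustrative only; the attribute indices, thresholds, and factors are invented for the example, and the precise weight functions are those of (Ramík, 1999) and (Fiala et al., 1997):

```python
ENERGY, SPEED = 0, 1  # assumed attribute indices for this example

def weights(battery, time_left, num_attrs=2):
    """Illustrative state-dependent weight function w."""
    w = [1.0] * num_attrs
    if battery < 0.2:          # battery low: prefer economical variants
        w[ENERGY] *= 3.0
    if time_left < 5.0:        # little time left: prefer fast variants
        w[SPEED] *= 3.0
    total = sum(w)
    return [x / total for x in w]   # sums to 1, so the ideal conv = 1

print(weights(battery=0.1, time_left=30.0))  # [0.75, 0.25]
```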
3 THE PRACTICAL PART
As stated above, the robot soccer game is a highly dynamic environment (Kim et al., 2004). Any attempt to follow a long-term plan will very probably result in failure. This is the reason why many solutions to this problem are founded on a reactive-agent basis. However, such a reactive approach lacks the potential to develop or follow any strategy apart from the one implemented in its reactive behavior.
On the other hand, the MDM principle in general provides a large variety of solutions, and the strategy may be formed by modifying the values of the weight coefficients. Such modifications change the behavior of the team as a whole and/or of its members individually.
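As a toy illustration of this idea (the strategy names, roles, attributes, and values below are invented for the example, not taken from the paper), a team strategy can be expressed as per-role weight profiles that bias the convenience appraisal of formula (2):

```python
# Hypothetical per-role weight profiles; each maps attribute -> weight.
STRATEGIES = {
    "offensive": {
        "striker":  {"distance_to_ball": 0.6, "shot_angle": 0.3, "energy": 0.1},
        "defender": {"distance_to_goal": 0.5, "distance_to_ball": 0.4, "energy": 0.1},
    },
    "defensive": {
        "striker":  {"distance_to_ball": 0.4, "shot_angle": 0.2, "energy": 0.4},
        "defender": {"distance_to_goal": 0.7, "distance_to_ball": 0.2, "energy": 0.1},
    },
}

def team_weights(strategy, role):
    """Select the weight profile that a player of the given role
    plugs into the convenience formula (2)."""
    return STRATEGIES[strategy][role]
```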
Simple reactions do not provide a sufficient effort potential, or, in other words, an efficient goal pursuit. There is a need for quick, yet more complicated, actions that would allow us to build up a strategy. Also,