way of dealing with the task of deciding which structures must be constructed in the game StarCraft, using the Continual Online Evolutionary Planning (COEP) algorithm. There are related works that try to improve the battle skills of automatic RTS players, such as the study presented in (Shao et al., 2017), which describes the development of a decision module for the game StarCraft, based on a neural network with reinforcement learning, that can indicate an action for each living unit. In (Stanescu and Čertický, 2016), another study is presented that tries to improve an automatic player specialized in battles; it uses a prediction method to indicate the most likely combination of units produced by the enemy in an RTS game like StarCraft. The work described in (Niel et al., 2018) developed a simple custom RTS game, in which the main goal of the players is to defend their bases and destroy the enemy's base. That work used reinforcement learning (RL) combined with a multi-layer perceptron (MLP) to determine how the agent performs the tasks in the game. Q-learning was tested against Monte Carlo learning as the reinforcement learning algorithm, and two different methods for assigning rewards to the agents were tested: individual and shared rewards (see the sketch after this paragraph). The results showed that the combination of Q-learning and individual rewards presented the best win rate. The works presented in (Justesen and Risi, 2017; Shao et al., 2017; Stanescu and Čertický, 2016; Niel et al., 2018) do not deal with novelty identification, and there is no attempt to represent different moments of the game using appropriate features to improve strategy detection, which are major differences with respect to the present paper.
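As an illustration of the setup compared in (Niel et al., 2018), the following is a minimal sketch of a tabular Q-learning agent trained on an individual reward signal. All names, parameters, and the state/action encoding here are hypothetical and are not taken from that work.

```python
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount, exploration


class QLearningAgent:
    """Tabular Q-learning agent that learns from its own (individual) reward."""

    def __init__(self, actions):
        self.actions = actions
        self.q = defaultdict(float)  # maps (state, action) -> estimated value

    def act(self, state):
        # Epsilon-greedy action selection.
        if random.random() < EPSILON:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        self.q[(state, action)] += ALPHA * (reward + GAMMA * best_next
                                            - self.q[(state, action)])
```

Under a shared-reward scheme, every agent would instead be updated with the same team-level reward after each step, whereas here each agent receives its own reward.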
The work in (Álvarez Caballero et al., 2017) used supervised learning techniques such as MLP, Random Forest, and Logistic Regression over a large amount of data obtained from StarCraft replays. The objective of that work was to predict the winner of a StarCraft match. It showed that the use of appropriate StarCraft features increases the accuracy of this prediction, which can reach a very satisfactory level after 10 minutes of gameplay. In (Synnaeve et al., 2012), a method is presented for discovering tactics and strategies in the game StarCraft. A dataset created by the authors was analyzed with a Gaussian Mixture Model to identify each cluster (army formation) indicating the details of each army composition; these details then provide a way of finding the best army formation (see the sketch after this paragraph). In (Weber and Mateas, 2009), a study is put forward that tries to create a model representing a general enemy, based on several others, to be used in the prediction of strategic actions. The authors used classification techniques such as K-Nearest Neighbor, Non-Nested Generalized Exemplars, and Additive Logistic Regression. The works described in (Álvarez Caballero et al., 2017; Synnaeve et al., 2012; Weber and Mateas, 2009) show significant differences from the present paper, since they do not select attributes for a better representation of the game phases, and they apply a supervised learning process, which is not the case in the present paper.
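To illustrate the kind of clustering applied in (Synnaeve et al., 2012), the following is a minimal sketch of fitting a Gaussian Mixture Model to army-composition vectors. The synthetic data, the number of components, and the feature layout are assumptions made for this example only.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical army-composition vectors: one row per game snapshot,
# one column per unit type (e.g., counts of marines, tanks, ...).
rng = np.random.default_rng(0)
compositions = rng.poisson(lam=5.0, size=(500, 8)).astype(float)

# Each fitted component corresponds to a recurring army formation.
gmm = GaussianMixture(n_components=4, covariance_type="full", random_state=0)
labels = gmm.fit_predict(compositions)
print("formation sizes:", np.bincount(labels))

# The component means summarize the typical unit mix of each formation.
for k, mean in enumerate(gmm.means_):
    print(f"formation {k}: mean unit counts = {np.round(mean, 1)}")
```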
In (Vallim, 2013), the authors use M-DBScan to detect player behavior changes in an RTS game, in a first-person shooter game, and in artificial datasets, demonstrating the versatility and effectiveness of this technique. However, such work differs from the approach presented herein for the following reasons: it deals with artificially controlled situations in which the time interval between every pair of successive novelties is constant; and it only copes with game scenarios whose dynamics can be appropriately represented by a single set of features, using just one Markov chain (MC) to investigate possible changes. In the study presented in (Vieira et al., 2019), the M-DBScan algorithm was used with data from the game StarCraft to show that the technique works in controlled test scenarios that, like (Vallim, 2013), used just one MC.
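For illustration, the following is a minimal sketch of the MC component underlying this kind of novelty detection: states correspond to cluster labels of successive game snapshots, and an abrupt shift in the chain's statistics is taken as a hint of a behavior change. The entropy-based test and the threshold below are simplifying assumptions, not the exact M-DBScan procedure.

```python
import numpy as np


class TransitionChain:
    """Markov chain over cluster labels of successive game snapshots."""

    def __init__(self, n_states):
        # Laplace-smoothed transition counts to avoid zero probabilities.
        self.counts = np.ones((n_states, n_states))

    def update(self, prev_state, state):
        self.counts[prev_state, state] += 1

    def mean_entropy(self):
        # Average entropy of the transition distributions of all states.
        probs = self.counts / self.counts.sum(axis=1, keepdims=True)
        return float(-np.mean(np.sum(probs * np.log2(probs), axis=1)))


def novelty_detected(chain, previous_entropy, threshold=0.15):
    # Flag a novelty when the chain's entropy shifts by more than a threshold.
    return abs(chain.mean_entropy() - previous_entropy) > threshold
```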
4 DYNAMIC NOVELTY DETECTION IN StarCraft
The main contribution of the approach proposed in this study is to adapt the M-DBScan algorithm in order to improve its capacity to detect novelties in the game StarCraft. Such novelties concern significant alterations in the game scenario that may indicate a change in the opponent's strategy. As presented in this section, the basis of this approach consists of using distinct and appropriate sets of features to represent the game scenarios according to the peculiarities of the following game stages: the Beginning Stage of the Battle (BSB), in which the adversaries begin the conflict; the Middle Stage of the Battle (MSB), in which deaths begin to occur in one of the armies involved in the battle; and, finally, the Final Stage of the Battle (FSB), which is characterized by a significant reduction in the number of battle units composing the armies involved in the contest.
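As an illustration of the stage definitions above, the following sketch classifies a battle snapshot into BSB, MSB, or FSB. The 50% reduction threshold and the function signature are assumptions made for this example, not criteria taken from this work.

```python
def battle_stage(initial_units, current_units, total_deaths):
    """Classify a battle snapshot as 'BSB', 'MSB' or 'FSB'.

    Hypothetical criteria following the stage definitions in the text.
    """
    if total_deaths == 0:
        return "BSB"  # the conflict has begun, but no unit has died yet
    if current_units < 0.5 * initial_units:  # assumed "significant reduction"
        return "FSB"
    return "MSB"  # deaths have begun in one of the armies
```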
In order to represent the game scenarios, the authors use subsets of features extracted from a universe set of 16 features, with the selection based on their own experience with the game. In this way, during the process of detecting possible changes in the opponent's strategy, the present approach has to build 3 distinct MCs, one for each game stage. This approach