The velocity of a particle $p_i$ is based on the best position
already found by the particle, $\vec{p}_{best}(t)$, and the
best position already found by the set of neighbors of $p_i$,
$\vec{R}_h(t)$, which is a leader from the repository. The
velocity update function at time step $t + 1$ is defined
as follows:
$$\vec{v}(t+1) = \varpi \cdot \vec{v}(t) + (c_1 \cdot \phi_1) \cdot (\vec{p}_{best}(t) - \vec{x}(t)) + (c_2 \cdot \phi_2) \cdot (\vec{R}_h(t) - \vec{x}(t)) \qquad (3)$$
The variables $\phi_1$ and $\phi_2$ in Equation 3 are coefficients
that determine the influence of the particle’s positions. The
constants $c_1$ and $c_2$ indicate how much each component
influences the velocity. The coefficient $\varpi$ is the particle
inertia and controls how much the previous velocity affects the
current one. $\vec{R}_h$ is a particle from the repository,
chosen as a guide of $p_i$.
There are many ways to make this choice. At the end
of the algorithm, the solutions in the repository are
the final output. One possible way to make the leader
choice is called the sigma distance (Mostaghim and
Teich, 2003).
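Equation 3 can be sketched per dimension as below. This is a minimal illustration, not the authors' implementation: the function name, the list-based vectors, and passing $\phi_1$ and $\phi_2$ as explicit arguments are assumptions made for the example. The leader $\vec{R}_h$ is taken as given, since its selection (for example by the sigma distance) is a separate step.

```python
def update_velocity(v, x, p_best, r_h, inertia, c1, c2, phi1, phi2):
    """Apply Equation 3 to every dimension of a particle.

    v, x, p_best and r_h are equal-length lists of floats.
    phi1 and phi2 are passed in explicitly (an assumption of this
    sketch) so that the result is reproducible.
    """
    return [
        inertia * vi                 # inertia term
        + (c1 * phi1) * (pb - xi)    # pull toward the particle's own best
        + (c2 * phi2) * (rh - xi)    # pull toward the repository leader
        for vi, xi, pb, rh in zip(v, x, p_best, r_h)
    ]
```

With `inertia=0.5`, `c1=c2=1.0`, `phi1=0.5` and `phi2=0.25`, a particle at the origin with velocity `[1.0, 0.0]`, personal best `[1.0, 1.0]` and leader `[2.0, 2.0]` receives the new velocity `[1.5, 1.0]`.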
2.2 MOPSO-N Algorithm
MOPSO-N was proposed to handle both numerical and discrete
attributes. It can therefore be used in different domains,
mainly those with continuous attributes. It was first
introduced in (Carvalho and Pozo, 2008). This section
describes aspects of MOPSO-N such as the representation
and generation of the particles.
The algorithm uses the Michigan approach where
each particle represents a single solution or a rule.
In this context, a particle is an n-dimensional vector
of real numbers. One integer number represents the
value for each discrete attribute and two real numbers
represent an interval for numerical attributes. The in-
terval is defined by its lower and upper values. Each
attribute can accept the value ’?’, which means that,
for that rule, the attribute does not matter for the
classification. In the proposed approach the class value
is set at the beginning of the execution. To represent
the particle as a possible solution, the attribute values
must be encoded as real numbers. Each discrete attribute
is encoded by real numbers associated with each attribute
value in the database. When a numerical attribute has a
void value, its cells in the particle representation
receive the lower bound value available in the database.
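The encoding described above can be sketched as follows. The `schema` layout, the function name, and in particular the choice of code 0 for the generic value ’?’ of a discrete attribute are assumptions of this example; the text does not state how ’?’ is encoded for discrete attributes.

```python
EMPTY = "?"

def encode_rule(rule, schema):
    """Flatten a rule (dict: attribute -> value) into a list of floats.

    Discrete attributes take one cell, numerical attributes take two
    cells (lower and upper limit of the interval).
    """
    particle = []
    for name, spec in schema.items():
        value = rule.get(name, EMPTY)
        if spec["kind"] == "discrete":
            # Assumption: 0 stands for '?', 1..k index the known values.
            code = 0 if value == EMPTY else spec["values"].index(value) + 1
            particle.append(float(code))
        else:
            if value == EMPTY:
                # Void numerical attribute: both cells receive the
                # lower bound found in the database.
                lower = upper = spec["min"]
            else:
                lower, upper = value
            particle.extend([float(lower), float(upper)])
    return particle
```

For a hypothetical schema with a discrete attribute `color` (values `red`, `green`) and a numerical attribute `age` in [18, 90], the rule `color = green, age in [20, 35]` becomes `[2.0, 20.0, 35.0]`.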
The rule learning algorithm using MOPSO-N
works as follows. The initialization procedure ran-
domly spreads all particles in the search space and
initializes the components of the particle. In this
process, the discrete attributes are defined by using
a roulette procedure, where the most frequent val-
ues have higher probabilities. The probability of the
generic value of each attribute, ’?’, is a function of the
number of possible values for the attribute. For the
numerical attributes, each attribute first has a
probability, prob_empty, of being empty. If an attribute
is set as non-empty, the lower and upper limits are placed
randomly in the interval defined by the minimum and
maximum values for the attribute (obtained from the
data set). After that, the particle’s components, such as
velocity and local leader, are initialized.
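A minimal sketch of this initialization is given below. The function names are illustrative, and the probability assigned to ’?’ for a discrete attribute, here 1/(k+1) for k possible values, is only a placeholder: the text says it is a function of the number of values but does not give the function.

```python
import random

def roulette(weights, rng):
    """Return a key of `weights` with probability proportional to its weight."""
    total = sum(weights.values())
    pick = rng.random() * total
    cumulative = 0.0
    for key, weight in weights.items():
        cumulative += weight
        if pick < cumulative:
            return key
    return key  # guard against floating-point round-off

def init_discrete(value_counts, rng):
    """Pick a value for a discrete attribute; frequent values are likelier."""
    k = len(value_counts)
    total = sum(value_counts.values())
    p_generic = 1.0 / (k + 1)  # assumption: placeholder for the unspecified function
    weights = {"?": p_generic}
    for value, count in value_counts.items():
        weights[value] = (1.0 - p_generic) * count / total
    return roulette(weights, rng)

def init_numeric(minimum, maximum, prob_empty, rng):
    """Return '?' with probability prob_empty, else a random (lower, upper)."""
    if rng.random() < prob_empty:
        return "?"
    a = rng.uniform(minimum, maximum)
    b = rng.uniform(minimum, maximum)
    return (min(a, b), max(a, b))
```

A call such as `init_numeric(0.0, 10.0, 0.3, random.Random(42))` returns either `"?"` or an ordered pair inside [0, 10]. Passing the generator explicitly keeps runs reproducible.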
Once the initial configuration is complete, the
evolutionary loop is executed, moving all particles
through the search space. In this work the stop criterion
is the maximum number of generations. Each iteration
begins by performing the operations discussed in the
previous section.
In the position update, a mod operator is applied.
This operation keeps the particle within the search
space and gives each attribute value an equal
probability of selection. For discrete attributes, the
values are bounded by the number of values of each
attribute. For numerical attributes, the mod operator
uses the maximum and minimum values of the attribute.
If the new upper limit overflows the maximum value, the
excess is added to the minimum value, and the result is
the new limit. After this process, the smaller value
becomes the new lower limit and the larger the upper
limit. If both values overflow the limits, the attribute
is set to empty (’?’).
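The numerical case can be sketched as below. The text only describes an upper limit exceeding the maximum; treating a value that falls below the minimum symmetrically (wrapping it in from the top) is an assumption of this example, as are the function names.

```python
def wrap(value, minimum, maximum):
    """Wrap an out-of-range value back into [minimum, maximum]."""
    if minimum <= value <= maximum:
        return value
    span = maximum - minimum
    if span == 0:
        return minimum
    # An overflow of e above the maximum becomes minimum + e.
    # Assumption: an underflow of e below the minimum becomes maximum - e.
    return minimum + (value - minimum) % span

def wrap_interval(lower, upper, minimum, maximum):
    """Apply the mod operator to one numerical attribute.

    Returns '?' when both limits are out of range, otherwise the
    wrapped limits ordered as (lower, upper).
    """
    lower_out = not (minimum <= lower <= maximum)
    upper_out = not (minimum <= upper <= maximum)
    if lower_out and upper_out:
        return "?"
    a = wrap(lower, minimum, maximum)
    b = wrap(upper, minimum, maximum)
    return (min(a, b), max(a, b))
```

For an attribute ranging over [0, 10], the interval (4, 12) has an excess of 2, so the upper limit wraps to 2 and the reordered interval is (2, 4).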
After all particles have been moved through the
search space, they are evaluated using the objectives,
and once again the best particles are loaded into the
repository and the global leaders are redefined. Before
a rule is added to the repository, a procedure checks it
for more general or more specific rules. A rule is more
general than another rule if it has fewer attribute
constraints and the same contingency table. Only the
more general rules are kept in the archive.
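The generality check can be sketched as follows, reading the rule with fewer attribute constraints as the more general one. Two assumptions are made for the example: "fewer constraints" is read as one rule's constraints being a strict subset of the other's, and a rule is a dictionary with hypothetical keys `constraints` and `table`. The non-dominance test of the repository is left out.

```python
def is_more_general(a, b):
    """True if rule a generalizes rule b.

    Assumption: a's constraints must be a strict subset of b's,
    and both rules must have the same contingency table.
    """
    same_table = a["table"] == b["table"]
    a_items = set(a["constraints"].items())
    b_items = set(b["constraints"].items())
    return same_table and a_items < b_items

def add_to_repository(repository, candidate):
    """Return the repository after trying to add `candidate`."""
    # Reject the candidate if an archived rule already generalizes it.
    if any(is_more_general(rule, candidate) for rule in repository):
        return repository
    # Otherwise drop archived rules that the candidate generalizes.
    kept = [rule for rule in repository if not is_more_general(candidate, rule)]
    kept.append(candidate)
    return kept
```

Constraint values must be hashable, so intervals are stored as tuples such as `(20, 35)`.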
At the end of the execution, the rules are usually
aggregated into a rule set to build a classifier. The
classifier can then be used to classify unseen instances.
The classification of a new instance is performed by
a voting process in which all rules vote for the class
of the instance. The process contains the following
steps: for each class, all rules that cover the input
instance are identified. The identified rules are sorted
according to some ordering criterion. This work uses the
Laplace Accuracy, discussed in (Yanbo J. Wang and Coenen,
2006). The ordering process allows only the best k rules,
according to the ordering criterion, to vote for the class of
ICEIS 2010 - 12th International Conference on Enterprise Information Systems