(qualitative part), and these relationships are also
modelled with a second, quantitative element that
completes a BN: probability distributions. In our opinion,
there are two main characteristics that have made
BNCs so popular: they provide predictions in terms
of probabilities (that could be interpreted as weights)
and they are easily and intuitively interpreted by non-
experts (thanks to the underlying graph). Empirically,
BNCs have also been successfully applied in many
application areas (Flores et al., 2012), such as Computing,
Robotics, Medicine, Healthcare, Finance, Banking
and Environmental Science.
This paper is organized as follows. In Section 2,
we review previous work related to the current one
and discuss the novelty of our approach. In Section 3,
we present the first part of our experiments, where we
analyse the behaviour of some BNCs on a benchmark
of imbalanced datasets. Section 4 presents the final
experimentation of the paper, from which we conclude
which algorithm to apply; some conclusions drawn from
these results are also given. Finally, Section 5 provides
a general discussion and future research lines.
2 RELATED WORK
In (Lopez et al., 2013), the authors present a very
interesting study in which they identify the intrinsic
characteristics that matter when applying supervised
classification models to imbalanced datasets. In that
work the authors pre-select a set of 66 datasets (a subset
of those in Table 1). The work proves that the Imbalance
Ratio (IR) is, of course, a very important factor when
working with imbalanced datasets; however, the
performance of classifiers cannot be expressed as a
simple (linear) function of this measure. They show
that other aspects also have an influence, such as the
presence of small disjuncts, the lack of density in the
training data, the overlapping between classes, the
identification of noisy data, the significance of the
borderline instances, and the dataset shift between the
training and the test distributions. The problem with
these other measures is that obtaining them is not an
easy task and, even when approximate values can be
computed, they entail a high computational cost. Besides,
they depend on the particular problem to solve, while
we are interested in developing general techniques
applicable to any dataset. For this reason, we first study
the behaviour with respect to the IR value, in
combination with other graphical tools and plots.
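As a minimal illustration of the IR measure discussed above (a hedged sketch with a hypothetical helper name, not code from the paper), the Imbalance Ratio of a labelled dataset is simply the majority-class count divided by the minority-class count:

```python
from collections import Counter

def imbalance_ratio(labels):
    """Imbalance Ratio (IR): size of the majority class
    divided by the size of the minority class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())

# Example: a binary dataset with 90 majority and 10 minority instances
labels = ["neg"] * 90 + ["pos"] * 10
print(imbalance_ratio(labels))  # -> 9.0
```

A perfectly balanced dataset yields IR = 1; the larger the value, the stronger the imbalance.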
The novelty of the current work is that it focuses
uniquely on the behaviour of BNCs, since most works
on imbalanced datasets are devoted to other kinds of
models. For example, (Lopez et al., 2013) uses Decision
trees (C4.5), Support vector machines (SVMs) and the
k-Nearest Neighbours (kNN) model, which belongs to
the family of instance-based learning. On the other hand,
(Sun et al., 2007) applies two kinds of systems: again
the C4.5 decision tree, and an associative classification
system called HPWR (High-order Pattern and
Weight-of-evidence Rule based classifier). We consider
that the study of the behaviour of BNCs with imbalanced
data is an open research line, and performing a study
of how to approach this problem is the main aim of
this paper.
Another related work where BNCs are used is
(Wasikowski and wen Chen, 2010), but it is not
applicable to the available datasets: the number of
attributes in our problems is too low, and the number
of instances is comparatively small, so it does not
make sense to apply feature subset selection. Moreover,
that work uses datasets which are not originally
imbalanced, whereas we want to apply BNCs to
real imbalanced datasets.
3 ANALYTICAL EXPERIMENTS
Here we present the basis for our study and an initial
experimental set-up, and analyse its results.
3.1 Selected BNCs
The classification task consists of assigning one category
$c_i$, i.e., a value of the class variable
$C = \{c_1, \ldots, c_k\}$, to a new object $\vec{e}$,
which is defined by the assignment of a set of values,
$\vec{e} = \{a_1, a_2, \cdots, a_n\}$, to the attributes
$A_1, \ldots, A_n$
. In the probabilistic case, this
task can be accomplished in an exact way by the ap-
plication of the Bayes theorem (Equation 1). How-
ever, since working with joint probability distribu-
tions is usually unmanageable, simpler models based
on factorizations are normally used for this problem.
In this work we apply four computationally efficient
paradigms: Naive Bayes (NB), KDB, TAN and AODE.
Our experiments will also show that imbalanced
problems do not benefit from more complex BNCs.
$$p(c|\vec{e}) = \frac{p(c)\,p(\vec{e}|c)}{p(\vec{e})}. \qquad (1)$$
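To make the factorization idea concrete, the following is a minimal sketch of a Naive Bayes classifier applying Equation 1 under the NB independence assumption (hypothetical helper names `train_nb`/`predict_nb`; discrete attributes and Laplace smoothing are assumed, and this is not the implementation used in the paper):

```python
import math
from collections import Counter, defaultdict

def train_nb(X, y):
    """Estimate class priors p(c) and per-attribute counts for p(a_i|c)."""
    n = len(y)
    priors = {c: k / n for c, k in Counter(y).items()}
    cond = defaultdict(Counter)   # (attr_index, class) -> value counts
    values = defaultdict(set)     # attr_index -> set of observed values
    for xi, c in zip(X, y):
        for i, v in enumerate(xi):
            cond[(i, c)][v] += 1
            values[i].add(v)
    return priors, cond, values

def predict_nb(e, priors, cond, values):
    """Return arg max_c  log p(c) + sum_i log p(a_i|c),
    i.e. Bayes' theorem with p(~e|c) factorized attribute-by-attribute
    (the denominator p(~e) is constant and can be ignored)."""
    best, best_lp = None, -math.inf
    for c, p in priors.items():
        lp = math.log(p)
        for i, v in enumerate(e):
            total = sum(cond[(i, c)].values())
            # Laplace smoothing to avoid zero probabilities
            lp += math.log((cond[(i, c)][v] + 1) / (total + len(values[i])))
        if lp > best_lp:
            best, best_lp = c, lp
    return best
```

For instance, trained on four labelled objects, `predict_nb(("a", "x"), *train_nb(X, y))` returns the class whose prior times factorized likelihood is largest.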
These models, like any BN, are represented by a
Directed Acyclic Graph, whose nodes represent variables
and whose edges (or their absence) encode the
relationships among them, which can be read by means
of the d-separation concept (Korb and Nicholson, 2010).
For example, from the structure of NB (Figure 1.(a)), when