node X_i, and N_ijk is the number of occurrences in D of X_i with state k and parent configuration j.
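The counts N_ijk above can be extracted directly from the dataset. A minimal sketch, assuming the dataset is a list of records mapping each variable name to its observed state (the function name and record format are illustrative, not from the paper):

```python
from collections import Counter

def count_nijk(data, node, parents):
    """Count N_ijk: occurrences in the dataset of `node` taking state k
    while its parents take configuration j.

    data: list of dicts, one per sample, variable name -> state.
    Returns a Counter keyed by (parent configuration j, node state k).
    """
    counts = Counter()
    for record in data:
        j = tuple(record[p] for p in parents)  # parent configuration
        k = record[node]                       # node state
        counts[(j, k)] += 1
    return counts
```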
2.4 The PC algorithm
This algorithm is based on a constraint-satisfaction approach (Spirtes, 2001). The PC procedure consists of an initialization phase, in which a fully connected graph over the domain X is set up, and an iterative phase that searches for the independence relations implicit in the samples. At each iteration, for every pair of adjacent nodes X and Y, we consider the set C(X,Y) of nodes adjacent to X, excluding Y, whose cardinality is greater than or equal to the current value of n. For every subset S of C with cardinality n, the algorithm carries out an order-n statistical test to determine whether X and Y are d-separated given S. If they are, the arc X-Y is removed and the next pair is examined with the same procedure. After all possible subsets S of C have been examined, the value of n is increased and the procedure is repeated as long as C has cardinality greater than or equal to n. To orient the arcs, the algorithm applies rules based on conditional independence.
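The skeleton-building loop described above can be sketched as follows. Here `ci_test` is a placeholder for whatever order-n statistical test is used to decide d-separation (e.g. a chi-squared test on the data), and `max_n` bounds the conditioning-set size; both are assumptions of this sketch, not details fixed by the paper:

```python
from itertools import combinations

def pc_skeleton(nodes, ci_test, max_n=3):
    """Skeleton phase of the PC algorithm (illustrative sketch).

    ci_test(x, y, s) should return True when x and y are judged
    conditionally independent given the set s.
    """
    # Initialization: fully connected undirected graph over the domain.
    adj = {x: set(nodes) - {x} for x in nodes}
    for n in range(max_n + 1):               # order of the statistical test
        for x in nodes:
            for y in list(adj[x]):
                c = adj[x] - {y}             # C(X, Y): neighbours of X without Y
                if len(c) < n:
                    continue
                # Try every conditioning set S of cardinality n.
                for s in combinations(sorted(c), n):
                    if ci_test(x, y, set(s)):  # X, Y d-separated given S?
                        adj[x].discard(y)      # remove the arc X-Y
                        adj[y].discard(x)
                        break
    return adj
```

The arc-orientation step that follows the skeleton phase is omitted here.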
2.5 TPDA Algorithm
The TPDA algorithm is a dependency-based algorithm. It divides the learning process into three phases: drafting, thickening and thinning. The drafting phase produces an initial set of relations through tests on the cross-entropy value between the variables of the domain. After this phase we obtain a graph in which at most one path exists between any two nodes. The second phase, "thickening", adds arcs to this singly connected graph whenever it is not possible to d-separate the two nodes. The resulting graph contains all the arcs of the true model plus some extra links; these false arcs are produced by errors in the tests. The third phase, "thinning", examines every arc and removes it if the two nodes are conditionally independent. At the end of this phase the algorithm orients the arcs with an approach similar to that of the PC algorithm.
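The drafting phase can be illustrated with empirical mutual information (a form of cross entropy between variables). This is only a sketch under assumed details: the threshold value and the use of a union-find structure to keep the draft singly connected are illustrative choices, not the exact TPDA procedure:

```python
import math
from collections import Counter

def mutual_information(data, x, y):
    """Empirical mutual information between variables x and y.
    data: list of dicts, one per sample, variable name -> state."""
    n = len(data)
    pxy = Counter((r[x], r[y]) for r in data)
    px = Counter(r[x] for r in data)
    py = Counter(r[y] for r in data)
    return sum((c / n) * math.log((c * n) / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def drafting(data, variables, threshold=0.01):
    """TPDA drafting phase (sketch): link strongly dependent pairs,
    but only when no path already joins them, so the draft graph
    stays singly connected."""
    pairs = sorted(((mutual_information(data, x, y), x, y)
                    for i, x in enumerate(variables)
                    for y in variables[i + 1:]), reverse=True)
    edges = []
    parent = {v: v for v in variables}

    def find(v):                      # union-find: detect an existing path
        while parent[v] != v:
            v = parent[v]
        return v

    for mi, x, y in pairs:
        if mi > threshold and find(x) != find(y):
            edges.append((x, y))      # add arc; no prior path between x, y
            parent[find(x)] = find(y)
    return edges
```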
3 EXPERIMENTAL RESULTS
The main idea of this paper is to compare some of the most important structural learning algorithms. We have implemented all the algorithms previously described and tested them on seven Bayesian networks and their corresponding datasets. A brief description of the networks and datasets is given in the next paragraph.
3.1 Test Networks description
We selected seven networks and their related datasets in order to test the algorithms previously described. Table 1 gives a brief description of all the selected networks and related datasets.
Table 1: Analysed Networks and Datasets

Network Name                Nodes   Arcs   Dataset Samples
Alarm (Pearl, 1991)           37     46     10,000
Angina (Cooper, 1992)          5      5     10,000
Asia (Glymour, 1987)           8      8      5,000
College (Singh, 1995)          5      6     10,000
Led (Fung, 1990)               8      8      5,000
Pregnancy (Buntime, 1996)      4      3     10,000
Sprinkler (Suzuki, 1999)       5      5        400
We ran the previously described algorithms on these networks. We tested each algorithm using two different orderings of the network nodes: ordered (the correct ordering, from parents to children) and inverse. We chose two different orderings in order to test the performance of the algorithms in both cases. We have defined two indexes:
Topological Learning = Σ Correct Arcs / (Σ Correct Arcs + Σ Missing Arcs + Σ Added Arcs)

Global Learning = Σ Correctly Oriented Arcs / (Σ Correctly Oriented Arcs + Σ Wrongly Oriented Arcs + Σ Added Arcs + Σ Missing Arcs)
The first index measures the ability of the algorithm to learn the correct topology of the net. The second index, instead, measures the ability of the algorithm to learn the correct network (both topology and arc orientation). In figures 1 and 2 we depict the average index values obtained by every algorithm in the learning of the various networks: figure 1 shows the results for ordered nodes and figure 2 the results for inverse nodes.
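For a single network, the two indexes can be computed by comparing the learned arc set against the true model. A minimal sketch, assuming arcs are represented as (parent, child) pairs (the function name and representation are illustrative):

```python
def learning_indexes(true_arcs, learned_arcs):
    """Compute the topological and global learning indexes (sketch).

    An arc counts as topologically correct when it appears in the
    true model in either orientation, and as correctly oriented
    when parent and child also match.
    """
    true_undirected = {frozenset(a) for a in true_arcs}
    correct = sum(1 for a in learned_arcs if frozenset(a) in true_undirected)
    oriented = sum(1 for a in learned_arcs if a in true_arcs)
    wrongly = correct - oriented            # right link, wrong direction
    added = len(learned_arcs) - correct     # arcs absent from the true model
    missing = len(true_arcs) - correct      # true arcs never recovered
    topological = correct / (correct + missing + added)
    global_ = oriented / (oriented + wrongly + added + missing)
    return topological, global_
```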
Algorithms based on a scoring-function maximization approach obtain the best results in the case of an ordered starting structure. In particular, the K2 algorithm has the best performance: 88% for the topological index and 88% for the global index. The constraint-based algorithms have the worst results and show an important difference between the two indexes (18% for PC and 24% for TPDA). So we can say that these algorithms, even when they are able to identify the topology of the network, often
BAYESIAN NETWORK STRUCTURAL LEARNING FROM DATA: AN ALGORITHMS COMPARISON