A New Physarum Learner for Network Structure Learning
from Biomedical Data
T. Schön, M. Stetter, A. M. Tomé and E. W. Lang
CIML Group, Biophysics, University of Regensburg, Regensburg, Germany
University of Applied Science Weihenstephan-Triesdorf, Freising, Germany
IEETA, DEETI, University of Aveiro, Aveiro, Portugal
Keywords: Bayesian Network, Structure Learning, Physarum Solver, LAGD Hill Climber.
A novel structure learning algorithm for Bayesian Networks based on a Physarum Learner is presented. The length of the connections within an initially fully connected Physarum-Maze is taken as the inverse Pearson correlation coefficient between the connected nodes. The Physarum Learner then estimates the shortest indirect paths between each pair of nodes. In each iteration, a score of the surviving edges is incremented. Finally, the highest scored connections are combined to form a Bayesian Network. The novel Physarum Learner method is evaluated with different configurations and compared to the LAGD Hill Climber, showing comparable performance with respect to quality of training results and increased time efficiency for large data sets.
In 2000, Nakagaki et al. (Nakagaki et al., 2000) showed that the slime mould Physarum polycephalum can find the shortest path between two points in a maze. In the following years, research efforts focused on examining the detailed strategy used by Physarum to understand the adaptive dynamics of its transport network (Nakagaki et al., 2001; Nakagaki, 2001; Nakagaki et al., 2004). Tero et al. (Tero et al., 2006; Tero et al., 2007) proposed a physical model for the underlying transport network based on hydrodynamics and showed that their model (Physarum Solver) can solve the shortest path problem of the maze introduced by Nakagaki (Nakagaki et al., 2000) in a manner similar to the biological slime mould. A detailed mathematical analysis and studies of the convergence properties of the Physarum Solver are discussed in (Tomoyuki and Isamu, 2008; Ito et al., 2011; Brummitt et al., 2010). Apart from path finding problems, new applications for the Physarum Solver are now being investigated. In this paper, we present a novel approach where the strategy used by the Physarum Solver is adapted to a well-known NP-hard problem, namely learning Bayesian Network structure from data.
Three real binary data sets are used for evaluations: Asia (8 nodes, 8 arcs, average in-degree of 2) (Lauritzen and Spiegelhalter, 1988), Cancer (Korb and Nicholson, 2010) and Earthquake (5 nodes, 4 arcs, average in-degree of 1.6, each) (Korb and Nicholson, 2010). Furthermore, a set of 15 binary networks with different characteristics has been created randomly using the Bayesian Network sampler implemented in WEKA. The networks are denoted as nXpYaZ, where X is replaced by the number of nodes, Y by the number of allowed parents and Z by the number of arcs. The network configurations can be seen in Table 1. For each network, WEKA was used to randomly sample a data set with 1000 instances.
Tero et al. (Tero et al., 2007) introduced a graphical model for the maze created by Nakagaki (Nakagaki et al., 2000). The maze is mapped to a graph in which a node $N_i$ is inserted at each branch of the maze. If the maze contains a direct way from node $N_i$ to $N_j$, a connection between the nodes is added. We refer to this kind of graph as a Physarum-Maze.
Schön T., Stetter M., Tomé A. M. and Lang E. W.
A New Physarum Learner for Network Structure Learning from Biomedical Data.
DOI: 10.5220/0004227401510156
In Proceedings of the International Conference on Bio-inspired Systems and Signal Processing (BIOSIGNALS-2013), pages 151-156
ISBN: 978-989-8565-36-5
2013 SCITEPRESS (Science and Technology Publications, Lda.)
For a wide-sense stationary system, the mass flux $Q_{ij}$ through the tube segment $M_{ij}$ between two nodes $N_i$ and $N_j$ follows Fick's first law
$$Q_{ij} = \frac{D_{ij}}{L_{ij}}\,(p_i - p_j) = P_{ij}\,(p_i - p_j) \qquad (1)$$
where $D_{ij}$ is the conductivity, $L_{ij}$ is the length of the edge $M_{ij}$ and $P_{ij}$ its permeability. Finally, the pressures at nodes $N_i$ and $N_j$, representing free energy densities, are denoted by $p_i$ and $p_j$. By considering mass conservation following Kirchhoff's first law, the Poisson equation for the pressures results. The flux through each edge is controlled by a time-dependent conductivity $D_{ij}(t)$ following a Master equation which models a balance between a positive feedback term, causing conductivity to increase with increasing flux, and a relaxation term which occasionally causes edges to vanish from the graph.
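The interplay of Eq. (1), Kirchhoff's law and the conductivity dynamics can be illustrated with a minimal numerical sketch. It assumes the adaptation rule $dD_{ij}/dt = |Q_{ij}|^{\mu} - D_{ij}$ in the style of Tero et al. (2007), a unit flux injected at the source and removed at the sink, and explicit Euler steps. The function name `physarum_solver`, the step size `dt` and the number of steps are illustrative choices, not values from this paper.

```python
import numpy as np

def physarum_solver(length, source, sink, mu=1.0, steps=300, dt=0.1, seed=0):
    """Euler-integrate dD/dt = |Q|**mu - D on an undirected graph.

    length[i, j] is the edge length L_ij; np.inf marks a missing edge.
    Returns the symmetric conductivity matrix D after `steps` updates.
    """
    rng = np.random.default_rng(seed)
    n = length.shape[0]
    edge = np.isfinite(length) & ~np.eye(n, dtype=bool)
    safe_len = np.where(edge, length, 1.0)
    # Random symmetric start values in [0.5, 1.0] on existing edges.
    upper = np.triu(rng.uniform(0.5, 1.0, size=(n, n)), 1)
    D = np.where(edge, upper + upper.T, 0.0)
    rhs = np.zeros(n)
    rhs[source], rhs[sink] = 1.0, -1.0          # unit flux in and out (assumption)
    free = [k for k in range(n) if k != sink]   # ground the sink: p_sink = 0
    for _ in range(steps):
        P = D / safe_len                        # permeability P_ij = D_ij / L_ij
        lap = np.diag(P.sum(axis=1)) - P        # Kirchhoff balance at every node
        p = np.zeros(n)
        p[free] = np.linalg.solve(lap[np.ix_(free, free)], rhs[free])
        Q = P * (p[:, None] - p[None, :])       # Eq. (1) on every edge
        D = D + dt * (np.abs(Q) ** mu - D)      # feedback minus relaxation
    return D

INF = np.inf
# Triangle: direct edge 0-2 (length 3) competes with the detour 0-1-2 (length 2).
length = np.array([[INF, 1.0, 3.0],
                   [1.0, INF, 1.0],
                   [3.0, 1.0, INF]])
D = physarum_solver(length, source=0, sink=2)
```

In this toy maze the conductivity of the longer direct edge decays while the two edges of the shorter detour approach 1, which is the "vanishing edge" behaviour described above.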
3.1 Building a Physarum-Maze from Sampling Data
A Physarum-Maze is generated by adding a node to the network for each attribute in the data set and adding an undirected connection between each pair of nodes, so that the maze is fully connected. The conductivity $D_{ij}$ of each connection $M_{ij}$ is initialized randomly in the range $D_{min} \le D_{ij} \le D_{max}$. Estimating the length $L_{ij}$ of the connection between two nodes is the key task of learning a Bayesian Network structure with the Physarum Solver. Here we propose that the length of the connection represents the level of independence of two adjacent nodes in the network. The nodes are taken to be nominal or ordinal, and the connections in the Physarum Solver are considered undirected. Node independence is approximated by a correlation coefficient $\rho_{ij}$ between the connected nodes. Therefore, a symmetric correlation coefficient for nominal data is needed. As the main goal of this study is to examine whether it is generally possible to learn Bayesian Network structure with the Physarum Solver, we restricted the benchmark data sets to binary data for the sake of simplicity. This restriction allows the use of the simple normalized correlation coefficient $\phi$ given by (Davenport and El-Sanhurry, 1991)
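$$\phi = \frac{AD - BC}{\sqrt{(A + B)(C + D)(A + C)(B + D)}} \qquad (2)$$
where the normalized value $-1 \le \phi_{norm} \le 1$ is calculated with $\phi_{norm} = \phi / \phi_{max}$ and
$$\phi_{max} = \begin{cases} \sqrt{\dfrac{(A + C)(C + D)}{(A + B)(B + D)}}, & \text{if } B > C \\[2ex] \sqrt{\dfrac{(A + B)(B + D)}{(A + C)(C + D)}}, & \text{else} \end{cases} \qquad (3)$$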
The coefficient $\phi$ uses a two-dimensional contingency table of the binary variables X and Y, see Table 1.

Table 1: Contingency table.
X \ Y    y1     y0     Total
x1       A      B      A+B
x0       C      D      C+D
Total    A+C    B+D    N
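As a worked illustration of the contingency-table computation, the following sketch evaluates $\phi$ and a normalized value from the four cell counts. It assumes the standard phi coefficient and the Phi/Phimax normalization associated with the cited reference, with the case split on $B > C$; the function name `phi_coefficients` and the handling of a constant variable are hypothetical choices. Note that under this assumption strongly negative associations can yield a ratio below $-1$, so the value may need clipping before further use.

```python
import math

def phi_coefficients(a, b, c, d):
    """Return (phi, phi_norm) for the 2x2 table [[A, B], [C, D]]."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        # Assumption: a constant variable is treated as uncorrelated.
        return 0.0, 0.0
    phi = (a * d - b * c) / denom
    # phi_max is the largest phi reachable with the given marginal totals.
    if b > c:
        phi_max = math.sqrt((a + c) * (c + d) / ((a + b) * (b + d)))
    else:
        phi_max = math.sqrt((a + b) * (b + d) / ((a + c) * (c + d)))
    return phi, phi / phi_max

# Every x1 case is also a y1 case, but the marginals differ (40 vs. 60 of 100).
phi, phi_norm = phi_coefficients(40, 0, 20, 40)
```

Here the raw coefficient is limited to 2/3 by the unequal marginals, whereas the normalized value reaches 1, which is the motivation for the normalization.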
The normalized correlation coefficient is interpreted in a similar way as the Pearson correlation coefficient. For each connection, the length $L_{ij}$ between nodes $N_i$ and $N_j$ is given by
$$L_{ij} = \left(10\,(1 - |\phi_{norm,(i,j)}| + l)\right)^{\gamma} \qquad (4)$$
The control parameter $\gamma$ assures growth of the length $L_{ij}$ of the segment if $\gamma > 1.0$. The correlation coefficient $\phi$ is subtracted from 1 to yield short segment lengths between highly correlated nodes. A constant $l \ge 0.1$ is added to avoid connections with vanishing length $L_{ij} \to 0$ if $\phi \to 1.0$.
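The length mapping of Eq. (4) can be sketched in a few lines, reading $\gamma$ as an exponent applied to the bracketed term. The function name `connection_length` and the clipping of $|\phi_{norm}|$ to 1 are assumptions added here as a guard, not part of the method description.

```python
def connection_length(phi_norm, gamma=2.0, l=0.1):
    """Length of a maze connection from the normalized correlation, Eq. (4)."""
    # Guard (assumption): keep the base positive if |phi_norm| exceeds 1.
    strength = min(abs(phi_norm), 1.0)
    return (10.0 * (1.0 - strength + l)) ** gamma

short = connection_length(1.0)   # perfectly correlated nodes
long_ = connection_length(0.0)   # uncorrelated nodes
```

With $l = 0.1$ the base never drops below 1, so raising it to a power $\gamma > 1$ stretches the lengths of weakly correlated pairs while leaving the shortest possible length at 1.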
3.2 The Physarum Learner
Two additional nodes, denoted as source and sink node, are added to the Physarum-Maze; initially, they are not connected to any node in the maze. Furthermore, each Physarum connection $M_{ij}$ is complemented by a score $s_{ij}$ that is initialized with $s_{ij} = 0 \;\forall\, M_{ij} \in M$. The Physarum Learner iterates over all possible connections between two nodes in the maze. For each pair of nodes $N_i$, $N_j$, the direct connection $M_{ij}$ between $N_i$ and $N_j$ is removed. The source node $N_{source}$ is connected to $N_i$ while $N_j$ is connected to the sink node $N_{sink}$, with lengths $L_{source} = L_{sink} = L_0$. The value of the constant $L_0$ can be chosen arbitrarily, because there is only one connection to the source and one connection to the sink node; both therefore survive in any case, independent of their length. Then the Physarum Solver (Tero et al., 2008) is applied to the maze. The score of each connection that remains in the network after one epoch of the Physarum Solver is increased by 1. Next, the previously removed connection between $N_i$ and $N_j$ is reinserted. The source and sink nodes are disconnected from $N_i$ and $N_j$ and re-assigned to other, randomly chosen nodes. Finally, the conductivities $D_{ij}$ are reset before the next epoch is started. After all iterations have finished, each node has been connected to the source and to the sink node exactly $N - 1$ times. The result of the Physarum Learner is an undirected and fully connected graph in which each connection has a score $s_{ij} \ge 0$. A score $s_{ij} = 0$ indicates that the connection $M_{ij}$ has never survived in any of the Physarum Learner epochs. As the length of a connection decreases with increasing correlation between the connected nodes, a connection between highly correlated nodes is expected to survive more often than a connection between uncorrelated nodes. In other words, the Physarum Learner forces the Physarum Solver to find indirect paths that explain the correlation between two nodes by blocking the direct path between these two nodes.
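The scoring loop can be sketched as follows. As a stand-in for the Physarum Solver, the sketch uses Dijkstra's algorithm, i.e. it assumes the idealized case in which each epoch converges to the single shortest indirect path. It also visits each unordered node pair once, which is a simplification. The names `shortest_path` and `physarum_learner_scores` and the example lengths are hypothetical.

```python
import heapq
from itertools import combinations

def shortest_path(length, src, dst, banned):
    """Edges of the shortest src-dst path that avoids the banned direct edge."""
    n = len(length)
    dist = [float("inf")] * n
    prev = [None] * n
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist[u]:
            continue                      # stale queue entry
        for v in range(n):
            if v == u or {u, v} == banned:
                continue
            nd = d + length[u][v]
            if nd < dist[v]:
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path, node = [], dst
    while prev[node] is not None:         # walk back from sink side to source side
        path.append(frozenset((prev[node], node)))
        node = prev[node]
    return path

def physarum_learner_scores(length):
    """Count how often each connection lies on a surviving indirect path."""
    n = len(length)
    scores = {frozenset(e): 0 for e in combinations(range(n), 2)}
    for i, j in combinations(range(n), 2):
        for edge in shortest_path(length, i, j, banned={i, j}):
            scores[edge] += 1
    return scores

# Four nodes forming a chain 0-1-2-3 of short (highly correlated) connections.
length = [[0, 1, 10, 20],
          [1, 0, 1, 12],
          [10, 1, 0, 1],
          [20, 12, 1, 0]]
scores = physarum_learner_scores(length)
```

In this example the central chain connection 1-2 collects the highest score, while the long connection 0-3 never lies on a surviving path and keeps the score 0.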
3.3 Generate Bayesian Network from
The basic problem in transforming the scored graph into a Bayesian Network is that the connections established by the Physarum Learner are undirected, whereas a Bayesian Network requires directed edges. Hence, learning a Bayesian Network from data with the Physarum Learner is only possible if a correct ordering of the nodes is known. Otherwise, the Physarum Learner generates an undirected graphical model of the data set, also called a Markov network. Transforming a Markov network into a Bayesian Network can be done by triangulation which, however, often results in a loss of independence information (Koller and Friedman, 2009).
The graphical network is built from the Physarum Learner output in the following way: A list of connections is created and sorted by score, so that the connection with the highest score is the first entry. An empty graphical model is initialized. The connections are then added to the graph in list order, starting with the highest scored connection, until the score falls below a given threshold $\theta$. In this work, $\theta$ is set, for each simulated network, to the arithmetic mean of the scores of all connections with score $\neq 0$. Unconnected nodes, i.e. nodes whose connections all score below the threshold, are added afterwards according to their highest scored connection.
If the ordering is known and a Bayesian Network is to be learned, the connections are already directed from $N_i$ to $N_j$. Therefore, the Physarum Learner only contains correctly directed connections. If the ordering is unknown, however, a connection is only added to the Bayesian Network if it does not cause a directed cycle in the graph. After the structure of the network has been established from the data, the WEKA software package is employed to learn the Bayesian Network parameters.
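The construction of the network from the scored connections can be sketched as below. The sketch assumes that every arc is oriented along a fixed node ordering; if none is supplied, the node index order is used as a stand-in, so directed cycles cannot arise and no explicit cycle test is needed. The function name `build_network` and the example scores are hypothetical.

```python
def build_network(scores, n_nodes, ordering=None):
    """Turn connection scores into a set of directed arcs (parent, child).

    scores maps frozenset({i, j}) to the score of connection M_ij.
    """
    nonzero = [s for s in scores.values() if s != 0]
    theta = sum(nonzero) / len(nonzero) if nonzero else 0.0   # mean non-zero score
    rank = {node: pos for pos, node in enumerate(ordering or range(n_nodes))}
    arcs = set()

    def add(edge):
        parent, child = sorted(edge, key=rank.get)   # orient along the ordering
        arcs.add((parent, child))

    # Highest scored connections first, stopping below the threshold.
    for edge, s in sorted(scores.items(), key=lambda kv: -kv[1]):
        if s > 0 and s >= theta:
            add(edge)
    # Attach nodes that are still unconnected via their best connection.
    connected = {v for arc in arcs for v in arc}
    for node in range(n_nodes):
        if node not in connected:
            add(max((e for e in scores if node in e), key=scores.get))
    return arcs

example_scores = {
    frozenset((0, 1)): 3, frozenset((1, 2)): 5, frozenset((2, 3)): 2,
    frozenset((0, 2)): 2, frozenset((1, 3)): 1, frozenset((0, 3)): 0,
}
arcs = build_network(example_scores, n_nodes=4)
```

For these scores the threshold is 2.6, so only the connections 1-2 and 0-1 pass it directly; node 3 is then attached through its best connection 2-3.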
4.1 Configurations of the Physarum
Structure learning results for 18 different test networks are quantified with the Bayesian score BDeu and the information-theoretic score MDL, which is equivalent to the Bayesian information criterion (BIC) score. Further, we compare the learnt structures with the known original structures (the ones the test data were sampled from) by measuring the number of learnt (A), extra (E), missing (M) and reversed (R) arcs. The conductivities $D_{ij}$ of the connections have been initialized randomly in the range $D_{min} = 0.5 \le D_{ij} \le D_{max} = 1.0$. The constant $l$ in Eq. (4) was fixed at $l = 0.1$.
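The arc counts used for the structural comparison can be computed as in the following sketch. It assumes one particular reading of the four quantities: A is the total number of learnt arcs, a reversed arc counts as matched in the skeleton (neither extra nor missing), and E and M are taken on the undirected skeleton. Under this reading A = E + (number of true arcs − M), which agrees with the counts reported in Table 3. The function name `compare_structures` is hypothetical.

```python
def compare_structures(true_arcs, learnt_arcs):
    """Count learnt (A), extra (E), missing (M) and reversed (R) arcs."""
    true_arcs, learnt_arcs = set(true_arcs), set(learnt_arcs)
    # Learnt arcs whose opposite direction is in the true network.
    reversed_arcs = {(a, b) for (a, b) in learnt_arcs
                     if (b, a) in true_arcs and (a, b) not in true_arcs}
    extra = learnt_arcs - true_arcs - reversed_arcs
    missing = {(a, b) for (a, b) in true_arcs
               if (a, b) not in learnt_arcs and (b, a) not in learnt_arcs}
    return {"A": len(learnt_arcs), "E": len(extra),
            "M": len(missing), "R": len(reversed_arcs)}

counts = compare_structures(
    true_arcs=[(0, 1), (1, 2), (2, 3)],
    learnt_arcs=[(0, 1), (2, 1), (0, 3)],
)
```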
The Physarum Learner has been applied to all 18 networks employing different values for $\gamma$. The following simulations have been performed: each of the 18 networks has been learnt three times, setting $\gamma = 1.0, 2.0, 3.0$. In these simulations, the best scored network has been learnt 3 times with $\gamma = 1.0$, 8 times with $\gamma = 2.0$ and 8 times with $\gamma = 3.0$. The network structure closest to the original one has been learnt 3 times with $\gamma = 1.0$, 9 times with $\gamma = 2.0$ and 8 times with $\gamma = 3.0$. The results corroborate that the value of $\gamma$ clearly influences the quality of the results. A larger exponent $\gamma$ increases the spread of the length values, which results in the selection of paths over a larger number of nodes in the Physarum Solver. Additional results, not presented in this paper, show that for small networks the performance is better for $\gamma = 2.0$, whereas for larger networks (in terms of the number of arcs) $\gamma = 3.0$ leads to better results. Preliminary tests also showed a clear advantage in learning performance if the positive feedback term of the master equation was modelled as $f(Q) = |Q|^{\mu}$.
Next, the parameter $\mu$ was set to $\mu = 0.8, 1.0, 1.2, 1.4, 1.8$ while again learning each of the 18 networks. As the results for $\mu \neq 1.0$ depend on the randomly initialized values of the conductivities $D_{ij}$, the results presented have been averaged over 10 runs. A comparison indicating which value of $\mu$ performed best is given in Table 2. This table displays the number of cases out of $5 \times 18$ trials in which a learned network was best in terms of structure similarity to the true network (col. 2), the BDeu score (col. 3) and the MDL score (col. 4). The most similar structure is identified as the one with the lowest sum of extra and missing arcs.
Table 2: Counts where values of µ performed best.
µ      Structure   BDeu   MDL
0.8    4           5      6
1.0    13          11     10
1.2    9           7      7
1.4    9           5      5
1.8    6           9      9
The results clearly favour $\mu = 1.0$ in all three metrics. Note that the learnt structures for $\mu \neq 1.0$ differ in their performance, and that only average results have been compared.
4.2 Comparison of Physarum Learner and LAGD
Based on the results presented above, the Physarum Learner is compared to the LAGD Hill Climbing learning algorithm with the settings $l = 0.1$, $\gamma = 2.0$ and $f(Q) = |Q|$. Both learning algorithms are evaluated on the 18 benchmark data sets. Again, the number of learnt, extra, missing and reversed arcs is measured relative to the original reference network from which the data were sampled. Note that the Physarum Learner cannot learn reversed arcs, as the attributes in the data sets are already given in the correct order. Furthermore, the BDeu and MDL scores are computed. The results indicate that LAGD Hill Climbing outperforms the Physarum Learner in all test cases with respect to the BDeu and MDL scores. This is not surprising, as LAGD optimizes BDeu and MDL during learning, while the Physarum Learner does not consider any score during learning at all. Especially for the three real networks, however, the Physarum Learner was able to learn network structures with a quality comparable to LAGD. In particular, the Physarum Learner learned the exact true network for the Cancer data set, whereas LAGD achieved better scores but reversed two of the four arcs in the network.
Results also indicate that the Physarum Learner tends to learn fewer arcs for networks with a higher number of arcs. This is because the threshold $\theta$ is simply the average of all non-vanishing connection scores, so the algorithm has no ability to fit the threshold to the requirements of the dataset. A detailed comparison of thresholds is needed but is subject to future work.
Removing the correct ordering from the datasets and learning the networks again under the same configurations yields the results for the three real datasets shown in Table 3. The results show a slight decrease of the scores compared to the ordered measurements due to some reversed edges. As expected, the Physarum Learner selected exactly the same arcs as in the ordered version, because the direction of the arcs is not considered within the learner. Therefore, the number of extra and missing arcs is the same as for the ordered datasets, but some arcs are reversed. When the directions are neglected and no ordering is provided, the result of the Physarum Learner can be seen as a Markov Network instead of a Bayesian Network. To validate the learning performance of the Physarum Learner for Markov Networks, we transformed the three real Bayesian Networks to Markov Networks using moralization (Koller and Friedman, 2009). The results are given in Table 4.

Table 3: Physarum Learner results for reordered real datasets.
Dataset      A   E   M   R   BDeu    MDL
Asia         9   3   2   2   -2475   -2498
Cancer       4   0   0   3   -2244   -2251
Earthquake   5   1   0   2   -542    -558
Table 4: Physarum Learner networks compared to moralized true networks.
Dataset      Mor. Arcs   A   E   M
Asia         10          9   4   3
Cancer       5           4   0   1
Earthquake   5           5   1   1
The column Mor. Arcs displays the number of arcs in the moralized network. For Asia, the learned network is closer to the Bayesian Network version (2 extra and 2 missing arcs) than to the Markov Network (4 extra and 3 missing arcs). The same behaviour can be seen for Cancer, where the Physarum Learner reconstructed the true network skeleton in Table 3 but misses the extra arc created during moralization in Table 4. For Earthquake, one additional arc is learned, but not the correct one.
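Moralization itself is a short procedure: all parents of a common child are connected ("married") and arc directions are dropped. The sketch below illustrates it on the Cancer network, assuming the usual structure of that benchmark (Pollution and Smoker as parents of Cancer, which in turn is the parent of XRay and Dyspnoea); the function name `moralize` is hypothetical.

```python
from itertools import combinations

def moralize(arcs):
    """Return the undirected edges of the moral graph of a DAG."""
    parents = {}
    for parent, child in arcs:
        parents.setdefault(child, set()).add(parent)
    edges = {frozenset(arc) for arc in arcs}          # drop the directions
    for parent_set in parents.values():
        # Marry every pair of parents that share this child.
        edges |= {frozenset(pair) for pair in combinations(sorted(parent_set), 2)}
    return edges

# Assumed structure of the Cancer benchmark network (4 arcs).
cancer = [("Pollution", "Cancer"), ("Smoker", "Cancer"),
          ("Cancer", "XRay"), ("Cancer", "Dyspnoea")]
moral = moralize(cancer)
```

The only edge added here is Pollution-Smoker, giving the 5 edges listed for Cancer in Table 4. This is exactly the edge a purely correlation-based learner is unlikely to find, since the two parents are marginally independent.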
Finally, the computational load for LAGD and the Physarum Learner has been tested for sample sizes of 1,000, 10,000, 100,000, 1,000,000 and 10,000,000. It turned out that, although the complexity of the Physarum Solver grows with the number of nodes, the Physarum Learner is three times more efficient than LAGD. This is because the Physarum Learner needs to process the data set only once, when calculating the correlation coefficients.
Figure 1: NAFLD Net learned by the Physarum Learner.
4.3 Applying the Physarum Learner to an NAFLD Dataset
Finally, the Physarum Learner performance is evaluated on a medical dataset collected at the Medical University of Graz (MUG) from 29 patients with liver-biopsy confirmed non-alcoholic fatty liver disease (NAFLD). Patient data protection was observed throughout the entire process according to the standard operating procedures at the MUG. Parameters present in less than 50% of the patients and parameters with a constant value over all patients have been removed from the dataset. The missing values of the remaining 32 attributes have been replaced by modes and means using the built-in ReplaceMissingValues filter of WEKA (http://www.cs.waikato.ac.nz/ml/weka/). Further, the dataset has been extended to 2900 instances by bootstrapping. As the dataset contains a mixture of real, nominal and ordinal attributes, and the Physarum Learner requires binary attributes, all attributes have been transformed to binaries.
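The bootstrapping step can be illustrated with a minimal sketch that resamples patient records with replacement. It assumes plain row-wise resampling with a uniform distribution; the function name `bootstrap`, the seed and the placeholder data are hypothetical, and the binarization of the attributes is not covered because its rule is not specified here.

```python
import numpy as np

def bootstrap(data, n_out, seed=0):
    """Draw n_out rows from data uniformly at random, with replacement."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, data.shape[0], size=n_out)
    return data[idx]

# Placeholder matrix standing in for 29 patients with 32 attributes each.
patients = np.arange(29 * 32).reshape(29, 32)
resampled = bootstrap(patients, 2900)
```

Resampling does not add new information: every row of the extended dataset is a copy of one of the 29 original records, so the correlation estimates remain based on 29 independent observations.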
The Physarum Learner is applied with configurations identical to the ones described in Section 4.2. The learned network shown in Fig. 1 has been evaluated by medical experts of the MUG. As the Physarum Learner can only learn undirected graphs and is hence insensitive to the directions of edges, all interactions are considered undirected. All 42 connections displayed in Fig. 1 have been classified as comprehensible in a medical context. However, 35 possible connections indicated by general medical knowledge have been identified as missing, for example a connection between Magnesium (Mg) and HDL Cholesterol. From a medical point of view, the NAFLD graph generated by the Physarum Learner correctly reflects current medical knowledge and shows great potential for future applications and further refinement. The behaviour of the Physarum Learner is in line with previous experience showing that it tends to learn the stronger, hence more important, connections of the network.
A novel structure learning algorithm for Bayesian Networks has been introduced that initializes a fully connected network and uses the Physarum Solver to delete edges between nodes which are below a given correlation threshold. The Physarum Learner was compared to the LAGD Hill Climber: it learned competitive network structures for the three real networks, but performed worse than LAGD with respect to the BDeu and MDL scores. It was shown that it is generally possible to learn the structure of a Bayesian Network with the Physarum Solver. The Physarum Learner shows strongly growing execution time for networks with many nodes. This is in part because the algorithm has not yet been optimized for speed and memory usage. However, for networks with a small number of nodes and a large number of data points, the Physarum Learner is clearly more time efficient than LAGD and probably other score-based heuristic methods.
This work was supported by BioPersMed (COMET K-project 825329), which is funded by the Federal Ministry of Transport, Innovation and Technology (BMVIT) and the Federal Ministry of Economics and Labour/the Federal Ministry of Economy, Family and Youth (BMWA/BMWFJ) and the Styrian Business Promotion Agency (SFG). Valuable discussions with E. Wichro, N. Lanner, R. Charchoghlyan and K. Sargsyan are much appreciated.
Brummitt, C., Laureyns, I., Lin, T., Martin, D., Parry, D., Timmers, D., Volfson, A., Yang, T., Yaple, H., and Rossi, M. L. (2010). A Mathematical Study of Physarum polycephalum. The Tero Model.
Davenport, E. C. and El-Sanhurry, N. A. (1991). Phi/Phimax: Review and Synthesis. Educational and Psychological Measurement, 51(4):821–828.
Ito, K., Johansson, A., Nakagaki, T., and Tero, A. (2011). Convergence Properties for the Physarum Solver.
Koller, D. and Friedman, N. (2009). Probabilistic graphical models: principles and techniques. The MIT Press.
Korb, K. and Nicholson, A. (2010). Bayesian Artificial Intelligence. Chapman and Hall, 2nd edition.
Lauritzen, S. L. and Spiegelhalter, D. J. (1988). Local Computation with Probabilities on Graphical Structures and their Application to Expert Systems. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 50(2):157–224.
Nakagaki, T. (2001). Smart behavior of true slime mold in a labyrinth. Research in Microbiology, 152(9):767–770.
Nakagaki, T., Yamada, H., and Hara, M. (2004). Smart network solutions in an amoeboid organism. Biophysical Chemistry, 107(1):1–5.
Nakagaki, T., Yamada, H., and Tóth, A. (2000). Intelligence: Maze-solving by an amoeboid organism. Nature, 407(6803):470.
Nakagaki, T., Yamada, H., and Tóth, A. (2001). Path finding by tube morphogenesis in an amoeboid organism. Biophysical Chemistry, 92(1–2):47–52.
Tero, A., Kobayashi, R., and Nakagaki, T. (2006). Physarum solver: A biologically inspired method of road-network navigation. Physica A, 363(1):115–119.
Tero, A., Kobayashi, R., and Nakagaki, T. (2007). A mathematical model for adaptive transport network in path finding by true slime mold. J. Theo. Biol., 244(4):553–
Tero, A., Yumiki, K., Kobayashi, R., Saigusa, T., and Nakagaki, T. (2008). Flow-network adaptation in Physarum amoebae. Theory in Biosciences, 127(2):89–94.
Tomoyuki, M. and Isamu, O. (2008). Physarum can solve the shortest path problem on Riemannian surface mathematically rigorously. Int. J. Pure and Appl. Math., 47(3):353–369.