find a great diversity of trees across the runs. The lat-
ter suggests inconsistent behavior when ants are caus-
ing edges to be created, possibly due to differently
scored trees being visited. Despite this, the difference
in log-likelihood is small and the nodes where the de-
gree is largest are those that score higher. Therefore,
while the breadth of tree diversity acquired differs,
and while different trees are visited between replicate
runs, the ability of the algorithm to sample similarly
scoring trees and regions does not appear to change.
Note also that this inference is less relevant for the
synthetic dataset.
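The relationship between degree and score described above can be checked with a short sketch. It assumes the visited landscape is stored as a list of undirected edges between tree identifiers and that each tree has a log-likelihood; the function names, identifiers, and values below are hypothetical and are not taken from our runs.

```python
from collections import Counter


def node_degrees(edges):
    """Degree of each tree in an undirected graph given as (u, v) pairs.

    Duplicate edges (in either direction) and self-loops are ignored.
    """
    unique = {frozenset(e) for e in edges if e[0] != e[1]}
    degree = Counter()
    for pair in unique:
        for node in pair:
            degree[node] += 1
    return degree


def top_by_degree(edges, scores, k=3):
    """Return the k highest-degree trees as (tree, degree, log-likelihood)."""
    degree = node_degrees(edges)
    ranked = sorted(degree, key=lambda t: (-degree[t], t))
    return [(t, degree[t], scores[t]) for t in ranked[:k]]


# Hypothetical toy landscape: four trees and the moves made between them.
edges = [("t1", "t2"), ("t2", "t1"), ("t1", "t3"), ("t1", "t4"), ("t3", "t4")]
scores = {"t1": -1000.0, "t2": -1012.5, "t3": -1003.2, "t4": -1004.8}
print(top_by_degree(edges, scores, k=2))
```

If the highest-degree trees returned by such a check are also among the best-scoring ones, this is consistent with the observation that well-scoring trees are visited more often.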
Because the synthetic dataset has a smaller search
space, the algorithm appears to sample the trees in it
thoroughly. The algorithm explored a proportion of the
possible trees on the order of $10^{-15}$% for the
empirical landscape and on the order of 1% for the
synthetic dataset. As the number of trees in the space
grows as O(n!), this shows that although the algorithm
is not searching a very large proportion of all possible
trees, it samples enough of them to capture the shape
of the landscape.
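To put these percentages in perspective, the following minimal sketch computes the size of a tree space and the share of it covered by a given number of sampled trees. It assumes unrooted, fully bifurcating trees, for which the number of topologies on n labelled taxa is (2n-5)!!; the taxon count and sample size in the usage line are hypothetical placeholders, not the values of our datasets.

```python
from fractions import Fraction


def num_unrooted_binary_trees(n_taxa: int) -> int:
    """Number of unrooted, fully bifurcating trees on n labelled taxa: (2n-5)!!."""
    count = 1
    # Multiply the odd numbers 3 * 5 * ... * (2n - 5); empty product for n <= 3.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


def percent_explored(n_sampled: int, n_taxa: int) -> float:
    """Percentage of the tree space covered by n_sampled distinct trees."""
    total = num_unrooted_binary_trees(n_taxa)
    # Exact rational arithmetic avoids overflow for very large tree spaces.
    return float(Fraction(100 * n_sampled, total))


# Hypothetical example: 20,000 distinct trees sampled for 10 taxa.
print(num_unrooted_binary_trees(10))
print(percent_explored(20_000, 10))
```

Because the count grows super-exponentially with the number of taxa, even a very large sample covers a vanishing share of the space once n reaches a few dozen taxa, which is why the empirical percentage is so much smaller than the synthetic one.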
When choosing different starting topologies (Table 2),
the deviations in number of bipartitions and average
log-likelihood are magnified for both empirical and
synthetic data. It appears that when a different,
possibly worse, starting topology is chosen, more
iterations are needed to acquire a greater diversity of
splits and to bring the resulting landscapes
consistently into regions of better-scoring topologies.
However, when a good path is found, the energy
expenditure function should mitigate this effect.
When the AU test confidence sets of trees were
computed, the number of trees in these sets was similar
across the respective parameter sets and for both
empirical and synthetic data. Even when starting
topologies were varied, the number of trees in these
sets did not seem to be reduced relative to runs started
from a well-scoring topology. This suggests a tendency
for the search to find well-scoring regions of trees.
4 CONCLUSION
This metaheuristic was designed to sample a large
number of regions of interest of the search space with
a reasonable number of iterations and amount of time.
To understand its performance, a number of the
algorithm's parameters can be tuned. We found that
evaporation was effective in steering the search to
well-scoring regions of the space, that a larger number
of ant agents increased the number of trees found, and
that the highest-scoring trees in the search were
visited more often, as indicated by their higher degree.
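As a reminder of how evaporation steers the search, the sketch below shows the generic ant-system pheromone update (Dorigo et al., 1999): every trail decays by a factor (1 - rho), and trails that were traversed receive a deposit. This is an illustration only, not necessarily the exact PLACO update rule; the function name, the parameters `rho` and `deposit`, and the edge labels are assumptions made for the example.

```python
def evaporate_and_deposit(pheromone, visited_edges, rho=0.1, deposit=1.0):
    """Generic ACO update: tau <- (1 - rho) * tau, plus a deposit on visited edges.

    pheromone     : dict mapping an edge (pair of tree ids) to its pheromone level
    visited_edges : edges traversed by ants in this iteration
    rho           : evaporation rate in [0, 1] (hypothetical default)
    deposit       : amount added per traversal (hypothetical default)
    """
    # Evaporation: all trails fade, so unvisited edges lose influence over time.
    updated = {edge: (1.0 - rho) * tau for edge, tau in pheromone.items()}
    # Reinforcement: edges that were used in this iteration are strengthened.
    for edge in visited_edges:
        updated[edge] = updated.get(edge, 0.0) + deposit
    return updated


# Hypothetical example with two edges, only one of which is traversed.
trails = {("A", "B"): 1.0, ("B", "C"): 1.0}
print(evaporate_and_deposit(trails, [("A", "B")], rho=0.5))
```

With a higher `rho`, trails leading to rarely visited regions fade faster, which concentrates subsequent moves on the regions that ants keep reinforcing.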
Two rounds of experimentation were carried out. The
first round tested different triplets of parameter
values. The second round investigated replicate runs
and varied the starting topology. Across both empirical
and synthetic data, the results support three claims
about the performance of the proposed algorithm.
Firstly, the PLACO algorithm is capable of broadly
exploring the combinatorial space despite the number of
taxa. Secondly, across replicate runs we find consistent
behaviour but variation in the quality of trees. This
implies a sparse but broad search in which different
topologies are found. Thirdly, the starting point of the
algorithm does not matter for acquiring a wide-ranging
set of trees and for sampling the properties and shape
of the space.
Future work could investigate the maintenance of
multiple populations in the space. For example, we could
build into the algorithm the ability to iteratively
create colonies. This could allow the search to move
more densely across the space and focus on regions of
particular interest.
REFERENCES
Albright, E., Hessel, J., Hiranuma, N., Wang, C., and Go-
ings, S. (2014). A comparative analysis of popu-
lar phylogenetic reconstruction algorithms. Midwest
Instruction and Computing Symposium (MICS) 2014
Proceedings.
Bastert, O., Rockmore, D., Stadler, P. F., and Tinhofer, G.
(2002). Landscapes on spaces of trees. Applied
Mathematics and Computation, 131(2):439–459.
Blum, C. and Roli, A. (2003). Metaheuristics in combinato-
rial optimization: Overview and conceptual compari-
son. ACM Comput. Surv., 35(3):268–308.
Charleston, M. A. (1995). Toward a characterization of
landscapes of combinatorial optimization problems,
with special attention to the phylogeny problem. Jour-
nal of Computational Biology, 2(3):439–450.
Dorigo, M., Di Caro, G., and Gambardella, L. M. (1999).
Ant algorithms for discrete optimization. Artificial
Life, 5(2):137–172.
Felsenstein, J. (2004). Inferring phylogenies, volume 2.
Sinauer Associates, Sunderland.
Fitch, W. M. (1971). Toward defining the course of evo-
lution: minimum change for a specific tree topology.
Systematic Biology, 20(4):406–416.
Foulds, L. R. and Graham, R. L. (1982). The Steiner
problem in phylogeny is NP-complete. Advances in Applied
Mathematics, 3(1):43–49.
Luke, S. (2013). Essentials of Metaheuristics. Lulu, second
edition.