than others. In particular, cases where the left and/or
right child vectors are tip sequences can be handled
more efficiently. For instance, an observed nucleotide A at a tip sequence corresponds to a simple probability vector of the form [P(A) := 1.0, P(C) := 0.0, P(G) := 0.0, P(T) := 0.0]. This property of tip vectors can be used to save computations in equation 1.
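As a minimal sketch (not the actual RAxML implementation), assuming the state ordering A, C, G, T and ignoring rate categories and numerical scaling, the per-state sum in equation 1 collapses to a single transition-probability lookup when the child is a tip:

#define STATES 4   /* A, C, G, T */

/* Contribution of a child to parent state i in equation 1:
   sum_j P_ij(t) * L_child(j).  For an ancestral child the full sum is
   required; for a tip child with observed state s, L_child(j) is 1.0
   for j == s and 0.0 otherwise, so the sum reduces to the single
   entry P_is(t). */
static double innerContribution(const double P[STATES][STATES],
                                int i, const double childVector[STATES])
{
  double sum = 0.0;
  for (int j = 0; j < STATES; j++)
    sum += P[i][j] * childVector[j];
  return sum;
}

static double tipContribution(const double P[STATES][STATES],
                              int i, int observedState)
{
  return P[i][observedState];   /* one lookup instead of a four-term sum */
}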
Also, typical topological search operators for find-
ing/constructing a tree topology with an improved
likelihood score such as SPR (Subtree Pruning and
Re-grafting), NNI (Nearest Neighbor Interchange) or
TBR (Tree Bisection and Reconnection) only apply
local changes to the tree topology. In other words, the majority of the ancestral probability vectors are not affected by the topological change and therefore do not need to be recomputed/updated for the locally altered tree topology via a full post-order tree traversal. Hence, if all ancestral vectors reside in RAM, only a very small part of the tree needs to be (re-)traversed after an SPR move, for instance. All standard ML-based programs such as
GARLI (Zwickl, 2006), PHYML 3.0 (Guindon et al.,
2010), and RAxML (Stamatakis, 2006) deploy search
strategies that typically require updating only a small
fraction of probability vectors in the vicinity of the
topological change.
Therefore, devising an appropriate strategy (see
Section 3.4) for deciding which vectors shall remain
in RAM and which can be discarded (because they
can be recomputed at a lower computational cost)
can have a substantial impact on the induced run
time overhead when holding, for instance, x := n/2
vectors in RAM. In the following, we will outline
how to compute the PLF and conduct SPR-based tree
searches with x < n vectors in RAM.
Let n be the number of species, n − 2 the number of ancestral probability vectors, and x the number of available slots in memory, where log_2(n) + 2 ≤ x < n (i.e., n − x vectors are not stored, but recomputed
on demand). Let w be the number of bytes required
for storing an ancestral probability vector (all vectors
have the same size). Our implementation will only
allocate x · w bytes, rather than n · w. We henceforth
use the term slot to denote a RAM segment of w bytes
that can hold an ancestral probability vector.
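For illustration only (the actual per-site layout depends on the data type and model; here we assume 4 nucleotide states, 4 rate categories, and double precision), w and the total slot allocation could be computed as follows:

#include <stdlib.h>

/* Sketch: width w of a single ancestral probability vector and the
   total memory allocated for x slots.  The per-site layout assumed
   here (4 states x 4 rate categories, double precision) is for
   illustration purposes only. */
static size_t vectorWidth(size_t sites)
{
  return sites * 4 * 4 * sizeof(double);   /* w bytes per vector */
}

static size_t slotMemory(size_t x, size_t sites)
{
  return x * vectorWidth(sites);           /* x * w instead of n * w bytes */
}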
We define the following C data structure (details omitted) to keep track of the vector-to-slot mapping of all ancestral vectors and to implement replacement strategies:
typedef struct
{
  int numVectors;           /* number of available slots x */
  size_t width;             /* slot width w */
  double **tmpvectors;      /* start addresses of the x slots */
  int *iVector;             /* slot id -> node id, or SLOT_UNUSED */
  int *iNode;               /* node id -> slot id, or NODE_UNPINNED */
  int *unpinnable;          /* per slot: may its vector be overwritten? */
  boolean allSlotsBusy;     /* no free (unused) slot left */
  unpin_strategy strategy;  /* slot replacement strategy */
} recompVectors;
The array tmpvectors is a list of pointers to a set of slots (i.e., starting addresses of allocated RAM memory); there are numVectors (x) slots, each of width width (w). The array iVector has length x and is indexed by the slot id. Each entry holds the node id of the ancestral vector that is currently stored in the indexed slot. If the slot is free, the value is set to a dedicated SLOT_UNUSED code. The array iNode has length n − 2 and is indexed using the unique node ids of all ancestral vectors in the tree. When the corresponding vector resides in RAM, its array entry holds the corresponding slot id. If the vector does not reside in RAM, the array entry is set to the special code NODE_UNPINNED. Henceforth, we denote the availability/unavailability of an ancestral vector in RAM as pinned/unpinned. The array unpinnable tracks which slots are available for unpinning, that is, which of the slots that currently hold an ancestral vector may be overwritten, if required.
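A minimal initialization sketch under these definitions (the sentinel values and the helper name are hypothetical; only the struct fields are taken from the definition above):

#include <stdlib.h>

#define SLOT_UNUSED   (-1)   /* assumed sentinel values; the codes used */
#define NODE_UNPINNED (-2)   /* in the actual implementation may differ */

static recompVectors* initRecompVectors(int x, int n, size_t w)
{
  recompVectors *rv = malloc(sizeof(recompVectors));

  rv->numVectors   = x;
  rv->width        = w;
  rv->allSlotsBusy = 0;                       /* FALSE */
  rv->tmpvectors   = malloc(x * sizeof(double *));
  rv->iVector      = malloc(x * sizeof(int));
  rv->unpinnable   = malloc(x * sizeof(int));
  rv->iNode        = malloc((n - 2) * sizeof(int));
  /* rv->strategy is selected by the caller */

  for (int s = 0; s < x; s++)
  {
    rv->tmpvectors[s] = malloc(w);            /* x * w bytes in total */
    rv->iVector[s]    = SLOT_UNUSED;          /* slot holds no vector yet */
    rv->unpinnable[s] = 0;
  }
  for (int v = 0; v < n - 2; v++)
    rv->iNode[v] = NODE_UNPINNED;             /* vector not pinned to any slot */

  return rv;
}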
The set of ancestral vectors that are stored in the
memory slots changes dynamically during the com-
putation of the PLF (i.e., during full tree traversals
and tree searches). The pattern of dynamic change in the slot assignments also depends on the selected recomputation/replacement strategy. For each PLF invocation, be it for evaluating an SPR move or completely
re-traversing the tree, the above data structure is up-
dated accordingly to ensure consistency.
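For instance, the bookkeeping for pinning a node's vector to a slot could be sketched as follows; getFreeSlot() and getSlotByStrategy() are hypothetical placeholders for the replacement strategies discussed in Section 3.4:

/* Hypothetical helpers (not shown): return the id of a free slot, or
   select an unpinnable slot for eviction according to the chosen
   strategy. */
static int getFreeSlot(recompVectors *rv);
static int getSlotByStrategy(recompVectors *rv);

/* Sketch: pin the ancestral vector of nodeId to a slot, evicting an
   unpinnable vector if all slots are busy. */
static int pinNode(recompVectors *rv, int nodeId)
{
  if (rv->iNode[nodeId] != NODE_UNPINNED)
    return rv->iNode[nodeId];                 /* vector already resides in RAM */

  int slot = rv->allSlotsBusy ? getSlotByStrategy(rv) : getFreeSlot(rv);

  if (rv->iVector[slot] != SLOT_UNUSED)       /* unpin the evicted vector */
    rv->iNode[rv->iVector[slot]] = NODE_UNPINNED;

  rv->iVector[slot]    = nodeId;              /* keep both mappings consistent */
  rv->iNode[nodeId]    = slot;
  rv->unpinnable[slot] = 0;                   /* protect the slot while in use */
  return slot;
}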
Whenever we need to conduct a local tree traversal (following the application of an SPR move) to compute the likelihood of the altered tree topology, we initially just determine the traversal order, which is part of the standard RAxML implementation. The
traversal order is essentially a list that stores in which
order ancestral probability vectors need to be com-
puted. In other words, the traversal descriptor de-
scribes the partial or full post-order tree traversal re-
quired to correctly compute the likelihood of a tree.
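As an illustration, a full post-order traversal list could be assembled as sketched below; the node structure and function names are hypothetical and heavily simplified compared to the actual RAxML tree data structures:

/* Simplified, hypothetical binary tree node; left/right are NULL for
   tips. */
typedef struct simpleNode
{
  struct simpleNode *left, *right;
  int id;
} simpleNode;

/* Append all ancestral node ids in post-order: both children are
   processed before the parent, so every child vector is available (or
   scheduled for computation) before the parent vector is computed. */
static void buildTraversal(simpleNode *p, int *traversalList, int *count)
{
  if (p->left == NULL)              /* tip: no ancestral vector to compute */
    return;
  buildTraversal(p->left,  traversalList, count);
  buildTraversal(p->right, traversalList, count);
  traversalList[(*count)++] = p->id;
}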
For using x < n vectors, we introduce a so-called traversal order check, which extends the traversal steps (the traversal list) that assume that all ancestral vectors reside in RAM. By this traversal order extension, we guarantee that all missing vectors (not residing in RAM) will be recomputed as needed. The effect of reducing the number of vectors residing in RAM is that traversal lists become longer, that is, more nodes are visited and the run time thereby increases. When the traversal is initiated, all vectors in the traversal list that already reside in RAM (they are pinned to a slot) are protected (marked as not unpinnable) such that they