which has by far received the most attention over the
years.
To our knowledge, there exists only one parallel
version of the simulated annealing algorithm, known
as the Parallel Tempering or Monte Carlo Replica
Exchange algorithm (Swendsen and Wang, 1986; Earl
and Deem, 2005). In parallel tempering (PT), many
simulations, or replicas, are started and run in
parallel. The solutions are sampled in the same fashion as
in the regular simulated annealing approach by mak-
ing small alterations to the solution and accepting the
change with a certain probability. However, instead
of lowering the temperature as in the simulated
annealing approach, the simulations are run at different
but fixed temperatures throughout the search. Two
replicas may be swapped with a probability that
depends on the differences in both their energies and
their temperatures, such that a replica running at a
lower temperature can be exchanged with a replica
running at a higher temperature, thereby giving
replicas a greater chance of overcoming the barriers
between local minima.
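The exchange step can be illustrated with a short sketch. The text above does not state the acceptance rule, so the code assumes the commonly used Metropolis-type criterion, min(1, exp((1/T_i - 1/T_j)(E_i - E_j))), with the Boltzmann constant set to 1 and swaps attempted only between neighbouring temperatures. The function names are hypothetical.

```python
import math
import random


def swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Probability of exchanging two replicas (Boltzmann constant taken as 1)."""
    delta = (1.0 / temp_i - 1.0 / temp_j) * (energy_i - energy_j)
    # A non-negative exponent means the swap is always accepted; checking it
    # first also avoids overflow in math.exp for large positive values.
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swaps(energies, temperatures, rng=random):
    """Try to exchange each pair of neighbouring replicas once.

    energies[k] is the energy of the configuration currently held at
    temperatures[k]; the list is modified in place and returned.
    """
    for k in range(len(temperatures) - 1):
        p = swap_probability(energies[k], temperatures[k],
                             energies[k + 1], temperatures[k + 1])
        if rng.random() < p:
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
    return energies
```

Under this criterion, a low-energy configuration held at a high temperature is always moved down to the colder replica, whereas the reverse move is accepted only with a probability that shrinks as the energy and temperature gaps grow.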
For genetic algorithms there exist several parallel
variants that generally offer significant improvements
by converging faster, and often to better solutions,
than the non-parallel version. There are two major
approaches to parallelizing genetic algorithms. One is
often referred to as the master-slave model, where a
single process (the master) controls the genetic
algorithm but uses a number of other processes (the
slaves) to evaluate and possibly breed the individuals.
The slave processes are run in parallel.
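A minimal sketch of the evaluation step in such a scheme is given below. It is an illustration only: `evaluate_population` and `toy_energy` are hypothetical names, the energy function is a stand-in, and a thread pool is used to keep the example self-contained. For a CPU-bound fitness function, `concurrent.futures.ProcessPoolExecutor` offers the same `map` interface.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_population(population, fitness, workers=4):
    """Master step: farm fitness evaluations out to a pool of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map preserves input order, so scores[i] belongs to population[i].
        scores = list(pool.map(fitness, population))
    return scores


def toy_energy(individual):
    # Hypothetical stand-in for an expensive energy function.
    return sum(x * x for x in individual)
```

The master keeps sole control of selection and replacement; only the evaluation of individuals is distributed.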
The other parallelization paradigm frequently
used is the niche model (also known as the island
hopping or deme model). A niche genetic algorithm
(nGA) is an implementation where several instances
of a genetic algorithm are run in parallel, evolving
sub-populations independently from each other (the
different niches). At certain points during evolution
individuals migrate to other niches and become part
of the population of that niche. The major advantage
of nGAs is that they not only allow multiple solutions
to evolve at the same time, but also exploit the fact
that different runs of the same genetic algorithm are
likely to produce different suboptimal solutions that,
when combined, are likely to yield better results.
As with PT, the advantage of nGA is expected to be more
pronounced when the fitness landscape is very rugged.
PT and nGA, with N replicas and N niches respectively,
essentially require N times the computational time of
a single run of their sequential counterparts. However,
with multi-core computers and CPU clusters being
readily available to most researchers, they can be
executed in parallel, and the extra computational time
required thus does not pose a problem. On top of that,
PT and nGA generally search more efficiently and
usually arrive at much better results, which makes the
parallelization of these meta-heuristics an attractive
feature indeed.
In this paper we propose an iterative variant of an
nGA, called inGA (iterative niche genetic algorithm),
for protein structure prediction. The algorithm is
designed to increase search efficiency by locating and
converging on the low energy structures faster than
both nGA and PT. Essentially, the strategy corresponds
to letting all niches converge before migrating
individuals between them and restarting, as described
by Cantu-Paz and Goldberg (Cantu-Paz and Goldberg,
1996). However, while running each niche to
convergence worked well for the problem instances chosen
by Cantu-Paz and Goldberg, work by Heiler (Heiler,
1998) suggests that for protein structure prediction the
quality of predicted structures decreases when the
individuals are locally optimized before the genetic
operators are applied. Rather than running to
convergence, we thus suggest a kind of early stopping, which
generates low energy structures without spending too
much time on refining suboptimal structures.
2 METHODS
In the traditional niche genetic algorithm (nGA),
several populations are evolved in parallel and
completely independently of each other. At certain
points, individuals from one or more niches (islands
or demes) are chosen according to some selection
strategy and migrated to other niches, where they
replace individuals that are also chosen according to
some selection strategy. Usually the selection
strategies are based on the fitness values of the
individuals, such that the best individuals from one
niche are migrated to another niche, where they replace
the worst individuals, as this migration strategy
yields the fastest convergence (Alba, 2005).
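The best-replace-worst migration described above can be sketched as follows. Several details are assumptions made for illustration and are not taken from the text: niches are arranged in a ring (niche i sends to niche i + 1), emigrants are copied rather than removed from their home niche, lower fitness values are better (as for an energy), and k is smaller than the niche size. The name `migrate` is hypothetical.

```python
def migrate(niches, fitness, k=1):
    """Copy the k best individuals of each niche over the k worst of the next.

    niches is a list of populations (lists of individuals). A new list of
    populations is returned; the input is left unchanged.
    """
    # Pick all emigrants first so that every niche sends its own
    # pre-migration best individuals.
    emigrants = [sorted(pop, key=fitness)[:k] for pop in niches]
    result = []
    for i, pop in enumerate(niches):
        # Keep everything except the k worst individuals of this niche.
        survivors = sorted(pop, key=fitness)[:len(pop) - k]
        # Receive from the previous niche in the ring.
        incoming = emigrants[(i - 1) % len(niches)]
        result.append(survivors + incoming)
    return result
```

For example, with two niches `[1, 5, 9]` and `[2, 4, 8]`, the identity as fitness and k = 1, each niche drops its worst individual and receives a copy of the other niche's best.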
We propose an iterative niche genetic algorithm
(inGA) that performs a type of elitist refinement. As
in the traditional niche genetic algorithm, multiple
populations are initially created and evolved in
parallel, but unlike in the traditional niche
algorithms, individuals do not migrate to other niches.
Rather, we stop the evolution of all populations after
a predefined number of generations, g, and choose the
best solution from each of the n niches. The
individuals not selected are destroyed, while the
selected individuals are put together in a new
population, pop. This population is then cloned n
times, and the cloned populations are placed on the n
niches, where evolution of new (and initially
identical) populations is then carried out for g
IMPROVING SEARCH FOR LOW ENERGY PROTEIN STRUCTURES WITH AN ITERATIVE NICHE GENETIC
ALGORITHM