erations (since the population at the second genetic iteration is obtained from the population in the first iteration).
• Compared with the scatter search metaheuristic procedure, the hybrid methods obtain suitable results which, in most cases, are statistically equivalent to those of the SGA/BFGS methods.
• Another interesting aspect is the effect of the hybridization strategy on the SGA/BFGS methods. In most cases, the choice between the two strategies is not statistically relevant in this hybridization. As a partial conclusion, the elitist factor of the SGA/BFGS algorithm appears to be more important for this hybridization than the hybridization strategy itself.
• All the methods that include the BFGS procedure perform better than the stationary genetic algorithm. The main conclusion to be drawn from this fact is that, whenever the gradient can be computed, it is preferable to use a quasi-Newton method rather than a heuristic procedure such as the SGA alone (a minimal refinement step of this kind is sketched after this list).
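To illustrate this point, the sketch below shows how such a quasi-Newton refinement step might be applied to a single individual of the population. It is a minimal example rather than the implementation used in this work: the objective `loss`, its gradient `grad` and the SciPy call to the L-BFGS-B routine are illustrative assumptions standing in for the network error function and the routines of (Zhu et al., 1997).

    import numpy as np
    from scipy.optimize import minimize

    # Minimal sketch of a quasi-Newton refinement of one GA individual.
    # `loss` and `grad` are illustrative placeholders for the RNN prediction
    # error over the flattened weight vector and its analytic gradient.
    def loss(w):
        return np.sum((w - 1.0) ** 2)

    def grad(w):
        return 2.0 * (w - 1.0)

    rng = np.random.default_rng(0)
    candidate = rng.normal(size=10)              # weights of one individual
    result = minimize(loss, candidate, jac=grad, method="L-BFGS-B",
                      options={"maxiter": 20})   # limited-memory quasi-Newton step
    refined_fitness = result.fun                 # used to rank the individual
    refined_weights = result.x                   # written back only if Lamarckian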
5 CONCLUSIONS
This work has presented various aspects of the use of hybrid training algorithms for recurrent neural networks. We have experimentally studied the behavior of the proposed hybrid methods on a concrete time series prediction problem: the CATS benchmark. Our conclusion is that a Baldwinian hybridization strategy is generally preferable, since it does not alter population diversity (as exemplified by the results of the CHC/BFGS method); a schematic comparison of the Baldwinian and Lamarckian updates is sketched after this paragraph. In certain cases, however, such as the stationary genetic hybridization, the elitist properties of the model are more important than the hybridization strategy itself. It is also important to weigh the effects of the improvement method against those of the genetic operators, since excessive use of the local search operator may negatively affect the genetic selection, recombination and mutation processes (this is the case for MGA/BFGS and GGA/BFGS).
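The difference between the two strategies can be summarised as follows; this is a schematic sketch rather than the code used in our experiments, and it assumes a simple dictionary-based individual and a generic `refine` local search step (such as the L-BFGS-B refinement sketched above).

    # Schematic sketch (assumed representation) of the two hybridization
    # strategies; `refine` is any local search returning improved weights
    # and their fitness, e.g. the L-BFGS-B step sketched above.
    def hybrid_evaluate(individual, refine, lamarckian=False):
        refined_weights, refined_fitness = refine(individual["weights"])
        individual["fitness"] = refined_fitness  # both strategies use the improved fitness
        if lamarckian:
            # Only the Lamarckian strategy overwrites the genotype, which is
            # why the Baldwinian variant preserves population diversity.
            individual["weights"] = refined_weights
        return individual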
In general, the results suggest that a hybrid method may improve performance in the training of a recurrent neural network for time series prediction problems. However, overuse of the local search operator may produce results that are statistically equivalent to those of the multi-start procedure. To avoid this, special attention should be paid to the design and optimization of the hybrid algorithm's parameters.
REFERENCES
Bengio, Y., Simard, P., and Frasconi, P. (1994). Learning
long-term dependencies with gradient descent is diffi-
cult. IEEE Trans. on Neural Networks, 5(2):157–166.
Blanco, A., Delgado, M., and Pegalajar, M. C. (2001).
A real-coded genetic algorithm for training recurrent
neural networks. Neural Networks, 14:93–105.
Byrd, R., Lu, P., and Nocedal, J. (1995). A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing, 16(5):1190–1208.
Cuéllar, M., Delgado, M., and Pegalajar, M. (2005). An application of non-linear programming to train recurrent neural networks in time series prediction problems. In Proc. of the International Conference on Enterprise Information Systems (ICEIS'05), pages 35–42, Miami (USA).
Cuéllar, M., Delgado, M., and Pegalajar, M. (2006). Memetic evolutionary training for recurrent neural networks: an application to time-series prediction. Expert Systems, 23(2):99–117.
Elman, J. (1990). Finding structure in time. Cognitive Sci-
ence, 14:179–211.
Haykin, S. (1999). Neural Networks: A Comprehensive
Foundation. Prentice Hall.
Ku, K. and Mak, M. (1997). Exploring the effects of Lamarckian and Baldwinian learning in evolving recurrent neural networks. In Proc. of the IEEE International Conference on Evolutionary Computation.
Laguna, M. and Martí, R. (2003). Scatter Search: Methodology and Implementations in C. Kluwer Academic Publishers.
Lendasse, A., Oja, E., Simula, O., and Verleysen, M. (2004). Time series competition: The CATS benchmark. In Proc. International Joint Conference on Neural Networks (IJCNN'04), pages 1615–1620, Budapest (Hungary).
Mandic, D. and Chambers, J. (2001). Recurrent Neural Networks for Prediction. John Wiley & Sons.
Moscato, P. and Porras, C. C. (2003). An introduction to
memetic algorithms. Inteligencia Artificial (Special
Issue on Metaheuristics), 2(19):131–148.
Prudencio, R. and Ludermir, T. (2003). Neural network hybrid learning: Genetic algorithms and Levenberg-Marquardt. In Proc. 26th Annual Conference of the Gesellschaft für Klassifikation, pages 464–472.
Zhu, C., Byrd, R., and Nocedal, J. (1997). Algorithm 778: L-BFGS-B, FORTRAN routines for large scale bound constrained optimization. ACM Transactions on Mathematical Software, 23(4):550–560.