BENEFITS OF GENETIC ALGORITHM FEATURE-BASED RESAMPLING FOR PROTEIN STRUCTURE PREDICTION

Trent Higgs, Bela Stantic, Tamjidul Hoque, Abdul Sattar

2012

Abstract

Protein structure prediction (PSP) is an important task because the three-dimensional structure of a protein dictates the function it performs. PSP can be modelled computationally by searching for the global free energy minimum, following Anfinsen’s ‘Thermodynamic Hypothesis’. To explore this free energy landscape, Monte Carlo (MC) based search algorithms have been heavily utilised in the literature. However, evolutionary search approaches, such as Genetic Algorithms (GA), have shown considerable potential in low-resolution models to produce more accurate predictions. In this paper we evaluate a GA feature-based resampling approach, which uses a heavy-atom based model, on 17 randomly selected CASP8 sequences and compare it against two different MC approaches. Our results indicate that our GA improves both root mean square deviation (RMSD) and template modelling score (TM-Score). From our analysis we conclude that combining feature-based resampling with Genetic Algorithms creates structures with more native-like features through the use of crossover and mutation operators, which is supported by the low RMSD values we obtained.
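
The two quality measures named in the abstract can be illustrated with a minimal sketch. It is not the evaluation code used in the paper. It assumes the predicted and native structures are given as equal-length lists of (x, y, z) coordinates, one per residue, that have already been aligned and superimposed. The function names `rmsd` and `tm_score_fixed` are hypothetical. The published TM-Score additionally searches for the superposition that maximises the score, which this sketch omits, and the 0.5 floor on d0 for short chains is an assumption of the sketch.

```python
import math


def rmsd(model, native):
    """Root mean square deviation over paired, pre-superimposed coordinates."""
    if len(model) != len(native) or not model:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(model, native))
    return math.sqrt(squared / len(model))


def tm_score_fixed(model, native):
    """TM-Score sum for ONE fixed superposition (no search over rotations)."""
    n = len(native)
    if len(model) != n or n == 0:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    # Length-dependent distance scale d0 = 1.24 * (L - 15)^(1/3) - 1.8.
    # Assumption: fall back to 0.5 when the formula is undefined or too small.
    d0 = 0.5
    if n > 15:
        d0 = max(0.5, 1.24 * (n - 15) ** (1.0 / 3.0) - 1.8)
    total = sum(1.0 / (1.0 + (math.dist(p, q) / d0) ** 2)
                for p, q in zip(model, native))
    return total / n


if __name__ == "__main__":
    # Toy data: a straight 100-residue chain and a copy shifted by 1 unit in x.
    native = [(float(i), 0.0, 0.0) for i in range(100)]
    model = [(x + 1.0, y, z) for x, y, z in native]
    print("RMSD:", rmsd(model, native))
    print("TM-Score (fixed superposition):", tm_score_fixed(model, native))
```

A lower RMSD and a higher TM-Score both indicate a model closer to the native structure. RMSD is unbounded above, while TM-Score lies in (0, 1].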


Paper Citation


in Harvard Style

Higgs T., Stantic B., Hoque T. and Sattar A. (2012). BENEFITS OF GENETIC ALGORITHM FEATURE-BASED RESAMPLING FOR PROTEIN STRUCTURE PREDICTION. In Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2012) ISBN 978-989-8425-90-4, pages 188-194. DOI: 10.5220/0003770801880194


in Bibtex Style

@conference{bioinformatics12,
author={Trent Higgs and Bela Stantic and Tamjidul Hoque and Abdul Sattar},
title={BENEFITS OF GENETIC ALGORITHM FEATURE-BASED RESAMPLING FOR PROTEIN STRUCTURE PREDICTION},
booktitle={Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2012)},
year={2012},
pages={188-194},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003770801880194},
isbn={978-989-8425-90-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2012)
TI - BENEFITS OF GENETIC ALGORITHM FEATURE-BASED RESAMPLING FOR PROTEIN STRUCTURE PREDICTION
SN - 978-989-8425-90-4
AU - Higgs T.
AU - Stantic B.
AU - Hoque T.
AU - Sattar A.
PY - 2012
SP - 188
EP - 194
DO - 10.5220/0003770801880194