Exploring a Sub-optimal Hidden Markov Model Sampling Approach for De Novo Peptide Structure Modeling

Pierre Thevenet, Pierre Tufféry

Abstract

In recent years, peptides have become plausible candidate therapeutics. However, their large-scale structural characterization, which is necessary for their identification and optimization, remains an open in silico challenge. We introduce a new procedure for the rapid generation of 3D models of peptides. It is based on the concept of a Hidden Markov Model-derived structural alphabet, a generalization of secondary structure. Using this concept, we previously set up an approach to the de novo modeling of peptide structure based on a greedy algorithm. Here, we explore a new strategy that relies on sampling the sub-optimal sequences of states of a Hidden Markov Model-derived structural alphabet. Our results suggest that such a procedure is able to identify the native conformation of peptides at a very low algorithmic complexity, while performing similarly to the former greedy approach. On average, peptide models approximate the experimental structure at less than 3 Å RMSD, for a processing cost of only a few minutes on a workstation. As a result, peptide de novo modeling becomes tractable at a large scale.
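
The abstract does not spell out how sub-optimal state sequences are sampled. As a rough illustration of the general idea only, the sketch below draws state paths from a toy two-state Hidden Markov Model using forward filtering followed by stochastic backward sampling, then ranks the distinct paths by joint log-probability. This is an assumed, generic technique and not necessarily the authors' procedure; the model parameters (`INIT`, `TRANS`, `EMIT`), the observation sequence `OBS`, and all function names are hypothetical.

```python
import math
import random

# Hypothetical toy HMM: 2 hidden states, 2 observable symbols (all values are assumptions).
INIT = [0.6, 0.4]                      # initial state probabilities
TRANS = [[0.7, 0.3], [0.4, 0.6]]       # TRANS[r][s] = P(next state s | current state r)
EMIT = [[0.9, 0.1], [0.2, 0.8]]        # EMIT[s][o] = P(symbol o | state s)
OBS = [0, 0, 1, 1, 0]                  # observed symbol indices


def forward(obs, init, trans, emit):
    """Forward pass with per-step normalisation; returns the scaled alpha table."""
    n = len(init)
    prev = [init[s] * emit[s][obs[0]] for s in range(n)]
    total = sum(prev)
    prev = [p / total for p in prev]
    alphas = [prev]
    for o in obs[1:]:
        cur = [emit[s][o] * sum(prev[r] * trans[r][s] for r in range(n))
               for s in range(n)]
        total = sum(cur)
        cur = [c / total for c in cur]
        alphas.append(cur)
        prev = cur
    return alphas


def sample_path(obs, init, trans, emit, rng):
    """Draw one state path from the posterior by stochastic backtracking."""
    n = len(init)
    alphas = forward(obs, init, trans, emit)
    # Sample the last state from the final filtered distribution.
    state = rng.choices(range(n), weights=alphas[-1])[0]
    path = [state]
    # Walk backwards, weighting each predecessor by alpha * transition.
    for t in range(len(obs) - 2, -1, -1):
        weights = [alphas[t][r] * trans[r][state] for r in range(n)]
        state = rng.choices(range(n), weights=weights)[0]
        path.append(state)
    path.reverse()
    return path


def path_log_prob(path, obs, init, trans, emit):
    """Joint log-probability of a state path and the observations."""
    lp = math.log(init[path[0]] * emit[path[0]][obs[0]])
    for t in range(1, len(obs)):
        lp += math.log(trans[path[t - 1]][path[t]] * emit[path[t]][obs[t]])
    return lp


def sample_distinct_paths(obs, init, trans, emit, n_draws, seed=0):
    """Sample n_draws paths and return the distinct ones, best-scoring first."""
    rng = random.Random(seed)
    seen = {}
    for _ in range(n_draws):
        path = tuple(sample_path(obs, init, trans, emit, rng))
        if path not in seen:
            seen[path] = path_log_prob(path, obs, init, trans, emit)
    return sorted(seen.items(), key=lambda item: -item[1])


# Usage: list the three best-scoring distinct paths found among 200 draws.
for path, log_p in sample_distinct_paths(OBS, INIT, TRANS, EMIT, n_draws=200)[:3]:
    print(path, round(log_p, 3))
```

In a structural-alphabet setting, each sampled path would presumably correspond to a candidate series of local conformations to assemble into a 3D model; that mapping is outside the scope of this sketch.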


Paper Citation


in Harvard Style

Thevenet P. and Tufféry P. (2014). Exploring a Sub-optimal Hidden Markov Model Sampling Approach for De Novo Peptide Structure Modeling. In Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2014) ISBN 978-989-758-012-3, pages 24-30. DOI: 10.5220/0004750000240030


in Bibtex Style

@conference{bioinformatics14,
author={Pierre Thevenet and Pierre Tufféry},
title={Exploring a Sub-optimal Hidden Markov Model Sampling Approach for De Novo Peptide Structure Modeling},
booktitle={Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2014)},
year={2014},
pages={24-30},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004750000240030},
isbn={978-989-758-012-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2014)
TI - Exploring a Sub-optimal Hidden Markov Model Sampling Approach for De Novo Peptide Structure Modeling
SN - 978-989-758-012-3
AU - Thevenet P.
AU - Tufféry P.
PY - 2014
SP - 24
EP - 30
DO - 10.5220/0004750000240030