Constructing Structural Profiles for Protein Torsion Angle Prediction

Zafer Aydin, David Baker, William Stafford Noble

Abstract

Structural frequency profiles provide important constraints on structural aspects of a protein and are receiving growing interest in the structure prediction community. In this paper, we introduce new techniques for scoring templates, which are then combined to form structural profiles of 7-state torsion angles. By exploiting various parameters of target-template alignments, we considerably improve the accuracy of these structural profiles. The most effective technique scales each template by an integer power of its sequence identity score, where the power is adjusted according to the similarity interval of the target. Incorporating other alignment scores as multiplicative factors further improves the accuracy of the profiles. After analyzing the individual strengths of the various structural profile methods, we combine them with ab-initio predictions of 7-state torsion angles using a linear committee approach. We show that incorporating template information significantly improves the accuracy of ab-initio predictions at all levels of target-template similarity, even when the templates are distant from the target. The template scaling methods developed in this work can be applied to many other prediction tasks and to more advanced methods for computing structural profiles.
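
The template scaling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the input layout (one aligned state index per target position, None for gaps), the uniform fallback for positions that no template covers, and the single mixing weight alpha are all assumptions made for illustration. How the power is chosen for a given similarity interval is not stated in the abstract, so it is passed in as a plain parameter.

```python
from typing import List, Optional, Sequence, Tuple

N_STATES = 7  # 7-state torsion angle classes, as named in the abstract

# Hypothetical template record: (sequence identity in [0, 1],
# state index per target position or None for a gap,
# extra alignment scores used as multiplicative factors).
Template = Tuple[float, List[Optional[int]], Sequence[float]]


def template_weight(identity: float, power: int,
                    extra_scores: Sequence[float] = ()) -> float:
    """Scale a template by identity**power times any extra alignment scores."""
    weight = identity ** power
    for score in extra_scores:
        weight *= score
    return weight


def structural_profile(length: int, templates: List[Template],
                       power: int) -> List[List[float]]:
    """Accumulate weighted template states into a per-position frequency profile."""
    profile = [[0.0] * N_STATES for _ in range(length)]
    for identity, states, extras in templates:
        weight = template_weight(identity, power, extras)
        for pos, state in enumerate(states):
            if state is not None:
                profile[pos][state] += weight
    for row in profile:
        total = sum(row)
        for k in range(N_STATES):
            # Assumption: fall back to a uniform row when no template covers a position.
            row[k] = row[k] / total if total > 0 else 1.0 / N_STATES
    return profile


def linear_committee(profile: List[List[float]],
                     ab_initio: List[List[float]],
                     alpha: float) -> List[List[float]]:
    """Combine the profile and ab-initio class probabilities linearly (assumed form)."""
    return [[alpha * p + (1.0 - alpha) * q for p, q in zip(prof_row, ab_row)]
            for prof_row, ab_row in zip(profile, ab_initio)]
```

In this sketch, raising the power concentrates the profile on the closest templates, while a power of 0 treats all templates equally apart from the extra multiplicative scores.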


Paper Citation


in Harvard Style

Aydin Z., Baker D. and Noble W. (2015). Constructing Structural Profiles for Protein Torsion Angle Prediction. In Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2015) ISBN 978-989-758-070-3, pages 26-35. DOI: 10.5220/0005208500260035


in Bibtex Style

@conference{bioinformatics15,
author={Zafer Aydin and David Baker and William Stafford Noble},
title={Constructing Structural Profiles for Protein Torsion Angle Prediction},
booktitle={Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2015)},
year={2015},
pages={26-35},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005208500260035},
isbn={978-989-758-070-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2015)
TI - Constructing Structural Profiles for Protein Torsion Angle Prediction
SN - 978-989-758-070-3
AU - Aydin Z.
AU - Baker D.
AU - Noble W.
PY - 2015
SP - 26
EP - 35
DO - 10.5220/0005208500260035