# STUDY OF PROTEIN STRUCTURE ALIGNMENT PROBLEM IN PARAMETERIZED COMPUTATION

### Cody Ashby, Kun Wang, Carole L. Cramer, Xiuzhen Huang

#### Abstract

Motivated by the practical application of protein structure-structure alignment, we study the maximum common subgraph problem within the framework of parameterized complexity. We investigate lower bounds for exact algorithms for the problem and prove that an algorithm with running time $p(n,m) \cdot k^{o(m)}$ is unlikely to exist, where $p$ is a polynomial function, $k$ is the map-width parameter, and $m$ and $n$ are the numbers of vertices of the two graphs, respectively. Since the brute-force approach gives an upper bound of $p(n,m) \cdot k^{m}$, our lower bound is asymptotically tight. Although this lower bound means that the $p(n,m) \cdot k^{m}$ running time cannot be significantly improved in general, it is still possible to develop algorithms that are efficient in practice for protein structure-structure alignment. We developed such an algorithm, integrating the color-coding method with parameterized computation, to identify the maximum common subgraph of two protein structure graphs. We applied the algorithm to protein structure-structure alignment and tested it experimentally on more than 600 protein pairs. Our parameterized approach improves structure alignment efficiency and should be particularly useful for structure comparisons of large proteins.
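To make the brute-force upper bound concrete, the following Python sketch enumerates every candidate map between two small graphs. It is an illustration only, not the algorithm developed in the paper. It assumes that map width means each of the m vertices of the first graph has at most k candidate images in the second graph, and that the objective is the number of preserved edges. The function name `brute_force_alignment` and the toy graphs are hypothetical.

```python
from itertools import product


def brute_force_alignment(edges1, edges2, candidates):
    """Try every candidate map and keep the one preserving the most edges.

    edges1, edges2: lists of undirected edges (pairs of vertex labels).
    candidates: dict mapping each vertex of graph 1 to a list of at most k
                allowed images in graph 2 (the assumed "map width" k).
    With m vertices in graph 1, the loop visits at most k**m maps, and each
    map is scored in polynomial time.
    """
    vertices = sorted(candidates)
    edge_set2 = {frozenset(e) for e in edges2}
    best_score, best_map = -1, None
    for images in product(*(candidates[u] for u in vertices)):
        if len(set(images)) < len(images):
            continue  # skip maps that send two vertices to the same image
        mapping = dict(zip(vertices, images))
        score = sum(
            1 for a, b in edges1
            if frozenset((mapping[a], mapping[b])) in edge_set2
        )
        if score > best_score:
            best_score, best_map = score, mapping
    return best_score, best_map


# Toy instance: m = 3 vertices, map width k = 2, so 2**3 = 8 maps are tried.
edges1 = [(0, 1), (1, 2)]
edges2 = [("a", "b"), ("b", "c"), ("c", "d")]
candidates = {0: ["a", "b"], 1: ["b", "c"], 2: ["c", "d"]}
score, mapping = brute_force_alignment(edges1, edges2, candidates)
print(score, mapping)
```

The enumeration is exponential in m with base k, which is the shape of the upper bound quoted in the abstract. The lower bound result says this dependence on m in the exponent is unlikely to be avoidable in the worst case.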

#### Paper Citation

#### in Harvard Style

Ashby C., Wang K., Cramer C. L. and Huang X. (2012). **STUDY OF PROTEIN STRUCTURE ALIGNMENT PROBLEM IN PARAMETERIZED COMPUTATION**. In *Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2012)* ISBN 978-989-8425-90-4, pages 174-181. DOI: 10.5220/0003769701740181

#### in Bibtex Style

@conference{bioinformatics12,

author={Cody Ashby and Kun Wang and Carole L. Cramer and Xiuzhen Huang},

title={STUDY OF PROTEIN STRUCTURE ALIGNMENT PROBLEM IN PARAMETERIZED COMPUTATION},

booktitle={Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2012)},

year={2012},

pages={174-181},

publisher={SciTePress},

organization={INSTICC},

doi={10.5220/0003769701740181},

isbn={978-989-8425-90-4},

}

#### in EndNote Style

TY - CONF

JO - Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2012)

TI - STUDY OF PROTEIN STRUCTURE ALIGNMENT PROBLEM IN PARAMETERIZED COMPUTATION

SN - 978-989-8425-90-4

AU - Ashby C.

AU - Wang K.

AU - Cramer C. L.

AU - Huang X.

PY - 2012

SP - 174

EP - 181

DO - 10.5220/0003769701740181