another one presenting a similar functionality.
Furthermore, it was possible to divide some families
into two subgroups. This indicates that, despite the
fact that some structures belong to the same family,
their appearance is dissimilar enough to be grouped
into two distinct sub-families. In addition, an
exhaustive search and ranking of proteins against
the entire Protein Data Bank may be performed in
under a second. This contrasts with the approaches
introduced in Section 1, which either search only a
very small subset, typically a few hundred
structures, or employ a non-exhaustive search that
relies on heuristic assumptions.
In comparison, our method performs a search on the
entire Protein Data Bank database without any such
a priori assumptions or query size constraints.
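As a minimal sketch of what an exhaustive search and ranking
means here, the following example scores every entry of a toy
database against a query and sorts the results. The fixed-length
descriptors, the entry names, and the use of Euclidean distance
are assumptions made for illustration only; they are not the
descriptors or the index used by our system.

```python
import math


def rank_exhaustively(query, database):
    """Score every entry against the query and return (distance, id) pairs,
    closest first. No entry is skipped and no heuristic pruning is applied."""
    scored = [
        (math.dist(query, descriptor), entry_id)  # Euclidean distance (assumed measure)
        for entry_id, descriptor in database.items()
    ]
    scored.sort()  # ascending distance: most similar entries first
    return scored


# Hypothetical toy descriptors; real shape signatures would be much longer.
toy_database = {
    "entry_a": [0.10, 0.80, 0.30],
    "entry_b": [0.12, 0.79, 0.31],
    "entry_c": [0.90, 0.10, 0.50],
}

ranking = rank_exhaustively([0.11, 0.80, 0.30], toy_database)
for distance, entry_id in ranking:
    print(f"{entry_id}: {distance:.4f}")
```

Because every entry is scored, the ranking does not depend on
any prior assumption about where the similar structures lie; the
cost is one distance evaluation per database entry.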
In future, we plan to conduct a robust empirical
comparative study contrasting our system with other
approaches in the field. We are also interested in
investigating whether using other similarity
measures will have an impact on our results [CS04]
and addressing the automatic classification of very
large databases of protein structures.
REFERENCES
Abeysinghe, S., Tao, J., Baker, M. L., Wah, C. (2008).
Shape Modeling and Matching in Identifying 3D
Protein Structures. Computer Aided Design, 40 (6),
708-720.
Akbar, S., Kung, J. and Wagner, R. (2006). Exploiting
Geometrical Properties of Protein Similarity Search.
Proceedings of the 17th International Conference on
Database and Expert Systems Applications
(DEXA’06), Krakow, Poland, 228-234.
Andreeva A., Howorth D., Chandonia J.-M., Brenner S.E.,
Hubbard T.J.P., Chothia C., Murzin A.G. (2008). Data
growth and its impact on the SCOP database: new
developments. Nucleic Acids Research, 36, D419-D425.
Berman, H.M. et al. (2000). The Protein Data Bank.
Nucleic Acids Research, 28, 235-242.
Berman, H.M. et al. (2008). The Protein Data Bank.
http://www.wwpdb.org.
Chenyang, C., Zhen, L. (2008). Classification of 3D
Protein based on Structure Information Feature.
International Conference on Biomedical Engineering
and Informatics (BMEI 2008), Sanya, China, 98-101.
Chi, P.H., Scott, G., Shyu, C.-R. (2004). A Fast Protein
Structure System Using Image-Based Distance
Matrices and Multidimensional Index. Proceedings of
the Fourth IEEE Symposium on Bioinformatics and
Bioengineering (BIBE’04), Taichung, Taiwan,
522-532.
Cui, C., Shi, J. (2004). Automatic retrieval of 3D Protein
Structures based on Shape Similarity. SPIE: Storage
and Retrieval Methods and Application for
Multimedia, 5397, 543-549.
Daras, P. et al. (2006). Three-dimensional shape-structure
comparison method for protein classification.
IEEE/ACM Transactions on Computational Biology
and Bioinformatics, 3(3), 193-207.
Huang, Z. et al. (2006). 3D Protein Structure Matching by
Patch Signatures. DEXA 2006, LNCS 4080,
Springer-Verlag, Berlin, 528-537.
Lancia, G., Istrail, S. (2003). Mathematical Methods for
Protein Structure Analysis and Design. C.I.M.E
Summer School Advanced Lectures, Protein Structure
Comparison: Algorithms and Applications, LNBI
2666, Springer-Verlag, Berlin, 1-33.
Paquet, E., Viktor, H.L. (2007). CAPRI - Content-based
Analysis of Protein Structure for Retrieval and
Indexing. VLDB 2007 Workshop on Bioinformatics,
Vienna, Austria, VLDB Press, 10 pp.
Paquet, E., Viktor, H.L. (2007). Discovering Protein
Families using Invariant 3D Shape-based Signatures.
29th Annual International Conference of the IEEE
Engineering in Medicine and Biology Society (ECBS
2006), Lyon, France, 1204-1208.
Paquet, E., Viktor, H. L. (2008). CAPRI/MR: Exploring
Protein Databases from a Structural and
Physicochemical Point of View. 34th International
Conference on Very Large Data Bases (VLDB 2008),
Auckland, New Zealand, 1504-1507.
Ohkawa, T., Nonomura, Y., Inoue, K. (2004). Logical
Cluster Construction in a Grid Environment for
Similar Protein Retrieval. Proceedings of the 2004
International Symposium on Applications and the
Internet Workshops (SAINTW’04), Tokyo, Japan,
5-16.
Park, S.-H., Park, S.-J., Park, S.H. (2005). A Protein
Structure Retrieval System Using 3D Edge Histogram.
Key Engineering Materials, 277-279, 324-330.
Yeh, J.-S., Chen, D.-Y., Ouhyoung, M. (2005). A Web-
based Protein Retrieval System by Matching Visual
Similarity. Bioinformatics, 21 (13), 3056-3057.
Ying, Z., Kaixing, Z., Yuankui, M. (2008). 3D Protein
Structure Similarity Comparison using a Shape
Distribution Method. 5th International Conference on
Information Technology and Applications in
Biomedicine in conjunction with 2nd International
Symposium & Summer School on Biomedical and
Health Engineering, Shenzhen, China, 233-236.
Zaki, M. J., Bystroff, C. (2008). Protein Structure Prediction.
Totowa, NJ: Humana Press.
FINDING PROTEIN FAMILY SIMILARITIES IN REAL TIME THROUGH MULTIPLE 3D AND 2D
REPRESENTATIONS, INDEXING AND EXHAUSTIVE SEARCHING