[Figure 3 graphic: (a) real protein, (b) noisy protein obtained by X-ray (PDB file), (c) torsion angles extracted from the PDB file, (d) remade protein. Labels in the drawing: real vs. noisy torsion angle, real vs. noisy static angle.]
[Second diagram, caption not in this section: panels (a) to (d) as above, plus an optimization process that produces optimized torsion angles before remaking, panel (e).]
Figure 3: (a) A real protein's structure, (b) its PDB structure with noticeable noise in atom positions, (c) torsion angles
extracted from the PDB, (d) the remade protein, whose structure is very different because of cumulative noise.
Figure 2: Representation of a torsion angle in the bond b
from two points of view.
ciently/mathematically represented. All-atom 3D
coordinates, main-atom 3D coordinates, backbone
atom coordinates with side-chain centroids, and
torsion angles are the typical approaches used for
this purpose. As a general rule, representations
based on 3D coordinates share the problem that
a feasible protein cannot always be reconstructed
from the restored 3D information. In contrast,
torsion angles can always represent valid protein
conformations when correct bond lengths and
static angles are available or assumed. Hence,
torsion angles are mainly used to reconstruct and
represent proteins. In this case, each amino acid has
3 torsion angles in the backbone (φ, ψ and ω) and a
variable number of torsion angles in the side chain
(0 to 4 depending on the amino acid), that is, 3 to 7
torsion angles per amino acid. Therefore, for a
medium-size protein with 60 amino acids, the
number of torsion angles can vary between 180 and
420.
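The count above is simple arithmetic, and a short Python sketch makes it concrete for a given sequence. The per-residue side-chain counts in CHI_COUNT are an assumption taken from one common rotamer-library convention (conventions differ slightly, for example for proline); they are not given in this paper. The function names are hypothetical.

```python
# Backbone torsions per residue (phi, psi, omega), as stated in the text.
BACKBONE_TORSIONS = 3

# ASSUMPTION: side-chain (chi) torsion counts per one-letter residue code,
# following one common rotamer-library convention; not taken from the paper.
CHI_COUNT = {
    "G": 0, "A": 0,
    "S": 1, "C": 1, "V": 1, "T": 1,
    "P": 2, "I": 2, "L": 2, "D": 2, "N": 2, "H": 2, "F": 2, "Y": 2, "W": 2,
    "M": 3, "E": 3, "Q": 3,
    "K": 4, "R": 4,
}

def torsion_count_bounds(n_residues):
    """Smallest and largest possible number of torsion angles for a chain."""
    lo = n_residues * BACKBONE_TORSIONS
    hi = n_residues * (BACKBONE_TORSIONS + max(CHI_COUNT.values()))
    return lo, hi

def torsion_count(sequence):
    """Number of torsion angles for a specific one-letter sequence."""
    return sum(BACKBONE_TORSIONS + CHI_COUNT[res] for res in sequence)

print(torsion_count_bounds(60))   # (180, 420), the range quoted in the text
print(torsion_count("GAK"))       # 3 + 3 + 7 = 13
```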
The Protein Data Bank contains all known protein
structures obtained by traditional procedures such
as X-ray and NMR. Although these methods are
assumed to determine protein structures with an
RMSD (Root Mean Square Deviation) of around 2 Å,
depending on the size of the protein, PDB files
always have some level of noise in their 3D
coordinates. Although such noise affects all atoms
of a protein, the overall shape of the constructed
protein is usually fairly similar to the real protein.
Figure 2 represents a torsion angle defined by three
bonds a, b, and c, and Equation 1 shows how such a
torsion angle is calculated. In this equation, a, b
and c are vectors in ℝ³, ‘×’ is the cross product,
‘·’ is the dot product, and atan2(y, x) is the
two-argument arc tangent, which returns the
principal value of the arc tangent of y/x in radians,
using the signs of both arguments to select the
quadrant.
φ = atan2(|b| a · [b × c], [a × b] · [b × c])    (1)
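As an illustration, Equation 1 can be evaluated directly from four consecutive atom positions. The sketch below is a minimal pure-Python version; the function name torsion_angle and the convention that a, b and c are the successive bond vectors p1 - p0, p2 - p1 and p3 - p2 are assumptions made for this example.

```python
import math

def sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def torsion_angle(p0, p1, p2, p3):
    """Torsion about the bond p1-p2 in radians, following Equation 1."""
    a, b, c = sub(p1, p0), sub(p2, p1), sub(p3, p2)   # successive bond vectors
    b_len = math.sqrt(dot(b, b))
    y = b_len * dot(a, cross(b, c))
    x = dot(cross(a, b), cross(b, c))
    return math.atan2(y, x)

# Four atoms forming a right-angle twist about the middle bond.
angle = torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(math.degrees(angle))
```

Because atan2 takes the numerator and denominator separately, the result keeps its sign and covers the full range from -180 to 180 degrees, which a plain arc tangent of y/x would not.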
Although it seems fairly easy to reconstruct a
protein from its torsion angles, noise in these
angles usually yields a protein whose 3D structure
differs considerably from that of the real protein.
To represent the accurate 3D structure of all atoms
of an amino acid with more than 20 atoms, more than
60 real variables (three coordinates per atom) are
needed. Therefore, if only the values of five torsion
angles are used to reconstruct it, a large amount of
information must be assumed. In this case,
reconstruction involves not only the protein's
torsion angles but also fairly accurate assumptions
about its bond lengths (mostly fixed) and bond
angles.
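The cumulative effect of torsion noise can be illustrated with a toy chain. The sketch below rebuilds a chain atom by atom from torsion angles plus one assumed bond length and bond angle, using the standard local-frame placement often called NeRF. This is a stand-in chosen for illustration, not the paper's remaking procedure, and the bond length (1.5), bond angle (110 degrees), torsion values and 2-degree error are all made-up placeholders.

```python
import math

def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def unit(u):
    n = math.sqrt(sum(x * x for x in u))
    return tuple(x / n for x in u)

def place_atom(a, b, c, length, angle, torsion):
    """Place atom d after a, b, c so that |cd| = length, the angle b-c-d
    equals `angle`, and the torsion a-b-c-d equals `torsion` (radians)."""
    bc = unit(sub(c, b))
    n = unit(cross(sub(b, a), bc))   # normal of the a-b-c plane
    m = cross(n, bc)                 # completes the local orthonormal frame
    dx = -length * math.cos(angle)
    dy = length * math.sin(angle) * math.cos(torsion)
    dz = length * math.sin(angle) * math.sin(torsion)
    return tuple(c[i] + dx * bc[i] + dy * m[i] + dz * n[i] for i in range(3))

def build_chain(torsions_deg, length=1.5, angle_deg=110.0):
    """Rebuild a chain from torsions using one ASSUMED bond length and angle."""
    angle = math.radians(angle_deg)
    atoms = [(0.0, 0.0, 0.0),
             (length, 0.0, 0.0),
             (length - length * math.cos(angle), length * math.sin(angle), 0.0)]
    for t in torsions_deg:
        atoms.append(place_atom(*atoms[-3:], length, angle, math.radians(t)))
    return atoms

# Hypothetical torsions; the "noisy" copy carries the same 2-degree error everywhere.
true_torsions = [-60.0, 180.0, 60.0, -120.0, 150.0, -75.0, 170.0, 55.0] * 3
noisy_torsions = [t + 2.0 for t in true_torsions]

reference = build_chain(true_torsions)
rebuilt = build_chain(noisy_torsions)
drift = [math.dist(p, q) for p, q in zip(reference, rebuilt)]
for i, d in enumerate(drift):
    print(f"atom {i:2d}: displaced by {d:.3f}")
```

Each atom is placed relative to the three atoms before it, so an error in one torsion rotates everything downstream. Small per-angle errors can therefore add up to large displacements far along the chain, even though every bond length and bond angle stays exactly at its assumed value.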
This work presents a method that minimizes the
difference between the original and the
remade/reconstructed protein by optimizing the
torsion angles so that they absorb most of the noise
in the known angles and lengths. The optimized
torsion angles can then be used to extract useful
information to facilitate future PSP procedures.
Section 2 describes our procedure, Section 3
presents our experimental results, and Section 4
gives our conclusions.
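To make the idea of torsion angles "absorbing" noise concrete, the following toy sketch adjusts torsion angles so that a chain rebuilt with ideal bond angles moves closer to a target built with slightly different bond angles. Everything here is an assumption made for illustration: the greedy coordinate search is a stand-in rather than the optimizer used in this work, the RMSD is computed without superposition because both chains share their first three atoms, and the chain geometry and all numeric values are placeholders.

```python
import math

def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def unit(u):
    n = math.sqrt(sum(x * x for x in u))
    return tuple(x / n for x in u)

def place_atom(a, b, c, length, angle, torsion):
    """Place the next atom from a bond length, bond angle and torsion (radians)."""
    bc = unit(sub(c, b))
    n = unit(cross(sub(b, a), bc))
    m = cross(n, bc)
    dx = -length * math.cos(angle)
    dy = length * math.sin(angle) * math.cos(torsion)
    dz = length * math.sin(angle) * math.sin(torsion)
    return tuple(c[i] + dx * bc[i] + dy * m[i] + dz * n[i] for i in range(3))

IDEAL_LENGTH = 1.5                  # ASSUMED fixed bond length
IDEAL_ANGLE = math.radians(110.0)   # ASSUMED fixed bond angle

def build_chain(torsions, angles=None, length=IDEAL_LENGTH):
    """Build a chain from torsions; bond angles default to the ideal value."""
    if angles is None:
        angles = [IDEAL_ANGLE] * len(torsions)
    atoms = [(0.0, 0.0, 0.0),
             (length, 0.0, 0.0),
             (length - length * math.cos(IDEAL_ANGLE),
              length * math.sin(IDEAL_ANGLE), 0.0)]
    for t, ang in zip(torsions, angles):
        atoms.append(place_atom(*atoms[-3:], length, ang, t))
    return atoms

def rmsd(p, q):
    """Coordinate RMSD without superposition (both chains share a start)."""
    return math.sqrt(sum(math.dist(a, b) ** 2 for a, b in zip(p, q)) / len(p))

def refine_torsions(torsions, target, step=math.radians(5.0),
                    min_step=math.radians(0.01), max_rounds=500):
    """Greedy coordinate search: keep any single-torsion change that lowers RMSD."""
    best = list(torsions)
    best_err = rmsd(build_chain(best), target)
    for _ in range(max_rounds):
        if step < min_step:
            break
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = best[:]
                trial[i] += delta
                err = rmsd(build_chain(trial), target)
                if err < best_err:
                    best, best_err, improved = trial, err, True
        if not improved:
            step /= 2.0              # no move helped: search more finely
    return best, best_err

def demo():
    true_torsions = [math.radians(-60.0 + 25.0 * i) for i in range(12)]
    # Stand-in for the original structure: bond angles deviate from the ideal value.
    real_angles = [IDEAL_ANGLE + math.radians(3.0) * math.sin(i) for i in range(12)]
    target = build_chain(true_torsions, real_angles)
    before = rmsd(build_chain(true_torsions), target)
    _, after = refine_torsions(true_torsions, target)
    return before, after

before, after = demo()
print(f"RMSD before refinement: {before:.3f}, after: {after:.3f}")
```

Because a trial is accepted only when it lowers the RMSD, the refined torsions can never fit worse than the extracted ones. How much they improve depends on the chain and on the search settings.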
2 PROCEDURE FOR TORSION
ANGLES REFINEMENT
Whenever mathematically calculated torsion angles
are combined with known angles and bond lengths,
the differences between the 3D structure
of the original protein and its remade/reconstructed
BIOINFORMATICS 2011 - International Conference on Bioinformatics Models, Methods and Algorithms