2 SURFACE ANALYSIS SEGMENTATION THROUGH THE CONVEX HULL
The CH of a molecule is the smallest convex
polyhedron that contains all the points of the molecule.
In ℝ³ the CH consists of a set of facets, which are
triangles, and a set of ridges (boundary elements),
which are edges. A practical O(n log n) algorithm for
computing the CH in general dimensions is Quickhull
(Barber, 1996), which uses less space and runs
faster than most other algorithms.
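The divide-and-conquer idea behind Quickhull can be illustrated with a minimal 2D sketch in Python. It is an illustration only, not the general-dimension implementation of (Barber, 1996): in 2D the facets reduce to edges and the ridges to vertices, and the names `cross`, `_hull_side` and `quickhull_2d` are hypothetical.

```python
def cross(o, a, b):
    """Twice the signed area of triangle o-a-b (> 0 if b lies left of o->a)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def _hull_side(a, b, pts):
    """Hull vertices strictly left of the directed edge a->b, ordered from a to b."""
    left = [p for p in pts if cross(a, b, p) > 0]
    if not left:
        return []
    far = max(left, key=lambda p: cross(a, b, p))  # farthest point from a->b
    # Points inside triangle a-far-b can never be hull vertices and are dropped.
    return _hull_side(a, far, left) + [far] + _hull_side(far, b, left)


def quickhull_2d(points):
    """Convex hull of 2D points, as a list of vertices in clockwise order."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    lo, hi = pts[0], pts[-1]  # the extreme points in x always lie on the hull
    return [lo] + _hull_side(lo, hi, pts) + [hi] + _hull_side(hi, lo, pts)


# Illustrative data: four corners of a square plus two interior points.
points = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3)]
hull = quickhull_2d(points)  # the interior points (2, 2) and (1, 3) are discarded
```

Each recursive step discards every point that falls inside the current triangle, which is what keeps the method fast on typical inputs.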
The CH approach to molecular segmentation is
not new. To the authors’ knowledge, the first paper
applying this method is (Meier, 1995). The Quickhull
algorithm is applied to the SAS, which is defined by
the center of a water-sized probe sphere (usually
with a radius between 1.4 and 1.8 Å)
‘rolling over the van der Waals surface of the atom’.
The technique rests on two assumptions: i) the
tips lie directly on the CH surface, i.e. they are the
points common to the molecule and CH surfaces;
ii) inlets and holes are ‘normally’ covered by large
facets of the CH surface. Neither assumption is a
necessary condition (and the second is not even
sufficient) for the existence of true tips and
inlets: they are just a reasonable first hypothesis.
Moreover, the technique for tip segmentation is
based on a heuristic approach: each tip is extended
on the outside facet for a distance determined by a
global parameter.
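The two assumptions can be made concrete with a small 2D sketch in Python. It is a simplified illustration, not the procedure of (Meier, 1995): the hull is computed with Andrew's monotone chain instead of Quickhull, hull edges stand in for the 3D facets, and the threshold `min_span` and the function names are hypothetical.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq):
        chain = []
        for p in seq:
            # Pop points that would make a clockwise (or collinear) turn.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]  # the last point is repeated by the other half

    return half(pts) + half(reversed(pts))


def tips_and_inlets(surface_points, min_span):
    """Tip candidates (hull vertices) and inlet candidates (hull edges >= min_span)."""
    hull = convex_hull(surface_points)
    edges = list(zip(hull, hull[1:] + hull[:1]))
    inlets = [(a, b) for a, b in edges if math.dist(a, b) >= min_span]
    return hull, inlets


# Illustrative contour: 12 points on a unit circle, one of them pushed inwards.
surface = [(math.cos(math.radians(a)), math.sin(math.radians(a)))
           for a in range(0, 360, 30)]
surface[3] = (0.0, 0.2)  # the point at 90 degrees becomes the bottom of a dent
tips, inlets = tips_and_inlets(surface, min_span=0.8)
```

The dented point is not a hull vertex, and the single long hull edge spanning the dent is reported as an inlet candidate. A long edge over a flat side would be reported in the same way, which is why the second assumption is not sufficient on its own.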
Two other approaches, based on the CH
of the atom centers, have been proposed by
Edelsbrunner (Edelsbrunner, 1998) and Xie (Xie,
2007). Both apply the Delaunay triangulation
technique, whose elements in 3D are
tetrahedra, to evaluate some
parameters quantitatively.
The former, through a dual complex (alpha shape)
analysis, provides a quantitative description of the
microenvironments for protein structure-based
design. In particular, the volume and area of pockets
and the area and circumference of their mouth openings
are evaluated. Pockets are identified with
the discrete flow method, that is, from the presence of
Delaunay tetrahedra disjoint from the dual
complex. For segmentation purposes,
some geometrical and topological rules allow the
discrimination between two neighboring tetrahedra
that both satisfy this constraint. Note that not all
the inlets are identified as pockets: those for which
the discrete flow pours to the outside of the CH are not.
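The separation between simplices that belong to the dual complex and those that do not can be sketched in 2D with a circumradius test. This is a simplified, unweighted illustration that ignores the atom radii and works on triangles rather than tetrahedra; it is not the procedure of (Edelsbrunner, 1998). The triangles are assumed to come from an existing Delaunay triangulation, which is not computed here, and the parameter `alpha` and the function names are hypothetical.

```python
import math


def circumradius(a, b, c):
    """Circumradius of the 2D triangle a-b-c (infinite if the points are collinear)."""
    la, lb, lc = math.dist(b, c), math.dist(a, c), math.dist(a, b)
    twice_area = abs((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]))
    if twice_area == 0.0:
        return math.inf
    return la * lb * lc / (2.0 * twice_area)  # R = abc / (4 * area)


def split_by_alpha(triangles, alpha):
    """Split Delaunay triangles into those kept by the alpha complex and the rest."""
    kept, empty = [], []
    for tri in triangles:
        (kept if circumradius(*tri) <= alpha else empty).append(tri)
    return kept, empty


# Illustrative input: a small and a large right triangle.
small = ((0, 0), (1, 0), (0, 1))
large = ((0, 0), (6, 0), (0, 6))
kept, empty = split_by_alpha([small, large], alpha=1.0)
```

In this sketch the triangles left in `empty` play the role of the Delaunay tetrahedra disjoint from the dual complex, i.e. the candidates from which pockets are built.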
The latter approach is based on a simplified
description that requires only the Cα atoms to
represent the protein structure in order to speed up
the computation (making the new representation
“scalable to a large data set … yet robust enough to
handle the intrinsic properties of protein
flexibility”). Moreover, the notion of geometric
potential is introduced: this figure quantitatively
describes the microenvironment on the basis of two
heuristic parameters and allows a fast and effective
discrimination of active sites.
Figure 1: Common 2D representations of surface models
for protein molecules: i) in green, the van der Waals
surface, produced directly from the atom locations
through the van der Waals radii; ii) in red, the
Solvent-Excluded Surface (SES, also known as the molecular
surface or Connolly surface), generated by the envelope of
a sphere rolling over the van der Waals surface (the
radius of the solvent sphere is usually set to the
approximate van der Waals radius of a water molecule,
1.4 Å); iii) in blue, the solvent accessible
surface (SAS, sometimes called the Lee-Richards molecular
surface), generated by the center of the solvent sphere
rolling over the van der Waals surface; iv) in brown, the
convex hull, which also coincides with the SES obtained
with a sphere of infinite radius.
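The definition of the solvent accessible surface as the locus of the probe center can be sketched in 2D, with atoms reduced to circles. This is a minimal sampling illustration under stated assumptions, not the construction used by any of the cited works: the function name `sas_points`, the number of `samples` per atom and the 1.7 Å atom radius in the example are hypothetical choices, while the 1.4 Å probe radius follows the caption above.

```python
import math


def sas_points(atoms, probe_radius=1.4, samples=72):
    """Sample the 2D solvent accessible contour of circular atoms (x, y, vdw_radius)."""
    contour = []
    for i, (xi, yi, ri) in enumerate(atoms):
        reach = ri + probe_radius  # distance of the probe center from atom i
        for k in range(samples):
            angle = 2.0 * math.pi * k / samples
            p = (xi + reach * math.cos(angle), yi + reach * math.sin(angle))
            # Keep p only if a probe centered there overlaps no other atom.
            if all(math.dist(p, (xj, yj)) >= rj + probe_radius
                   for j, (xj, yj, rj) in enumerate(atoms) if j != i):
                contour.append(p)
    return contour


# Illustrative input: two overlapping atoms with an assumed radius of 1.7 Å.
atoms = [(0.0, 0.0, 1.7), (3.0, 0.0, 1.7)]
contour = sas_points(atoms)
```

Sample points that would place the probe inside a neighboring atom are rejected, so only the outer envelope of the two enlarged circles remains.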
The CH is also the reference surface for
molecular analysis based on the ‘travel depth’. The
travel depth parameter (Coleman, 2006; Giard,
2008), with reference to the SES (see Figure 1), is
defined as the length of the shortest path accessible to a
solvent molecule between the protein convex hull and a
given point that belongs to the ‘active’ region of
interest (ROI) delimited by the CH and the SES. It
represents the physical distance that a ‘sufficiently’
small molecule has to travel to reach a surface
position (the pocket bottoms are usually the points
of interest). Tunnels are a particular case, i.e.
pockets with no ‘bottom’, in which the
molecule can travel through the entire protein and
the path is delimited by two points belonging
to the CH. In particular, (Coleman, 2006) introduced
a technique for computing the travel depth on the
basis of a particular distance transform
implementation in the ROI defined above. The
BIOINFORMATICS 2010 - International Conference on Bioinformatics