STRUCTURE PREDICTION OF

SIMPLE NON-STANDARD PSEUDOKNOT

Thomas K. F. Wong and S. M. Yiu

Department of Computer Science, The University of Hong Kong, Hong Kong

Keywords:

RNA, Secondary structure prediction, Simple non-standard pseudoknot, Complex pseudoknot.

Abstract:

The secondary structure of an RNA molecule is known to be critical in its biological function. However, the

problem of predicting the secondary structure of an RNA molecule based on its primary sequence is compu-

tationally difﬁcult in the presence of pseudoknots. In general, the problem is NP-hard. Most of the existing

algorithms aim at restricted classes of pseudoknots. In this paper, we consider a new class of pseudoknot

structures, called simple non-standard pseudoknot, which can cover more complicated secondary structures

found in existing databases. None of the previous algorithms can handle this class of pseudoknots. Only two

of them, which run in O(m

) and O(m

) time where m is the length of the given RNA sequence, can han-

dle certain cases in this new class. On the other hand, we provide a prediction algorithm that runs in O(m

)

time for simple non-standard pseudoknots of degree 4 which already covers all known secondary structures of

RNAs in this class.

1 INTRODUCTION

RNA molecules are known to play important roles in

living cells and are involved in many biological pro-

cesses (Frank and Pace, 1998) (Nguyen et al., 2001)

(Yang et al., 2001). The structure of an RNA molecule

provides important information about the functions of

the molecule. Thus, ﬁnding the structure of an RNA

molecule is an important problem. Unfortunately,

ﬁnding or predicting the 3D (or tertiary) structure of

an RNA molecule is a complicated and time consum-

ing task. A more promising direction is to predict the

secondary structure (that is, which pair of bases of

the molecule forms a hydrogen bond) of the molecule

with minimum free energy based on its primary se-

quence. The predicted secondary structure can al-

ready help researchers to deduce the functions of the

molecule. However, predicting secondary structure

of an RNA molecule is not an easy problem and is

computationally difﬁcult in the presence of pseudo-

knots (base pairs crossing each other, secondary struc-

ture without pseudoknots are referred as regular struc-

tures). In general, the problem is NP-hard (Lyngso

and Pedersen, 2000). Most of the existing algorithms

aim at restricted classes of pseudoknots.

Pseudoknot structures can be classiﬁed as fol-

lows in increasing order of complexity: H pseudo-

knots (Lyngso and Pedersen, 2000), simple pseu-

doknots (Akutsu, 2000), standard pseudoknots (see

the deﬁnition in Section 2), recursive pseudoknots

(i.e., pseudoknot/regular structure inside a pseudo-

knot) (Akutsu, 2000). For the deﬁnitions of these

pseudoknot structures, please refer to the given ref-

erences.

Rivas and Eddy were among the ﬁrst who tack-

led the problem of RNA secondary structure predic-

tion with pseudoknots (Rivas and Eddy, 1999). They

described a dynamic programming algorithm to solve

the problem for simple pseudoknots, certain restricted

cases for standard pseudoknots and recursive pseudo-

knots, as well as some restricted cases in a more com-

plicated class of pseudoknots, simple non-standard

pseudoknots, to be deﬁned in Section 2. Their algo-

rithm runs in O(m

) time where m is the length of

the input RNA sequence. Their algorithm is still the

most powerful algorithm that can handle the largest

set of pseudoknot structures including some compli-

cated ones for which none of the existing polynomial-

time algorithms can handle.

Lyngso and Pedersen provided a faster algorithm

that runs in O(m

) time (Lyngso and Pedersen, 2000),

but can only handle H pseudoknots. Later on, Ue-

mura et al. gave an improved prediction algorithm

for H pseudoknots that runs in O(m

) time (Uemura

et al., 1999). The algorithm can, in fact, handle sim-

ple pseudoknots and some limited cases for standard

K. F. Wong T. and M. Yiu S. (2010).

STRUCTURE PREDICTION OF SIMPLE NON-STANDARD PSEUDOKNOT.

In Proceedings of the First International Conference on Bioinformatics, pages 33-38

DOI: 10.5220/0002718000330038

 SciTePress

pseudoknots and recursive pseudoknots. Their algo-

rithm is based on a tree adjoining grammar (TAG) to

model RNA secondary structures that include pseudo-

knots. However, tree adjoining grammar is not easy

to understand.

Akutsu provided another dynamic programming

algorithm that runs in O(m

) time for predicting sim-

ple pseudoknots (Akutsu, 2000). This algorithm is

much simpler than the algorithm given in (Uemura

et al., 1999). He also extended the algorithm to

support recursive simple pseudoknot structures (i.e.,

there can only be simple pseudoknots/regular struc-

tures inside another simple pseudoknot). The algo-

rithm in (Akutsu, 2000) was implemented and eval-

uated by Deogun (Deogun et al., 2004). Recently,

Chen et al. provided a faster algorithm that runs in

O(m

) time (Chen et al., 2009) that handles almost

all pseudoknot structures that can be handled by the

algorithm in (Rivas and Eddy, 1999). The ones that

cannot be handled by the algorithm in (Chen et al.,

2009) are those in which there are three base pairs that

cross one another. Other related work includes (Dirks

and Pierce, 2003) (Reeder and Giegerich, 2004) in

which the pseudoknots considered are also more re-

stricted than those in (Rivas and Eddy, 1999) (Chen

et al., 2009).

Figure 1: A complex psuedoknot structure present in Esh-

erichia coli α mRNA (Gluick and Draper, 1994). It consist

three base pairs that cross one another.

Figure 2: The two complex pseudoknot structures (a) and

(b) listed in (Roland, 2006). The gray box represents a set

of base pairs in the regions.

Our Contributions. In this paper, we consider

a more complicated class of pseudoknot structures,

called simple non-standard pseudoknots, which ex-

tend the current classiﬁcation of pseudoknots to cap-

ture more complicated pseudoknots and include some

cases for which three base pairs cross one another. For

example, it can include a complex structure (shown in

Figure 1) with three base pairs crossing one another

that is known to be a topology present in Escherichia

coli α mRNA (Gluick and Draper, 1994) and some

other complex structures as shownin Figure 2 listed in

(Dirks and Pierce, 2003). We provide an O(m

) time

algorithm for predicting simple non- standard pseu-

doknots of degree 4 which already include all known

structures in this class. Note that our algorithm can

handle all structures deﬁned in the class of simple

non-standard pseudoknots with degree 4 while algo-

rithms in (Rivas and Eddy, 1999) (Akutsu, 2000) can-

not. Our algorithm can be extended for general degree

k with running time O(m

2 SIMPLE NON-STANDARD

PSEUDOKNOTS

Let A = a

be a length-m RNA sequence with

alphabet {A, C, G, U} and M be a secondary struc-

ture of A. M can be represented as a set of

base pair positions, i.e., M = {(i, j)|1 ≤ i ≤ j ≤

m, (a

, a

)is a base pair}. Let M

x,y

⊆ M be the set

of base pairs within the subsequence a

x+1

...a

, 1 ≤

x < y ≤ m, i.e., M

x,y

= {(i, j) ∈ M|x ≤ i < j ≤ y}.

Note that M = M

1,m

. We assume that there is no

two base pairs sharing the same position, i.e., for any

, j

), (i

, j

) ∈ M, i

6= j

, i

6= j

, and i

= i

if and

only if j

= j

Pseudoknots are base pairs that cross each other.

For example, let (i, j) and (i, j), where i < j and i < j,

be two base pairs. They form a pseudoknot if i < i <

j < j or i < i < j < j. M

x,y

is a regular structure if

there does not exist pseudoknots. Note that an empty

set is also considered as a regular structure.

Figure 3: (a) A standard pseudoknot of degree k. (b) A

simple non-standard pseudoknot of degree k (Type I). (c) A

simple non-standard pseudoknot of degree k (Type II).

We now deﬁne a standard pseudoknot of degree k

BIOINFORMATICS 2010 - International Conference on Bioinformatics

as follows. A structure is a standard pseudoknot of

degree k if the RNA sequence can be divided into k

consecutive regions (see Figure 3(a)) such that base

pairs must have end points in adjacent regions and

base pairs that are in the same adjacent regions cannot

cross each other. The formal deﬁnition is as follows.

x,y

is a standard pseudoknot of degree k ≥ 3 if

there exists a set of pivot points x

, x

, ..., x

k−1

(x =

< x

< ... < x

k−1

< x

= y) that satisfy

the following. Let M

(1 ≤ w ≤ k − 1) = {(i, j) ∈

x,y

w−1

≤ i < x

≤ j < x

w+1

}. Note that we allow

j = x

for M

k−1

to resolve the boundary case.

• For each (i, j) ∈ M

x,y

, (i, j) ∈ M

for some 1 ≤

w ≤ k − 1.

• M

(1 ≤ w ≤ k− 1) is a regular structure.

Note that a standard pseudoknot of degree 3 is

simply referred as a simple pseudoknot. Now, we de-

ﬁne a simple non-standard pseudoknot that extends

the standard pseudoknot to include some structures

with three base pairs crossing each other. For a sim-

ple non-standard pseudoknot of degree k, similar to

a standard pseudoknot, the RNA sequence can be di-

vided into k regions with the region at one of the ends

(say, the right end) designated as the special region.

Base pairs with both end points in the ﬁrst k − 1 re-

gions have the same requirements as in a standard

pseudoknot. And there is an extra group of base pairs

that can start in one of the ﬁrst k − 2 regions and end

at the last special region and again these pairs do not

cross each other (see Figure 3(b)). See the formal def-

inition below.

x,y

is a simple non-standard pseudoknot of de-

gree k ≥ 4 (Type I) if there exist x

, ..., x

k−1

and t

where x = x

< x

< ... < x

k−1

< x

= y and 1 ≤

t ≤ k− 2 that satisfy the following. Let M

(1 ≤ w ≤

k − 2) = {(i, j) ∈ M

x,y

w−1

≤ i < x

≤ j < x

+ 1}.

Let X = {(i, j) ∈ M

x,y

t−1

≤ i < x

, x

k−1

≤ j ≤ y}.

• For each (i, j) ∈ M

x,y

, either (i, j) ∈ M

(1 ≤ w ≤

k− 2) or (i, j) ∈ X.

• M

and X is a regular structure.

Type II simple non-standard pseudoknots (see

Figure 3(c)) are symmetric to Type I simple nonstan-

dard pseudoknots with the special region on the left

end. As shown in Figure 4, the complex structures

in Figure 1 and Figure 2 belongs to the simple non-

standard pseudoknot structure. For the sake of sim-

plicity, in the rest of the paper, we only consider Type

I simple non-standard pseudoknots and simply refer

it as simple non-standard pseudoknots. So, we omit

the deﬁnition of Type II simple non-standard pseudo-

knots.

Figure 4: The complex pseudoknot structure (a) and (c) are

type I simple non-standard pseudoknots shown in (b) and

(d) respectively. Structure (e) is type II simple non-standard

pseudoknot shown in (f). The gray region represents a set

of base pairs and the green region represents a set of base

pairs in the special region of simple non-standard pseudo-

knot structure.

3 PROBLEM DEFINITION

In the following, we assume that the secondary struc-

ture of the given RNA sequence is of simple nonstan-

dard pseudoknot. We now deﬁne a free energy model

for a standard non-standard pseduoknot structure. We

use a similar model as deﬁned in (Akutsu, 2000).

The energy model depends on adjacent base pairs and

destabilizing energy. Free energy for adjacent base

pairs usually takes negative values and we consider

an energy function eS(a

, a

i+1

, a

j− 1

) depending on

adjacent base pairs (a

, a

) and (a

i+1

, a

j− 1

). On the

other hand, free energy for destabilizing energy usu-

ally is positive value and we consider a simple energy

function for destabilizing energy, which is determined

by the length of the unpaired bases. Also, Rivas and

Eddy considered an extra energy for initiation of a

new pseudoknot (Rivas and Eddy, 1999). Similarly

we also consider an additional energy for initiation of

a stem.

Roughly speaking, a stem is a pair of maximal re-

gions bounded by two base pairs such that no base

pair with one end inside the regions and another end

outside the region. A stem is deﬁned formally as two

non-overlapping regions [a

i+p

] and [a

j− q

...a

] such

that (i) (a

, a

) and (a

i+p

, a

j− q

) both are base pairs;

(ii) all base pairs (a

, a

) where i ≤ r ≤ i + p and

j − q ≤ s ≤ j do not cross each other; (iii) there does

not exist any base pair such that one end is inside the

regions, but another one is outside the regions; (iv)

the values of p and q are maximum. Note that every

base pair will belong to a stem. Figure 5 illustrates the

idea of stems inside the simple non-standard pseudo-

knot structure and Table 1 lists out the parameters of

STRUCTURE PREDICTION OF SIMPLE NON-STANDARD PSEUDOKNOT

the energy model. Our dynamic programming algo-

rithm (which will be described in the next section) is

designed to compute the minimum energy according

to this simple energy model. The algorithm can be

further extended to consider a more complicated en-

ergy model and include more parameters to increase

the accuracy of the structure prediction.

Figure 5: Dot lines represent base pairs. Two regions on

the RNA sequence covered by the same color are a stem

region. Say [6, 10] ∪ [23, 27] represents a stem region. In

this example, there are 4 stem regions.

Table 1: Parameters of the energy model.

Parameters Description

energy for initiation of a stem

inside a pseudoknot

eS(a

, a

′

, a

′

) energy of two adjacent

pairs closed by (i, j) and (i

′

, j

′

)

where |i

′

− i| = | j − j

′

| = 1

eL(k) energy of k unpaired bases

The problem is deﬁned as follows. Given an RNA

sequence, compute a secondary structure which is a

simple non-standard pseudoknot with minimum free

energy.

4 PREDICTION ALGORITHM

We predict the optimal structure with minimum free

energy using a dynamic programming algorithm. The

core of our algorithm is based on the concept of a sub-

region so that we can ﬁnd the optimal structure recur-

sively. In the following, we ﬁrst explain the concept

of subregion, then we provide the details of the al-

gorithm followed by the time and space complexity

analysis.

4.1 Subregion in Simple Non-standard

Pseudoknot

Since all known RNAs with simple non-standard

pseudoknots are of degree at most 4, in the following,

we only consider degree 4 simple non-standard pseu-

doknots. The subregion is deﬁned in a way such that

we do not have base pairs with one end inside the sub-

region and the other end outside the subregion, thus

enabling us to use dynamic programming approach

to solve the problem. To make it easy to understand

what a subregion is, we redraw the pseudoknot struc-

ture as in Figure 6(b). Based on the way we draw

the structure, it is easy to see that for the optimal sec-

ondary structure, base pairs can be ordered from top

to bottom without crossing each other.

We deﬁne a subregion using four points p, q, r, s

with x ≤ p < q < r < s ≤ y. An example is shown in

Figure 6(c) in which the highlighted part is the sub-

region deﬁned by the four points. Note that when

we predict the secondary structure, we do not actually

know the locations of x

, x

. So, we try all possible

combination of p, q, r, s to deﬁne subregions. These

points are added in Figure 6 to illustrate that there is

always a way to deﬁne a subregion so that base pairs

in the optimal structure will not have one end point

inside the subregion while the other end outside the

subregion. So, for each subregion we deﬁne in the dy-

namic programming algorithm, we will not have base

pairs having one end point inside the subregion and

the other end point outside the subregion.

Figure 6: Subregion of a simple non-standard pseudoknot.

The same deﬁnition of subregion cannot be ap-

plied when t is even. Figure 6(d) shows the problem

by using the same deﬁnition as there can be base pair

with one end point inside the subregion and the other

end point outside the subregion, thus dynamic pro-

gramming approach cannot be applied easily. Note

that the two base pairs that appear to cross each other

in Figure 6(d) is only due to the way we draw it, they

do not actually cross each other,so the structure is still

a simple non-standard pseudoknot. To solve the prob-

lem, we use a different deﬁnition for subregions when

t is even as shown in Figure 6(e). Formally speaking,

we deﬁne a subregion as follows. Let A[x..y] be an

BIOINFORMATICS 2010 - International Conference on Bioinformatics

RNA sequence. Let v = (p, q, r, s) be a quadruple with

x ≤ p < q < r < s ≤ y. If t is odd, deﬁne the subre-

gion R

odd

(v) = [p, q] ∪ [r, s]. For t is even, we need

an additional parameter x

and deﬁne the subregion

even

, v) = [p, q] ∪ [r, x

− 1] ∪ [s, y].

4.2 Dynamic Programming Algorithm

We ﬁrst show how to compute the minimum free en-

ergy of the optimal secondary structure for the case of

odd

Let S

odd

(p, q, r, s) be the minimum free energy

of the optimal secondary structure in R

odd

(p, q, r, s)

where Y ∈ {LP, RP, JP, LE, RE, JE,D} is one of the

possible cases to consider for having the optimal sec-

ondary structure inside R

odd

(p, q, r, s). These cases

are explained in the following. Note that according

to the deﬁnition of simple non-standard pseudoknot,

only (p, q), (q, r), and (p, s) can form a base pair. If

one of them forms a base pair, then we have one of

the following cases: (1) it starts a stem or expand a

stem, but not ends a stem; or (2) it is the last base pair

of a stem. If they form a base pair, then they will be

inside a stem. The following shows all these different

cases.

LP refers to the case that (p, q) forms a base pair

to start a stem or expand the stem, but not ends the

stem (i.e. it does not form the last base pair in the

stem which will be handled by the case D);

RP refers to the case that (q, r) forms a base pair to

start a stem or expand the stem, but not end the stem;

JP refers to the case that (p, s) forms a base pair to

start a stem or expand the stem, but not end the stem;

LE refers to the case that (p, q) is inside a stem but

does not form a base pair;

RE refers to the case that (q, r) is inside a stem but

does not forms base pair;

JE refers to the case that (p, s) is inside a stem but

does not form base pair;

D refers to the case that all bases x ∈ {p, q, r, s}

may either (i) contribute to the last base pair in some

stems, or (ii) do not form base pairs and do not belong

to any stem.

For any two bases a

, a

in the RNA sequence, let

v(a

, a

) = 0 if a

and a

can form a base pair, oth-

erwise let v(a

, a

) = +∞. The following shows how

to compute S

odd

(p, q, r, s) recursively by considering

all possible cases. For example, for S

odd

(p, q, r, s), if

(p, q) form a base pair, we consider the energy if it

starts a stem or expands a stem.

Recurrences:

odd

(p, q, r, s) = v(a

, a

) + min{START, EXPAND}

where START is the case for starting a stem and

EXPAND is the case for extending a stem.

START = S

odd

(p+ 1, q− 1, r, s) + e

EXPAND = min











odd

(p+ 1, q− 1, r, s)

+ eS(p, q, p+ 1, q− 1)

odd

(p+ 1, q− 1, r, s)

Similarly for S

odd

(p, q, r, s) and S

odd

(p, q, r, s).

odd

(p, q, r, s) = min











odd

(p+ 1, q, r, s) + eL(1)

odd

(p, q− 1, r, s) + eL(1)

odd

(p+ 1, q, r, s) + eL(1)

odd

(p, q− 1, r, s) + eL(1)

Similar for S

odd

(p, q, r, s) and S

odd

(p, q, r, s).

odd

(p, q, r, s) = min{CLOSE, BETWEEN}

where CLOSE refers to the closing of a stem and

BETWEEN refers to the case that it is not inside a

stem.

CLOSE = min











v(a

, a

) + S

odd

(p+ 1, q− 1, r, s)

+ eS(p, q, p+ 1, q− 1)

v(a

, a

) + S

odd

(p+ 1, q− 1, r, s)

v(a

, a

) + S

odd

(p, q− 1, r+ 1, s)

+ eS(q, r, q− 1, r+ 1)

v(a

, a

) + S

odd

(p, q− 1, r+ 1, s)

v(a

, a

) + S

odd

(p+ 1, q, r, s− 1)

+ eS(p, s, p + 1, s− 1)

v(a

, a

) + S

odd

(p+ 1, q, r, s− 1)

BETWEEN = min











odd

(p+ 1, q, r, s) + eL(1)

odd

(p, q− 1, r, s) + eL(1)

odd

(p, q, r + 1, s) + eL(1)

odd

(p, q, r, s− 1) + eL(1)

The recurrences for computing the minimum

free energy of the optimal secondary structure for

even

(p, q, r, s, x

) will be similar, but note the ad-

ditional parameter required for this case due to

the slightly different deﬁnition of subregions. Let

even

(p, q, r, s, x

) be the minimum free energy of

the optimal secondary structure in R

even

(p, q, r, s, x

)

where Y ∈ {LP, RP, JP, LE, RE, JE,D} is one of the

possible cases. The deﬁnitions of Y is the same as that

for S

odd

(p, q, r, s) except with (q, s) replacing (p, s) as

for this case (p, s) will not form a base pair, but (q, s)

can form a base pair.

STRUCTURE PREDICTION OF SIMPLE NON-STANDARD PSEUDOKNOT

The minimum free energy of the optimal structure

for the whole RNA sequence is the minimum value

of {min

odd

(1, x, x + 1, n)}, min

y<x

even

(1, y, y +

1, x

, x

)}}.

From the real data, the distance between x

and

the end of the sequence is usually bounded by a small

constant, so we assume that the number of different

values we need to consider is only a small con-

stant. The time complexity of the above algorithm is

O(m

). The memory complexity of the algorithm is

also O(m

) .

5 CONCLUSIONS

In this paper, we consider a new class of pseudoknots

which include more complicated structures that none

of the existing algorithms can handle. We then pro-

vide an O(m

) time algorithm for predicting these

a structure of degree 4 with minimum free energy

which already covers all known secondary structures

of this class in existing databases. We implemented

our algorithm and the running time is reasonable,

which takes about 70sec for a RNA of length about

100 and about 3 times faster than the one in (Rivas

and Eddy, 1999). We will evaluate the accuracy of

the predicted structures once we can locate a set of

appropriate parameters for the energy model. In fact,

there are not many known RNAs with simple non-

standard pseudoknots. One of the reasons may be due

to the limitation of existing computational prediction

tools. With our algorithm, we may be able to pre-

dict more RNAs with such a structure for follow-up

veriﬁcation. Although there are no other more com-

plicated known pseudoknot structures, there is a high

chance that there exist novelRNAs with more compli-

cated structures, so designing efﬁcient prediction al-

gorithms for more complicated pseudoknot structures

remains an important open problem.

REFERENCES

Akutsu, T. (2000). Dynamic programming algorithms for

rna secondary structure prediction with pseudoknots.

Discrete Applied Mathematics, 104:45–62.

Chen, H., Condon, A., and Jabbari, H. (2009). An o(n

)

algorithm for mfe prediction of kissing hairpins and

4- chains in nucleic acids. Journal of Computational

Biology, 16(6):803–815.

Deogun, J., Donis, R., Komina, O., and Ma, F. (2004).

Rna secondary structure prediction with simple pseu-

doknots. In Proceedings of the second confer-

ence on Asia- Paciﬁc Bioinformatics Conference

(APBC2004), pages 239–246.

Dirks, R. and Pierce, N. (2003). A partition function

algorithm for nucleic acid secondary structure in-

cluding pseudoknots. Journal of Comput. Chem,

24(13):1664–1677.

Frank, D. and Pace, N. (1998). Ribonuclease p: unity and

diversity in a trna processing ribozyme. Annu. Rev.

Biochem., 67:153–180.

Gluick, T. and Draper, D. (1994). Thermodynamics of

folding a pseudoknotted mrna fragment. Journal of

Molecular Biology, 241:246–262.

Lyngso, R. and Pedersen, C. (2000). A dynamic program-

ming algorithm for rna structure prediction including

pseudoknots. In Proc. of the Fourth Annual Interna-

tional Conferences on Compututational Molecular Bi-

ology (RECOMB00). ACM Press.

Nguyen, V., Kiss, T., Michels, A., and Bensaude, O. (2001).

7sk small nuclear rna binds to and inhibits the activity

of cdk9/cyclin t complexes. Nature, 414:322–325.

Reeder, J. and Giegerich, R. (2004). Design, implementa-

tion and evaluation of a practical pseudoknot folding

algorithm based on thermodynamics. BMC Bioinfor-

matics, 5:104.

Rivas, E. and Eddy, S. (1999). A dynamic programming

algorithm for rna structure prediction including pseu-

doknots. Journal of Molecular Biology, 285(5):2053–

2068.

Roland, E. (2006). Pseudoknots in rna secondary structures:

representation, enumeration, and prevalence. Journal

of Computational Biology, 13(6):1197–1213.

Uemura, Y., Hasegawa, A., Kobayashi, S., and Yokomori,

T. (1999). Tree adjoining grammars for rna structure

prediction. Theoretical Computer Science, 210:277–

303.

Yang, Z., Zhu, Q., Luo, K., and Zhou, Q. (2001). The 7sk

small nuclear rna inhibits the cdk9/cyclin t1 kinase to

control transcription. Nature, 414:317–322.

BIOINFORMATICS 2010 - International Conference on Bioinformatics