Multiple RNA Interaction

Formulations, Approximations, and Heuristics

Saad Mneimneh [1,∗], Syed Ali Ahmed [1,†] and Nancy L. Greenbaum [2,‡]

[1] Department of Computer Science, Hunter College, City University of New York, New York, U.S.A.

[2] Department of Chemistry, Hunter College, City University of New York, New York, U.S.A.

Keywords:

Multiple RNA Interaction, RNA Structure, NP-hardness, Dynamic Programming, Approximation Algorithms,

Heuristics.

Abstract:

The interaction of two RNA molecules involves a complex interplay between folding and binding that war-

ranted recent developments in RNA-RNA interaction algorithms. However, biological mechanisms in which

more than two RNAs take part in an interaction exist. Therefore, we formulate multiple RNA interaction as a

computational problem, which not surprisingly turns out to be NP-hard. Our experiments with approximation algorithms and heuristics for the problem suggest that this formulation is indeed useful to determine

interaction patterns of multiple RNAs when information about which RNAs interact is not necessarily avail-

able (as opposed to the case of two RNAs where one must interact with the other), and because the resulting

RNA structure often cannot be predicted by existing algorithms when RNAs are simply handled in pairs.

1 INTRODUCTION

The interaction of two RNA molecules has been in-

dependently formulated as a computational problem

in several works, e.g. (Pervouchine, 2004; Alkan

et al., 2006; Mneimneh, 2009). In their most general

form, these formulations lead to NP-hard problems.

To overcome this hurdle, researchers have been either

reverting to approximation algorithms, or imposing

algorithmic restrictions; for instance, analogous to the

avoidance of pseudoknot formation in the folding of

RNAs.

While these algorithms had limited use in the be-

ginning, they became important venues for (and in

fact popularized) an interesting biological fact: RNAs

interact. For instance, micro-RNAs (miRNAs) bind

to a complementary part of messenger RNAs (mR-

NAs) and inhibit their translation (Meyer, 2008). One

might argue that such a simple interaction does not

present a pressing need for RNA-RNA interaction al-

gorithms; however, more complex forms of RNA-

RNA interaction exist. In E. coli, CopA binds to the

ribosome binding site of CopT, also as a regulation

mechanism to prevent translation (Kolb et al., 2000);

∗ Supported by NSF Award CCF-AF 1049902.

† Supported by the above and a CUNY Graduate Center Science Fellowship.

‡ Supported by NSF Award MCB 0929394.

so does OxyS to fhlA (Argaman and Altuvia, 2000).

In both of these structures, the simultaneous folding

(within the RNA) and binding (to the other RNA) are

difficult to predict as separate events. To account for this, most of the RNA-RNA interaction algorithms calculate the probability for a pair of subsequences (one of each RNA) to participate in the interaction, and in doing so they generalize the energy

model used for the partition function of a single RNA

to the case of two RNAs (Muckstein et al., 2006; Chit-

saz et al., 2009a; Chitsaz et al., 2009b; Salari et al.,

2010; Huang et al., 2009; Li et al., 2010). This gen-

eralization takes into consideration the simultaneous

aspect of folding and binding.

Not surprisingly, there exist other mechanisms in

which more than two RNA molecules take part in an

interaction. Typical scenarios involve the interaction

of multiple small nucleolar RNAs (snoRNAs) with ri-

bosomal RNAs (rRNAs) in guiding the methylation

of the rRNAs (Meyer, 2008), and multiple small nuclear RNAs (snRNAs) with mRNAs in the splicing

of introns (Sun and Manley, 1995). Even with the

existence of a computational framework for a sin-

gle RNA-RNA interaction, it is reasonable to believe

that interactions involving multiple RNAs are generally too complex to be treated pairwise. In addition,

given a pool of RNAs, it is not trivial to predict which

RNAs interact without some prior biological information.

Mneimneh S., Ali Ahmed S. and L. Greenbaum N.
Multiple RNA Interaction - Formulations, Approximations, and Heuristics.
DOI: 10.5220/0004341402420249
In Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms (BIOINFORMATICS-2013), pages 242-249
ISBN: 978-989-8565-35-8
© 2013 SCITEPRESS (Science and Technology Publications, Lda.)

To the best of our knowledge, no formulations

and/or algorithms exist for the problem of multiple

RNA interaction. Based on the above narrative, we

formulate this problem by bringing forward an op-

timization perspective where each part of an RNA

will contribute certain weights to the entire interaction

when binding to different parts of other RNAs and we,

therefore, seek to maximize the total weight. This

notion of weight can be obtained by using existing

RNA-RNA interaction algorithms on pairs of RNAs.

We call our formulation the Pegs and Rubber Bands

problem. We show that under certain restrictions,

which are similar to those against pseudoknots, the

problem remains NP-hard (in fact it becomes equiv-

alent to a special instance of the interaction of two

RNAs). We describe a polynomial time approximation scheme (PTAS) for the problem, some heuristics, and experimental results. For instance, given a pool of

RNAs in which interactions between pairs of RNAs

are known, our algorithm is capable of identifying

those pairs and predicting satisfactorily the pattern

of interaction between them (Chitsaz et al., 2009a).

Moreover, our algorithm ﬁnds the correct interaction

of a given instance of splicing consisting of two snR-

NAs (a modiﬁed U2-U6 human snRNA complex) and

two structurally autonomous parts of an intron (Zhao

et al., ), a total of four RNAs. When (partially) mix-

ing the two examples in one pool, our algorithm struc-

turally separates them.

2 PEGS AND RUBBER BANDS: A

FORMULATION

We introduce an optimization problem we call Pegs

and Rubber Bands that will serve as a base framework

for the multiple RNA interaction problem. The link

between the two problems will be made shortly after

the description of Pegs and Rubber Bands.

Consider m levels numbered 1 to m, with $n_l$ pegs in level l numbered 1 to $n_l$. There is an infinite supply

of rubber bands that can be placed around two pegs

in consecutive levels. For instance, we can choose to

place a rubber band around peg i in level l and peg j

in level l + 1; we call it a rubber band at [l,i, j]. Every

such pair of pegs [l,i] and [l + 1, j] contribute their

own weight w(l,i, j). The Pegs and Rubber Bands

problem is to maximize the total weight by placing

rubber bands around pegs in such a way that no two

rubber bands intersect. In other words, each peg can have at most one rubber band around it, and if a rubber band is placed at $[l, i_1, j_1]$ and another at $[l, i_2, j_2]$, then $i_1 < i_2 \Leftrightarrow j_1 < j_2$. We assume without loss of generality that $w(l, i, j) \neq 0$ to avoid the unnecessary placement of rubber bands and, therefore, either $w(l, i, j) > 0$ or $w(l, i, j) = -\infty$. Figure 1 shows an example.


Figure 1: Pegs and Rubber Bands. All positive weights are

equal to 1 and are represented by dashed lines. The optimal

solution achieves a total weight of 8.

Given an optimal solution, it can always be re-

constructed from left to right by repeatedly placing

some rubber band at [l, i, j] such that, at the time

of this placement, no rubber band is around peg

[l, k] for k > i and no rubber band is around peg

[l + 1,k] for k > j. This process can be carried out

by a dynamic programming algorithm to compute

the maximum weight (in exponential time), by defining $W(i_1, \ldots, i_m)$ to be the maximum weight when we truncate the levels at pegs $[1, i_1], [2, i_2], \ldots, [m, i_m]$ (see Figure 2). The maximum weight is given by $W(n_1, \ldots, n_m)$ and the optimal solution can be obtained by standard backtracking. When all levels have O(n) pegs, this algorithm runs in $O(mn^m)$ time and $O(n^m)$ space.

2.1 Multiple RNA Interaction as Pegs

and Rubber Bands

To provide some initial context we now describe how

the formulation of Pegs and Rubber Bands, though

in a primitive way, captures the problem of multiple

RNA interaction. We think of each level as an RNA

and each peg as one base of the RNA. The weight

w(l,i, j) corresponds to the negative of the energy

contributed by the binding of the i-th base of RNA l to the j-th base of RNA l + 1. It should be clear, therefore, that an optimal solution for Pegs and Rubber

Bands represents the lowest energy conformation in

a base-pair energy model, when a pseudoknot-like re-

striction is imposed on the RNA interaction (rubber

bands cannot intersect). In doing so, we obviously

assume that an order on the RNAs is given with al-

ternating sense and antisense, and that the ﬁrst RNA

interacts with the second RNA, which in turn inter-

acts with the third RNA, and so on. We later relax

this ordering and condition on the interaction pattern of the RNAs. While a simple base-pairing model is not likely to give realistic results, our goal for the moment is simply to establish a correspondence between the two problems.

$$
W(i_1, \ldots, i_m) = \max
\begin{cases}
W(i_1 - 1, i_2, \ldots, i_m) \\
W(i_1, i_2 - 1, \ldots, i_m) \\
\quad\vdots \\
W(i_1, \ldots, i_{m-1}, i_m - 1) \\
W(i_1 - 1, i_2 - 1, i_3, \ldots, i_m) + w(1, i_1, i_2) \\
W(i_1, i_2 - 1, i_3 - 1, \ldots, i_m) + w(2, i_2, i_3) \\
\quad\vdots \\
W(i_1, \ldots, i_{m-2}, i_{m-1} - 1, i_m - 1) + w(m - 1, i_{m-1}, i_m)
\end{cases}
$$

where $W(0, 0, \ldots, 0) = 0$.

Figure 2: Dynamic programming algorithm for Pegs and Rubber Bands.
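The recurrence of Figure 2 can be sketched as a memoized recursion. The Python below is a minimal illustration under stated assumptions, not the authors' implementation: the name `max_weight`, the `sizes` list, and the dictionary encoding of w (a missing key stands in for a weight of −∞) are choices made here for the example. It is exponential in the number of levels and only meant for tiny instances.

```python
from functools import lru_cache

def max_weight(sizes, w):
    """Maximum total weight for Pegs and Rubber Bands (recurrence of Figure 2).

    sizes[l-1] is the number of pegs in level l (levels are 1-indexed).
    w[(l, i, j)] is the positive weight of a band around peg i of level l
    and peg j of level l+1; a missing key is treated as -infinity.
    """
    m = len(sizes)

    @lru_cache(maxsize=None)
    def W(idx):
        if all(i == 0 for i in idx):
            return 0
        best = float("-inf")
        # Case 1: the last peg of some level carries no band; drop it.
        for l in range(m):
            if idx[l] > 0:
                best = max(best, W(idx[:l] + (idx[l] - 1,) + idx[l + 1:]))
        # Case 2: a band joins the last pegs of two consecutive levels.
        for l in range(m - 1):
            i, j = idx[l], idx[l + 1]
            if i > 0 and j > 0 and (l + 1, i, j) in w:
                prev = idx[:l] + (i - 1, j - 1) + idx[l + 2:]
                best = max(best, W(prev) + w[(l + 1, i, j)])
        return best

    return W(tuple(sizes))
```

For example, with two levels of two pegs each and weights w(1,1,1) = 1, w(1,2,2) = 1, w(1,1,2) = 5, and w(1,2,1) = 4, the two heaviest bands cross, so the best non-intersecting choice is the single band of weight 5.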

2.2 Complexity of the Problem and

Approximations

With the above correspondence in mind, the problem

of Pegs and Rubber Bands can be viewed as an instance of a classical RNA-RNA interaction involving only two RNAs. We construct the first as RNA 1 followed by RNA 4 reversed, followed by RNA 5, followed by RNA 8 reversed, and so on, and the second

as RNA 2 followed by RNA 3 reversed followed by

RNA 6 followed by RNA 7 reversed and so on, as

shown in Figure 3.

[Figure 3 diagram: first RNA = RNA 1, RNA 4, RNA 5, RNA 8; second RNA = RNA 2, RNA 3, RNA 6, RNA 7.]

Figure 3: Pegs and Rubber Bands as a special instance of

RNA-RNA interaction, vertical and curved lines indicate

the binding/folding pattern of interaction.
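The construction of Figure 3 can be sketched in a few lines. This is only an illustration of the concatenation order described above; the function name `fold_into_two` and the representation of each level as a Python string are assumptions made for the example.

```python
def fold_into_two(levels):
    """Concatenate levels 1, 4, 5, 8, ... into a first sequence and
    levels 2, 3, 6, 7, ... into a second one, reversing levels 3, 4, 7, 8, ...

    levels[0] is level 1. Returns the pair (first, second).
    """
    first, second = [], []
    for number, seq in enumerate(levels, start=1):
        r = number % 4
        target = first if r in (1, 0) else second   # 1, 4, 5, 8 -> first
        piece = seq[::-1] if r in (3, 0) else seq    # 3, 4, 7, 8 are reversed
        target.append(piece)
    return "".join(first), "".join(second)
```

With five toy levels "AB", "CD", "EF", "GH", "IJ", the first sequence is level 1, level 4 reversed, and level 5, and the second is level 2 followed by level 3 reversed.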

Therefore, Pegs and Rubber Bands can be solved

as an RNA-RNA interaction problem. While this

RNA-RNA interaction represents a restricted instance

of the more general NP-hard problem, it is still NP-

hard. In fact, Pegs and Rubber Bands itself is NP-

hard.

Theorem 1. Pegs and Rubber Bands is NP-hard.

Proof: We make a reduction from the longest

common subsequence (LCS) for a set of binary

strings, which is an NP-hard problem. In this reduc-

tion, pegs are labeled and w(l,i, j) depends only on

the label of peg [l, i] and the label of peg [l + 1, j]. We

describe this weight as a function of labels shortly.

Each binary string is modiﬁed by adding the sym-

bol b between every two consecutive bits. A string

of original length n is then transformed into two con-

secutive (identical) levels of 2n− 1 pegs each, where

each peg is labeled by the corresponding symbol in

{0,1,b}. For any given integer k, the ﬁrst and last

levels consist of k pegs labeled ∗. We now deﬁne the

weight as a function of labels: w(0,0) = w(1,1) =

w(b,b) = w(∗,0) = w(∗,1) = w(0,∗) = w(1,∗) = 1

and w(x,y) = −∞ otherwise. It is easy to verify that

the strings have a common subsequence of length k

if and only if the optimal solution has a weight of

$\sum_{i=1}^{m} (2n_i - 1) + k = 2\sum_{i=1}^{m} n_i - m + k$ (when every peg has a rubber band around it), where $n_i$ is the original length of string i and m is the number of strings.

* * * *

| | | |

0b0b1b0b1b1b1

|| | | | ||||

0b0b1b0b1b1b1

| | | |

0b1b0b1b0

| | | ||

0b1b0b1b0

| | | |

1b0b0b1b0b1

|||| | | |

1b0b0b1b0b1

| | | |

* * * *

Figure 4: Reduction from LCS for

{0010111,01010,100101} to Pegs and Rubber Bands

(the symbol | denotes a rubber band). The optimal solution

with weight 2(7 + 5 + 6) − 3 + 4 = 37 corresponds to a

common subsequence of length 4, namely 0101.

While our problem is NP-hard, we can show that

the same formulation can be adapted to obtain a polynomial time approximation. A maximization problem admits a polynomial time approximation scheme (PTAS) iff for every fixed ε > 0 there is an algorithm with a running time polynomial in the size of the input that finds a solution within (1 − ε) of optimal (Cormen et al., ). We show below that we can find a solution within (1 − ε) of optimal in time $O(m \lceil 1/\varepsilon \rceil n^{\lceil 1/\varepsilon \rceil})$, where m is the number of levels and each level has O(n) pegs.

Theorem 2. Pegs and Rubber Bands admits a PTAS.

Proof: Let OPT be the weight of the opti-

mal solution and denote by W[i... j] the weight of

the optimal solution when the problem is restricted

to levels i,i + 1,..., j (a sub-problem). For a given

ε > 0, let $k = \lceil 1/\varepsilon \rceil$. Consider the following k solutions (weights), each obtained by a concatenation of optimal solutions for sub-problems consisting of at most k levels.

$$
\begin{aligned}
W_1 &= W[1 \ldots 1] + W[2 \ldots k+1] + W[k+2 \ldots 2k+1] + \ldots \\
W_2 &= W[1 \ldots 2] + W[3 \ldots k+2] + W[k+3 \ldots 2k+2] + \ldots \\
&\;\;\vdots \\
W_k &= W[1 \ldots k] + W[k+1 \ldots 2k] + W[2k+1 \ldots 3k] + \ldots
\end{aligned}
$$

It is easy to verify that every pair of consecutive levels appears in exactly k − 1 of the above sub-problems. Therefore,

$$
\sum_{i=1}^{k} W_i \ge (k-1)\,OPT \;\Rightarrow\; \max_i W_i \ge \frac{k-1}{k}\,OPT \ge (1-\varepsilon)\,OPT
$$

If m is the total number of levels, then there are O(m) sub-problems of at most k levels each and, therefore, the running time required to find $\max_i W_i$ when every level has O(n) pegs is $O(mkn^k) = O(m \lceil 1/\varepsilon \rceil n^{\lceil 1/\varepsilon \rceil})$.
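The k candidate solutions used in the proof can be enumerated mechanically. The sketch below is illustrative only: `partitions` lists the blocks of levels behind each candidate, and `ptas_weight` takes a hypothetical callable `block_weight(lo, hi)` that is assumed to return the optimal weight W[lo...hi] of a sub-problem (for instance, computed by the dynamic program of Figure 2). Neither name comes from the paper.

```python
def partitions(m, k):
    """Yield the k ways of cutting levels 1..m into blocks of at most k levels.

    The i-th partition starts with the block [1..i] and continues with
    blocks of k consecutive levels; the last block is truncated at m.
    """
    for i in range(1, k + 1):
        blocks, lo, hi = [], 1, min(i, m)
        while lo <= m:
            blocks.append((lo, hi))
            lo, hi = hi + 1, min(hi + k, m)
        yield blocks

def ptas_weight(m, k, block_weight):
    """Best of the k concatenated solutions; block_weight is an assumed oracle."""
    return max(sum(block_weight(lo, hi) for lo, hi in blocks)
               for blocks in partitions(m, k))
```

For m = 7 and k = 3, the first partition is [1..1], [2..4], [5..7], and each pair of consecutive levels lies inside a block in exactly k − 1 = 2 of the three partitions, which is the counting fact the proof relies on.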

For a given integer k, the (1 − 1/k)-factor approximation algorithm is to simply choose the best $W_i = W[1 \ldots i] + W[i+1 \ldots i+k] + W[i+k+1 \ldots i+2k] + \ldots$ as a solution, where $W[i \ldots j]$ denotes the weight of the optimal solution for the sub-problem consisting of levels i, i + 1, ..., j. However, as a practical step, and instead of using the $W_i$'s for the comparison, we can fill in for each $W_i$ some additional rubber bands (interactions) between levels (RNAs) i and i + 1, between level i + k and level i + k + 1, and so on, by identifying the pegs of these levels (regions of RNAs) that are not part of the solution. This does not affect the theoretical guarantee but gives a larger weight to the solution. We call it gap filling.

3 WINDOWS AND GAPS: A

BETTER FORMULATION FOR

RNA INTERACTION

In the previous section, we described our initial at-

tempt to view the interaction of m RNAs as a Pegs and

Rubber Bands problem with m levels, where the ﬁrst

RNA interacts with the second RNA, and the second

with the third, and so on (so they alternate in sense

and antisense). This used a simple base-pair energy

model, which is not too realistic. We now address this

issue (and leave the issues of the ordering and the in-

teraction pattern to the next section). A better model

for RNA interaction will consider windows of inter-

action instead of single bases. In terms of our Pegs

and Rubber Bands problem, this translates to placing

rubber bands around a stretch of contiguous pegs in

two consecutive levels, e.g. around pegs $[l, i_1]$, $[l, i_2]$, $[l+1, j_1]$, and $[l+1, j_2]$, where $i_2 \ge i_1$ and $j_2 \ge j_1$. The weight contribution of placing such a rubber band is now given by $w(l, i_2, j_2, u, v)$, where $i_2$ and $j_2$ are the last two pegs covered by the rubber band in level l and level l + 1, and $u = i_2 - i_1 + 1$ and $v = j_2 - j_1 + 1$ represent the length of the two windows covered in level l and level l + 1, respectively.


Figure 5: A rubber band can now be placed around a win-

dow of pegs, here u = 3 and v = 2 in the big window.

As a heuristic, we also allow for the possibility

of imposing a gap g ≥ 0 between windows as it may

be energetically favorable for interaction regions on

a single RNA not to be too close (which is not cap-

tured by the maximization of total weight). This gap

is also taken into consideration when we perform the

gap ﬁlling procedure described at the end of Section

2. The modiﬁed algorithm is shown in Figure 6, and

if we set u = v = 1 and g = 0, then we retrieve the

original algorithm of Figure 2.
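The windows and gaps variant can be sketched in the same memoized style as the basic recurrence. The code below is a hedged illustration, not the authors' implementation: the name `max_weight_windows`, the parameter `wmax` (the maximum window length, called w in the text), and the dictionary encoding of weights (missing keys standing in for −∞) are assumptions made for the example.

```python
from functools import lru_cache

def max_weight_windows(sizes, w, wmax, g=0):
    """Maximum weight with windows of length up to wmax and a gap g.

    w[(l, i, j, u, v)] is the weight of a band whose windows end at peg i of
    level l and peg j of level l+1 and have lengths u and v respectively.
    """
    m = len(sizes)

    @lru_cache(maxsize=None)
    def W(idx):
        if all(i == 0 for i in idx):
            return 0
        best = float("-inf")
        # The last peg of some level is not covered by any window.
        for l in range(m):
            if idx[l] > 0:
                best = max(best, W(idx[:l] + (idx[l] - 1,) + idx[l + 1:]))
        # A window pair ends at the last pegs of two consecutive levels.
        for l in range(m - 1):
            i, j = idx[l], idx[l + 1]
            for u in range(1, min(wmax, i) + 1):
                for v in range(1, min(wmax, j) + 1):
                    key = (l + 1, i, j, u, v)
                    if key in w:
                        # Skip the windows plus g pegs, clamped at 0.
                        prev = (idx[:l]
                                + (max(0, i - u - g), max(0, j - v - g))
                                + idx[l + 2:])
                        best = max(best, W(prev) + w[key])
        return best

    return W(tuple(sizes))
```

On two levels of five pegs with windows of length 2 ending at pegs 5, 3, and 2 (weights 3, 3, and 2), a gap of 0 lets the first two windows sit side by side, while a gap of 1 forces the second window further left, so the lighter one is used instead.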

The running time of the modified algorithm is $O(mw^2 n^m)$ and $O(mw^2 \lceil 1/\varepsilon \rceil n^{\lceil 1/\varepsilon \rceil})$ for the approximation scheme, where w is the maximum window length. If we impose that u = v, then those running times become $O(mwn^m)$ and $O(mw \lceil 1/\varepsilon \rceil n^{\lceil 1/\varepsilon \rceil})$ respectively.

$$
W(i_1, \ldots, i_m) = \max
\begin{cases}
W(i_1 - 1, i_2, \ldots, i_m) \\
W(i_1, i_2 - 1, \ldots, i_m) \\
\quad\vdots \\
W(i_1, \ldots, i_{m-1}, i_m - 1) \\
W((i_1 - u - g)^+, (i_2 - v - g)^+, i_3, \ldots, i_m) + w(1, i_1, i_2, u, v) \\
W(i_1, (i_2 - u - g)^+, (i_3 - v - g)^+, \ldots, i_m) + w(2, i_2, i_3, u, v) \\
\quad\vdots \\
W(i_1, \ldots, i_{m-2}, (i_{m-1} - u - g)^+, (i_m - v - g)^+) + w(m - 1, i_{m-1}, i_m, u, v)
\end{cases}
$$

where $x^+$ denotes $\max(0, x)$, $w(l, i, j, u, v) = -\infty$ if $u > i$ or $v > j$, $0 < u, v \le w$ (the maximum window size), $g \ge 0$ (the gap), and $W(0, 0, \ldots, 0) = 0$.

Figure 6: Modified dynamic programming algorithm for Pegs and Rubber Bands with the windows and gaps formulation.

For the correctness of the algorithm, we now have to assume that windows are sub-additive. In other words, we require the following condition (otherwise, the algorithm may compute an incorrect optimum due to the possibility of achieving the same window by two or more smaller ones with higher total weight):

$$
w(l, i, j, u_1, v_1) + w(l, i - u_1, j - v_1, u_2, v_2) \le w(l, i, j, u_1 + u_2, v_1 + v_2)
$$

In our experience, most existing RNA-RNA in-

teraction algorithms produce weights (the negative

of the energy values) of RNA interaction windows

that mostly conform to the above condition. In rare

cases, we ﬁlter the windows to eliminate those that

are not sub-additive. For instance, if the above condition is not met, we set $w(l, i, j, u_1, v_1) = w(l, i - u_1, j - v_1, u_2, v_2) = -\infty$.
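The filtering step can be sketched as a single pass over the window weights. The code below is an illustration under assumptions, not the procedure actually used in the experiments: the name `filter_subadditive` is hypothetical, weights are held in a dictionary keyed by (l, i, j, u, v), and the condition is only checked when the merged window is itself present in the dictionary (how unscored windows are treated is not specified in the text).

```python
NEG_INF = float("-inf")

def filter_subadditive(w):
    """Return a copy of w in which every pair of adjacent windows that
    outweighs the merged window covering both is set to -infinity."""
    out = dict(w)
    for (l, i, j, u1, v1), a in w.items():
        for (l2, i2, j2, u2, v2), b in w.items():
            # The second window must end exactly where the first one starts.
            if (l2, i2, j2) != (l, i - u1, j - v1):
                continue
            merged = w.get((l, i, j, u1 + u2, v1 + v2))
            if merged is not None and a + b > merged:
                out[(l, i, j, u1, v1)] = NEG_INF
                out[(l2, i2, j2, u2, v2)] = NEG_INF
    return out
```

The quadratic scan keeps the sketch short; grouping windows by their end pegs would avoid comparing unrelated pairs.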

4 INTERACTION PATTERN AND

PERMUTATIONS: A

HEURISTIC

We now describe how to relax the ordering and the

condition on the interaction pattern of the RNAs.

We ﬁrst identify each RNA as being even (sense) or

odd (antisense), but this convention can obviously be

switched. Given m RNAs and a permutation on the

set {1, ..., m}, we map the RNAs onto the levels of

a Pegs and Rubber Bands problem as follows: We

place the RNAs in the order in which they appear

in the permutation on the same level as long as they

have the same parity (they are either all even or all

odd). We then increase the number of levels by one,

and repeat. RNAs that end up on the same level are

virtually considered as one RNA that is the concate-

nation of all. However, in the corresponding Pegs

and Rubber Bands problem, we do not allow win-

dows to span multiple RNAs, nor do we enforce a

gap between two windows in different RNAs. For ex-

ample, if we consider the following permutation of

RNAs {1, 3, 4, 7, 5, 8, 2, 6}, where the RNA number

also indicates its parity (for the sake of illustration),

then we end up with the following placement: RNA

1 and RNA 3 in that order on the ﬁrst level, followed

by RNA 4 on the second level, followed by RNA 7

and RNA 5 in that order on the third level, followed

by RNA 8, RNA 2, and RNA 6 in that order on the

fourth level, resulting in four virtual RNAs on four

levels of pegs as shown in Figure 7.

---RNA 1---RNA 3---

---RNA 4---

---RNA 7---RNA 5---

---RNA 8---RNA 2---RNA 6---

Figure 7: Placement of the permutation {1, 3, 4, 7, 5, 8, 2, 6}, where the RNA number also indicates its parity. The interaction pattern is less restrictive than before; for instance, RNA 7 can interact with RNA 2, RNA 4, RNA 6, and RNA 8.
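The mapping from a permutation to levels is a single left-to-right scan. The sketch below is a minimal illustration; the function name `place` and the `parity` dictionary are assumptions made for the example.

```python
def place(permutation, parity):
    """Group consecutive RNAs of equal parity onto the same level.

    permutation is a sequence of RNA identifiers and parity[rna] is 0 or 1.
    Returns a list of levels, each a list of RNAs in order.
    """
    levels = []
    for rna in permutation:
        if levels and parity[levels[-1][-1]] == parity[rna]:
            levels[-1].append(rna)   # same parity: stay on the current level
        else:
            levels.append([rna])     # parity changes: open a new level
    return levels
```

Applied to the permutation above, with the parity of each RNA taken from its number, this reproduces the four levels of Figure 7.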

But what is the best placement as a Pegs and Rub-

ber Bands problem for a given set of RNAs? Figure 8

shows a possible (greedy) heuristic that will attempt

to answer this question by ﬁnding the best permuta-

tion (and recall that the permutation uniquely deter-

mines the placement).

Given ε = 1/k and m RNAs
produce a random permutation π on {1, ..., m}
let W be the weight of the (1 − ε)-optimal solution given π
repeat
    better ← false
    generate a set Π of neighboring permutations for π
    for every π′ ∈ Π (in any order)
        let W′ be the weight of the (1 − ε)-optimal solution given π′
        if W′ > W
            then W ← W′
                 π ← π′
                 better ← true
until not better

Figure 8: A heuristic for multiple RNA interaction using the

PTAS algorithm.

OxyS 5’ ...CCCUUG... ...GUG... ...UCCAG... 3’
           ||||||       |||       |||||
fhlA 3’ ...GGGAAC... ...CAC... ...AGGUC... 5’

CopA 5’ ...CGGUUUAAGUGGG... ...UUUCGUACUCGCCAAAGUUGAAGA... ...UUUUGCUU 3’
           |||||||||||||       ||||||||||||||||||||||||       ||||||||
CopT 3’    GCCAAAUUCACCC... ...AAAGCAUGAGCGGUUUCAACUUCU... ...AAAACGAA 5’

MicA 5’ ...GCGCA... ...CUGUUUUC... ...CGU... 3’
           |||||       ||||||||       |||
lamB 3’ ...CGCGU... ...GAUAGAGG... ...GCA... 5’

Figure 9: Known pairs of interacting RNAs.

To generate neighboring permutations for this heuristic algorithm one could adapt a standard 2-opt method used in the Traveling Salesman Problem (or other techniques). For instance, given permutation π, a neighboring permutation π′ can be obtained by dividing π into three parts and making π′ the concatenation of the first part, the reverse of the second part, and the third part. In other words, if π = (α, β, γ), then π′ = (α, β^R, γ) is a neighbor of π, where β^R is the reverse of β.
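The heuristic of Figure 8 combined with the 2-opt neighborhood can be sketched as follows. This is a minimal illustration under assumptions: `score` is a hypothetical callable standing in for the weight of the (1 − ε)-optimal solution of the placement induced by a permutation, and the names `two_opt_neighbors` and `local_search` are not from the paper.

```python
import random
from itertools import combinations

def two_opt_neighbors(pi):
    """Yield every permutation obtained from the tuple pi by reversing one
    contiguous segment of length at least 2."""
    n = len(pi)
    for a, b in combinations(range(n + 1), 2):
        if b - a >= 2:
            yield pi[:a] + pi[a:b][::-1] + pi[b:]

def local_search(m, score, rng=random):
    """Hill climbing over permutations of 1..m, following Figure 8."""
    pi = tuple(rng.sample(range(1, m + 1), m))
    best = score(pi)
    better = True
    while better:
        better = False
        # Neighbors are generated from the permutation held at the start
        # of this round, as in Figure 8.
        for cand in two_opt_neighbors(pi):
            s = score(cand)
            if s > best:
                pi, best, better = cand, s, True
    return pi, best
```

Because the search stops at the first permutation with no improving neighbor, it can end in a local optimum; running it from several random starting permutations and keeping the heaviest result is one way to mitigate this.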

5 EXPERIMENTAL RESULTS

We apply the algorithm of Section 4 using the 2-opt

method, where the PTAS is based on the Windows

and Gaps formulation of Section 3, with windows sat-

isfying 2 ≤ u,v ≤ w = 26 (RNAup’s default (Muck-

stein et al., 2006)) and a gap g = 4. The weights

w(l,i, j, u,v) are obtained as the negative of the en-

ergy values produced by RNAup. The windows are

ﬁltered for sub-additivity as described in Section 3.

In addition, we impose that u = v for every win-

dow. We also compress RNAs on level l by removal

of a base i whenever w(l,i, j,u, u) is less than some

threshold for every j (threshold 0 is used). However,

peg [l,i] can still be part of some window, e.g. if

w(l,i+ x, j,x+y,x+y) is added to the solution, where

x,y > 0. The purpose of the last two conditions (u = v

and compression) is to speed up the algorithm. We

pick the largest weight solution among several runs

of the algorithm. The value of k and the gap ﬁlling

criterion depend on the scenario, as described below.

5.1 Fishing for Pairs

Six RNAs of which three pairs are known to interact

are used (Chitsaz et al., 2009a). We are interested in

identifying the three pairs. For this purpose, it will

sufﬁce to set k = 2 and to ignore gap ﬁlling. Fur-

thermore, we only consider solutions in which each

RNA interacts with at most one other RNA. The so-

lution with the largest weight identiﬁes the three pairs

correctly (Figure 9). In addition, the interacting sites

in each pair are consistent with the predictions of ex-

isting RNA-RNA interaction algorithms, e.g. (Salari

et al., 2010).

5.2 Structure Prediction

The human snRNA complex U2-U6 is necessary for

the splicing of a speciﬁc mRNA intron (Zhao et al., ).

Only the preserved regions of the intron are consid-

ered, which consist of two structurally autonomous

parts, resulting in an instance with a total of four

RNAs. The algorithm is performed with k = 2, 3, 4 and gap filling. In all three cases, the solution with the

largest weight consistently ﬁnds the structure shown

in Figure 10. This structure reveals a correct pat-

tern described in (Sun and Manley, 1995; Zhao et al.,

), and cannot be easily predicted by considering the

RNAs in pairs; for instance, AUAC in U6 will bind

to UAUG in both U2 and I1, and it is not immediately

obvious which one to break without a global view, e.g.

that AUGAU in U2 binds with UACUA in I2 as well.

This is a typical case of a local versus a global optimum.

5.3 Structural Separation

Six RNAs are used: CopA, CopT, and the four RNAs

of the previous scenario. The algorithm is performed

with k = 3 and gap ﬁlling. The solution with the

largest weight results in a successful prediction that

separates the RNA complex CopA-CopT of Figure 9

from the RNA structure shown in Figure 10.

MultipleRNAInteraction-Formulations,Approximations,andHeuristics

247

I1 3’ UGUAUG 5’

||||

U6 5’ AUACA.....GAUUa... ...cGUGAAGCGU 3’

|||| |||||||||

U2 3’ UAUGAUg....CUAGAAu..........gCACUUCGCA 5’

|||||

I2 5’ UACUAAc 3’

Figure 10: A modiﬁed human snRNA U2-U6 complex in the splicing of an intron, as reported in (Zhao et al., ). Bases

indicated by small letters are missing from the interaction. From left to right: g-c and a-u are missing due to the condition

2 ≤ u = v ≤ 26, but also due to the added instability of a bulge loop when this condition is relaxed; c-g ends up being not

favored by RNAup. I1 is shifted (UGU should interact with ACA instead) but this is a computational artifact of optimization

that is hard to avoid. Overall, the structure is accurate and cannot be predicted by a pairwise handling of the RNAs.

6 CONCLUSIONS

While RNA-RNA interaction algorithms exist, they

are not suitable for predicting RNA structures in

which more than two RNA molecules interact. For

instance, the interaction pattern may not be known,

in contrast to the case of two RNAs where one must

interact with the other. Moreover, even with some ex-

isting knowledge on the pattern of interaction, treat-

ing the RNAs pairwise may not lead to the best global

structure. In this work, we formulate multiple RNA

interaction as an optimization problem, prove it is

NP-hard, and provide approximation and heuristic algorithms. We explore three scenarios: 1) fishing for pairs: given a pool of RNAs, we identify the

pairs that are known to interact; 2) structure predic-

tion: we predict a correct complex of two snRNAs

(modiﬁed human U2 and U6) and two structurally au-

tonomous parts of an intron, a total of four RNAs;

and 3) structural separation: we successfully divide

the RNAs into independent groups of multiple inter-

acting RNAs.

REFERENCES

Alkan, C., Karakoc, E., Nadeau, J. H., Sahinalp, S. C., and Zhang, K. (2006). RNA-RNA interaction prediction and antisense RNA target search. In Journal of Computational Biology 13(2).

Argaman, L. and Altuvia, S. (2000). fhlA repression by OxyS: Kissing complex formation at two sites results in a stable antisense-target RNA complex. In Journal of Molecular Biology 300.

Chitsaz, H., Backofen, R., and Sahinalp, S. C. (2009a). biRNA: Fast RNA-RNA binding sites prediction. In 9th International Conference on Algorithms in Bioinformatics.

Chitsaz, H., Salari, R., Sahinalp, S. C., and Backofen, R. (2009b). A partition function algorithm for interacting nucleic acid strands. In Journal of Bioinformatics.

Cormen, T., Leiserson, C. E., Rivest, R. L., and Stein, C. Approximation Algorithms in Introduction to Algorithms. MIT Press.

Huang, F. W. D., Qin, J., Reidys, C. M., and Stadler, P. F. (2009). Partition function and base pairing probabilities for RNA-RNA interaction prediction. In Journal of Bioinformatics 25(20).

Kolb, F. A., Malmgren, C., Westhof, E., Ehresmann, C., Ehresmann, B., Wagner, E. G. H., and Romby, P. (2000). An unusual structure formed by antisense-target RNA binding involves an extended kissing complex with a four-way junction and a side-by-side helical alignment. In RNA Society.

Li, A. X., Marz, M., Qin, J., and Reidys, C. M. (2010). RNA-RNA interaction prediction based on multiple sequence alignments. In Journal of Bioinformatics.

Meyer, I. M. (2008). Predicting novel RNA-RNA interactions. In Current Opinions in Structural Biology 18.

Mneimneh, S. (2009). On the approximation of optimal structures for RNA-RNA interaction. In IEEE/ACM Transactions on Computational Biology and Bioinformatics.

Muckstein, U., Tafer, H., Hackermuller, J., Bernhart, S. H., Stadler, P. F., and Hofacker, I. L. (2006). Thermodynamics of RNA-RNA binding. In Journal of Bioinformatics.

Pervouchine, D. D. (2004). IRIS: Intermolecular RNA interaction search. In 15th International Conference on Genome Informatics.

Salari, R., Backofen, R., and Sahinalp, S. C. (2010). Fast prediction of RNA-RNA interaction. In Algorithms for Molecular Biology 5(5).

Sun, J. S. and Manley, J. L. (1995). A novel U2-U6 snRNA structure is necessary for mammalian mRNA splicing. In Genes and Development 9.

Zhao, C., Bachu, R., Popovic, M., Devany, M., Brenowitz, M., Schlatterer, J. C., and Greenbaum, N. L. Conformational heterogeneity of the protein-free human spliceosomal U2-U6 snRNA complex. Under revision at RNA. The first two authors contributed equally to the work.

BIOINFORMATICS2013-InternationalConferenceonBioinformaticsModels,MethodsandAlgorithms

248

APPENDIX: RNA SEQUENCES

MicA (even)

5’ GAAAGACGCGCAUUUGUUAUCAUCAUCCCUGUUUUCAGC

GAUGAAAUUUUGGCCACUCCGUGAGUGGCCUUUU 3’

lamB (odd)

5’ GGCAGCGCAUGUCGUCGUCUGCAUCAAGAGCCGGGUGUU

UAAGGCCUCCAUAAAAAAACGAAACGCAAAACCAUUCGC

AGUUUUAGAAGGUGGCAGCGUUUAAAGAAAAGCAAUGAU

CUCAGGAGAUAGAAUGAUGAUUACUCUGCGCAAACUCCC

ACUGGCGGUUGCUGUCGCAGCGG 3’

CopA (even)

5’ AUAGCUGAAUUCUUGGCUAUACGGUUUAAGUGGGCCCCG

GUAAUCUUUUCGUACUCGCCAAAGUUGAAGAAGAUUAUC

GGGGUUUUUGCUU 3’

CopT (odd)

5’ AAGCAAAAACCCCGAUAAUCUUCUUCAACUUUGGCGAGU

ACGAAAAGAUUACCGGGGCCCACUUAAACCG 3’

OxyS (even)

5’ GAAACGGAGCGGCACCUCUUUUAACCCUUGAAGUCACUG

CCCGUUUCGAGAGUUUCUCAACUCGAAUAACUAAAGCCA

ACGUGAACUUUUGCGGAUCUCCAGGAUCCGCU 3’

fhlA (odd)

5’ AGUUAGUCAAUGACCUUUUGCACCGCUUUGCGGUGCUUU

CCUGGAAGAACAAAAUGUCAUAUACACCGAUGAGUGAUC

UCGGACAACAAGGGUUGUUCGACAUCACUCGGACA 3’

I1 (odd)

5’ NNNNNNNNNNGUAUGUNNNNNNNNNN 3’

U6 (even)

5’ AUACAGAGAAGAUUAGCAUGGCCCCUGCGCAAGGAUGAC

ACGCAAAUUCGUGAAGCGU 3’

U2 (odd)

5’ ACGCUUCACGGCCUUUUGGCUAAGAUCAAGUGUAGUAU 3’

I2 (even)

5’ NNNNNNNNNNUACUAACNNNNNNNNNN 3’

MultipleRNAInteraction-Formulations,Approximations,andHeuristics

249