11.01.2013 Views

Algorithmes de prediction et de recherche de multi-structures d'ARN

Algorithmes de prediction et de recherche de multi-structures d'ARN

Algorithmes de prediction et de recherche de multi-structures d'ARN

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

16 Chapter 1. Background – RNAs, non-coding RNAs, and bioinformatics<br />

for secondary structure <strong>prediction</strong> maximizing the number of base pairs was introduced later, in<br />

1978, by Nussinov <strong>et</strong> al. [84]. Base pair maximization is a very simple mo<strong>de</strong>l and does not take<br />

into account full energy param<strong>et</strong>ers. The same year, Waterman and Smith [125] presented a<br />

m<strong>et</strong>hod to predict the MFE structure through a simple nearest-neighbor mo<strong>de</strong>l. In 1981, Zuker<br />

and Stiegler [134] proposed a more elaborated dynamic programming algorithm, predicting the<br />

MFE structure. Later in the 1980’s, these folding algorithms were applied on the Turner energy<br />

mo<strong>de</strong>l [131].<br />

mfold/unafold [132] and RNAfold [51] are some popular implementations of these algorithms.<br />

For a sequence of length n, they run in time O(n 3 ) (or even O(n 4 ) if one takes into account interior<br />

loops of any size) and space O(n 2 ). D<strong>et</strong>ailed analysis of suboptimal <strong>prediction</strong>s, especially<br />

for mfold/unafold, will be done in Chapter 3. Recently, some optimizations in average less time<br />

and space have been proposed [8].<br />

1.2.4 Structure <strong>prediction</strong>, beyond the MFE structure<br />

Most of the time, computing the MFE structure is not enough:<br />

• Even if the Turner mo<strong>de</strong>l is well established, this empiric thermodynamic data can be incompl<strong>et</strong>e:<br />

slight changes in the thermodynamic mo<strong>de</strong>l could produce very different foldings<br />

with similar energy levels;<br />

• The Nearest Neighbor mo<strong>de</strong>l does not allow for pseudoknots and base tripl<strong>et</strong>s, and it<br />

does not reflect the interactions of the RNA with other molecules in the cell. Thus the<br />

projection of a real RNA structure on its secondary component may not be optimal;<br />

• In the cell, RNA molecules do not take only the most stable structure, but they seem to<br />

rapidly change their conformation b<strong>et</strong>ween <strong>structures</strong> with similar free energies. These<br />

changes can be induced by the interaction of the RNA with other molecules, and mostly<br />

concern their 3D structure. Some RNAs, including several mRNAs in their untranslated<br />

region, have even been shown to bear conformational switching in their secondary structure.<br />

Thus the MFE structure is not sufficient to fully un<strong>de</strong>rstand the folding possibilities of an<br />

RNA. Instead of g<strong>et</strong>ting the optimal structure, a compl<strong>et</strong>e backtracking through the dynamic<br />

programming table can generate all suboptimal secondary <strong>structures</strong> within a given threshold<br />

from minimum free energy. Being able to compute alternative <strong>structures</strong> may also be useful<br />

in <strong>de</strong>signing RNA sequences which not only have low folding energy, but whose folding landscape<br />

would suggest rapid and robust folding. Such suboptimal backtracking is implemented in<br />

RNAsubopt [127] using Waterman and Byers algorithm [124], whereas mfold does not implement<br />

this full backtrack (see Chapter 3).<br />

However, the s<strong>et</strong> of suboptimal <strong>structures</strong> is often huge : the number of such <strong>structures</strong><br />

is exponential in the length of the sequence. RNAshapes [105] suggests “abstract shapes” to<br />

classify the secondary <strong>structures</strong>. During computation of the suboptimal <strong>structures</strong>, RNAshapes<br />

partitions the folding space into s<strong>et</strong>s of <strong>structures</strong> according to their shapes.<br />

One may be interested by selecting only relevant <strong>structures</strong>, for example those corresponding<br />

to a local minimum of energy. The goal is to keep relevant suboptimal <strong>structures</strong> in the free<br />

energy landscape that correspond to local minima. The notions of saturated <strong>structures</strong> and<br />

locally optimal secondary <strong>structures</strong> me<strong>et</strong> these requirements.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!