Sequence Analysis with Distributed Resources - RNA Secondary Structure

Comparative RNA Structure Prediction

Multiple sequences usually carry more information than any single sequence alone. For RNA, compensating base mutations are strong evidence for putative base pairs: if a putative G-C base pair mutates to an A-U base pair, these positions are likely to pair. There are four basic ways to use multiple sequences for structure prediction:

Plan-A: First align the sequences, then fold the multiple sequence alignment into one secondary structure.
	Pros: uses established multiple sequence alignment methods; fast.
	Cons: cannot guarantee finding the global optimum.

Plan-B: Simultaneously align the sequences and find their optimal consensus secondary structure.
	Pros: exact algorithm.
	Cons: NP-hard problem, thus very slow.

Plan-C: First predict a secondary structure for each sequence and find one that is common to all sequences. Then structurally align the secondary structures and derive a multiple structure/sequence alignment.
	Pros: focuses on structures, not sequences as Plan-A does.
	Cons: cannot guarantee finding the global optimum.

Plan-D: Use covariance or mutual information.
	Pros: stochastically sound method.
	Cons: needs many examples for the training phase; cannot guarantee finding the global optimum.
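The covariation signal behind Plan-D can be illustrated with a minimal sketch that computes the mutual information between two columns of a multiple sequence alignment. The function name `mutual_information` and the toy alignment are assumptions made for illustration only; real tools estimate frequencies from many more sequences and usually apply corrections that are omitted here.

```python
import math
from collections import Counter


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    `alignment` is a list of equal-length strings (hypothetical input format).
    """
    n = len(alignment)
    col_i = [seq[i] for seq in alignment]
    col_j = [seq[j] for seq in alignment]

    # Observed single-column and joint symbol counts.
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))

    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment (made up): columns 0 and 4 covary (G-C <-> A-U),
# while column 2 is fully conserved.
alignment = [
    "GAAAC",
    "GAAAC",
    "AAAAU",
    "AAAAU",
]

covarying = mutual_information(alignment, 0, 4)
conserved = mutual_information(alignment, 0, 2)
print(covarying, conserved)
```

In this toy case the compensating G-C to A-U change gives the column pair a high mutual information, whereas a pair involving a fully conserved column carries none. This is why the approach needs enough sequence variation to work.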
RNA Secondary Structure Comparison

Many genes are controlled by small regulatory motifs in their 5' and 3' UTRs. For some of these motifs, it is the secondary structure rather than the sequence that matters for proper function. Because their sequence conservation is low, these small motifs are often difficult to find with sequence-based methods. A more appropriate approach is to compute a local RNA structure alignment, i.e. to search for the most similar substructures within larger RNA secondary structures. This can be done with RNAforester [Hoechsmann et al. 2003, Hoechsmann et al. 2004], which can compute global as well as local RNA structure alignments.
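To make the idea of a local structure search concrete, the following sketch runs a plain Smith-Waterman local alignment over two structures written in dot-bracket notation. It is a deliberately simplified stand-in and not the method RNAforester uses: it treats a structure as a flat string and therefore ignores which bracket pairs with which. The function name `local_align` and the scoring values are assumptions chosen for illustration.

```python
def local_align(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score with a linear gap penalty.

    Returns (best_score, end_in_a, end_in_b), where the end positions are
    1-based indices of the last aligned character in each string.
    """
    cols = len(b) + 1
    prev = [0] * cols          # previous row of the score matrix
    best = (0, 0, 0)

    for i in range(1, len(a) + 1):
        cur = [0] * cols
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(
                0,                  # start a new local alignment
                prev[j - 1] + s,    # align a[i-1] with b[j-1]
                prev[j] + gap,      # gap in b
                cur[j - 1] + gap,   # gap in a
            )
            if cur[j] > best[0]:
                best = (cur[j], i, j)
        prev = cur
    return best


# Made-up example: a small hairpin motif inside a larger structure.
large = "....((..))...."
motif = "((..))"

score, end_large, end_motif = local_align(large, motif)
start_large = end_large - len(motif)
print(score, large[start_large:end_large])
```

Here the motif is recovered as an exact substring of the larger structure. A real structural aligner must additionally respect base-pair partners, which is what a proper structure alignment adds over this string-based approximation.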