This page contains ADP (Algebraic Dynamic Programming) implementations
of grammars G1 to G6 from Dowell and Eddy's paper (R.D. Dowell and
S.R. Eddy, Evaluation of several lightweight stochastic context-free
grammars for RNA secondary structure prediction. BMC Bioinformatics,
5(71), 2004).
Grammars:
G1 : S -> aSâ | aS | Sa | SS | e

G2 : S -> aP^{aâ}â | aS | Sa | SS | e
     P^{oô} -> aP^{aâ}â | S

G3 : S -> aSâ | aL | Ra | LS
     L -> aSâ | aL
     R -> Ra | e

G4 : S -> aS | T | e
     T -> Ta | aSâ | TaSâ

G5 : S -> aS | aSâ S | e

G6 : S -> LS | L
     L -> aFâ | a
     F -> aFâ | LS
Note: In contrast to other grammars for RNA folding, these stochastic
grammars admit non-standard base pairing (with low probability).
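To make the grammar notation concrete, the following Python sketch enumerates the derivations of G6 (S -> LS | L, L -> aFâ | a, F -> aFâ | LS) for a short input and prints each one in dot-bracket form. It is an illustration only, not the ADP code behind this page. In line with the note above, it assumes that any two bases may pair, so only the length of the input matters; the function name g6_structures is made up for this example.

```python
from functools import lru_cache


def g6_structures(seq):
    """Return the dot-bracket strings of all G6 derivations of seq.

    Assumption: every pair of positions may form a base pair, so the
    base identities in seq are not inspected.
    """
    n = len(seq)

    @lru_cache(maxsize=None)
    def S(i, j):
        # S -> L S | L
        out = list(L(i, j))
        for k in range(i + 1, j):
            out += [l + s for l in L(i, k) for s in S(k, j)]
        return tuple(out)

    @lru_cache(maxsize=None)
    def L(i, j):
        # L -> a F â | a
        if j - i == 1:
            return (".",)
        return tuple("(" + f + ")" for f in F(i + 1, j - 1))

    @lru_cache(maxsize=None)
    def F(i, j):
        # F -> a F â | L S
        out = []
        if j - i >= 2:
            out += ["(" + f + ")" for f in F(i + 1, j - 1)]
        for k in range(i + 1, j):
            out += [l + s for l in L(i, k) for s in S(k, j)]
        return tuple(out)

    # G6 cannot derive the empty string.
    return list(S(0, n)) if n > 0 else []


if __name__ == "__main__":
    for structure in g6_structures("acguc"):
        print(structure)
```

Because F must derive at least two symbols, every hairpin in this sketch encloses at least two unpaired bases; an input of length 4 therefore has only the two derivations "...." and "(..)".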
To start, enter an input sequence, choose a grammar and a scoring
scheme, and press the "Go!" button.
Description of the scoring schemes:
- viterbi: runs the Viterbi algorithm for the grammar. Probabilities
for paired and unpaired bases are taken from the Pfold server. For
grammars G1 to G5, only example rule probabilities are given. G6 is
the Knudsen/Hein grammar of the Pfold package; for G6, the rule
probabilities are also taken from the Pfold server.
- inside: runs the Inside algorithm with the same probabilities as
viterbi.
- count: calculates the number of derivations for the given
input sequence.
- pairmax: base pair maximization.
- dotbracket: gives the complete list of derivations in
dot-bracket notation.
- trees: gives the complete list of derivations as
derivation trees.
- A *** B: calculates the cross product of scoring
schemes A and B. For example, viterbi *** dotbracket
calculates the best Viterbi score together with the corresponding
structures in dot-bracket notation.
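The scoring schemes above can be pictured as interchangeable "algebras" that are plugged into one and the same grammar. The Python sketch below shows this idea for G5 (S -> aS | aSâ S | e) with the count, pairmax and dotbracket schemes and a product operator in the spirit of A *** B. It is a simplified illustration under stated assumptions, not the ADP implementation used on this page: the names parse_g5 and product are hypothetical, any two bases are allowed to pair (as in the note on non-standard base pairing), and the probabilistic schemes are left out because the Pfold parameters are not reproduced here.

```python
from functools import lru_cache


def parse_g5(seq, algebra):
    """Evaluate G5 (S -> aS | aSâ S | e) over seq with a scoring algebra.

    An algebra is a 4-tuple: one function per grammar rule
    (unpaired, paired, empty) plus a choice function on candidate lists.
    """
    unpaired, paired, empty, choice = algebra

    @lru_cache(maxsize=None)
    def s(i, j):
        if i == j:
            return choice([empty()])                      # S -> e
        cands = [unpaired(seq[i], x) for x in s(i + 1, j)]  # S -> a S
        for k in range(i + 1, j):                         # S -> a S â S
            cands += [paired(seq[i], x, seq[k], y)
                      for x in s(i + 1, k) for y in s(k + 1, j)]
        return choice(cands)

    return s(0, len(seq))


# count: number of derivations
count = (lambda a, x: x,
         lambda a, x, b, y: x * y,
         lambda: 1,
         lambda xs: [sum(xs)])

# pairmax: maximal number of base pairs (any pair scores 1 here)
pairmax = (lambda a, x: x,
           lambda a, x, b, y: x + y + 1,
           lambda: 0,
           lambda xs: [max(xs)] if xs else [])

# dotbracket: every derivation as a dot-bracket string
dotbracket = (lambda a, x: "." + x,
              lambda a, x, b, y: "(" + x + ")" + y,
              lambda: "",
              lambda xs: list(xs))


def product(alg_a, alg_b):
    """Combine two algebras: choose by alg_a, then by alg_b among the ties."""
    ua, pa, ea, ha = alg_a
    ub, pb, eb, hb = alg_b

    def choice(xs):
        out = []
        # dict.fromkeys removes duplicate first components, keeping order.
        for best in dict.fromkeys(ha([a for a, _ in xs])):
            out += [(best, b) for b in hb([b for a, b in xs if a == best])]
        return out

    return (lambda c, x: (ua(c, x[0]), ub(c, x[1])),
            lambda c, x, d, y: (pa(c, x[0], d, y[0]), pb(c, x[1], d, y[1])),
            lambda: (ea(), eb()),
            choice)


if __name__ == "__main__":
    print(parse_g5("acgu", count))
    print(parse_g5("acgu", product(pairmax, dotbracket)))
```

The grammar traversal is written once, and each scheme only changes how the results of the three rules are combined. The product keeps, for every optimal score of the first scheme, the answers of the second scheme that belong to it, which is how a call such as product(pairmax, dotbracket) returns the maximal pair count together with the structures that reach it.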
For usage on your local machine:
For the mechanical proof procedure you need an LR(k) parser
generator. We recommend MSTA, which can be downloaded from http://cocom.sourceforge.net/.
For our experiments we used version 0.995 (May 2002).
The file rna-grammars.tgz contains
MSTA input files of the grammars G1 to G8 from Dowell and Eddy's
paper.