RNA structure and shape combinatorics
Algebra explanation
The provided algebras have the following meanings. Let x be an RNA
sequence to be analyzed by a call to the grammar with an appropriate
algebra parameter. This is what you get:
- count: The counting algebra returns the number of candidates it sees,
i.e. the size of the search space. Use count before you call the
grammar on a long sequence with any of the enumerating algebras, as
their output may be of exponential size! And: NEVER use count as the
left operand in a product.
- enum: The enumeration algebra returns all structures of x in tree
representation. If you interpret the tree operators of such a
structure as functions of another algebra B, you obtain the value of
that structure under algebra B. This helps with understanding the ADP
machinery, and with troubleshooting.
- pretty and pretty': The pretty-printing algebra returns the candidates
it sees as dot-bracket strings. pretty returns them all, pretty'
returns just a single example.
- shapelevel1 .. shapelevel5: Shape abstraction algebras compute
abstract shapes of the candidates they evaluate. Levels 1 to 5 are
consistent with their current use in the RNAshapes package.
- userdefined: This is a
shape abstraction algebra defined by user-specified parameters.
- expect: The expectation
algebra computes the expected number of structures of a sequence which
has the same length and base composition as sequence x. The input
sequence is inspected only for its length and composition -- the
arrangement of bases is not used. This algebra, when used as the
leftmost one in a product, turns off the base pair checking in the
input sequence.
- bpmax, spmax: These algebras provide base
pair maximization and stacking pair maximization, where the latter is a
better approximation of realistic energy rules.
- bpmax(k), spmax(k): Same as above, but returning the k best scores
under their scoring scheme.
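
To make the list above more concrete, here is a minimal Python sketch of the idea behind enum, pretty, bpmax and count. It is a toy model, not the ADP implementation: the operator names nil, unpaired and paired, the tiny grammar, the pairing rules (Watson-Crick plus G-U, no minimum hairpin size) and the example sequence are all assumptions made for illustration.

```python
from functools import lru_cache

# Assumed pairing rules: Watson-Crick plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"), ("G", "U"), ("U", "G")}

def candidates(x, i=0, j=None):
    """Like enum: yield every candidate tree for x[i:j] (exponentially many).

    Hypothetical tree operators:
      ("nil",)                 empty structure
      ("unpaired", rest)       one unpaired base, followed by rest
      ("paired", inner, rest)  a base pair enclosing inner, followed by rest
    """
    if j is None:
        j = len(x)
    if i == j:
        yield ("nil",)
        return
    for rest in candidates(x, i + 1, j):          # x[i] stays unpaired
        yield ("unpaired", rest)
    for k in range(i + 1, j):                     # x[i] pairs with x[k]
        if (x[i], x[k]) in PAIRS:
            for inner in candidates(x, i + 1, k):
                for rest in candidates(x, k + 1, j):
                    yield ("paired", inner, rest)

def evaluate(tree, algebra):
    """Interpret the tree operators as the functions of the given algebra."""
    op, *children = tree
    return algebra[op](*(evaluate(c, algebra) for c in children))

# Like pretty: each candidate becomes a dot-bracket string.
PRETTY = {
    "nil": lambda: "",
    "unpaired": lambda rest: "." + rest,
    "paired": lambda inner, rest: "(" + inner + ")" + rest,
}

# Scoring part of bpmax: each candidate becomes its number of base pairs.
BPCOUNT = {
    "nil": lambda: 0,
    "unpaired": lambda rest: rest,
    "paired": lambda inner, rest: 1 + inner + rest,
}

def count(x):
    """Like count: size of the search space, without building any candidate."""
    @lru_cache(maxsize=None)
    def n(i, j):
        if i == j:
            return 1
        total = n(i + 1, j)
        for k in range(i + 1, j):
            if (x[i], x[k]) in PAIRS:
                total += n(i + 1, k) * n(k + 1, j)
        return total
    return n(0, len(x))

x = "GAUC"                     # hypothetical example sequence
if count(x) <= 1000:           # check the search space size before enumerating
    trees = list(candidates(x))
    print([evaluate(t, PRETTY) for t in trees])
    print(max(evaluate(t, BPCOUNT) for t in trees))   # bpmax: maximize the score
```

In this sketch, count follows the same recursion as the enumerator but only adds and multiplies numbers, which is why it is cheap to call before any of the enumerating evaluations.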
Interesting things can be done with algebra products. Check your
understanding: What do you expect to obtain from using
shapelevel5 *** bpmax *** count
on sequence x?
(You get all the level 5 shapes observed in the folding space of x,
together with the maximum number of base pairs achieved by the
structures in that shape class, and the number of such structures that
achieve the maximal number of base pairs within their shape class. If
this seems like magic to you, you are right. Still, you might want to
look up the paper on algebra products when things seem to go crazy.)
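
The answer above can be checked by brute force on a tiny input. The Python sketch below only models what the product returns, not how ADP computes it: it enumerates every structure, groups them by shape, and keeps the best base pair score and its multiplicity per shape. The function shape5 is a simplified stand-in for level 5 shape abstraction (helix nesting only, with stacks, bulges and internal loops merged into one helix), and the enumerator, the pairing rules and the example sequences are assumptions made for illustration.

```python
# Assumed pairing rules: Watson-Crick plus G-U wobble, no minimum hairpin size.
PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"), ("G", "U"), ("U", "G")}

def structures(x):
    """Yield every dot-bracket structure of x (brute force, exponential)."""
    if not x:
        yield ""
        return
    for rest in structures(x[1:]):                # first base unpaired
        yield "." + rest
    for k in range(1, len(x)):                    # first base pairs with x[k]
        if (x[0], x[k]) in PAIRS:
            for inner in structures(x[1:k]):
                for rest in structures(x[k + 1:]):
                    yield "(" + inner + ")" + rest

def shape5(db):
    """Simplified level-5-style shape of a dot-bracket string."""
    def parse(pos):
        # Collect the shapes of sibling helices until the enclosing ")".
        shapes = []
        while pos < len(db) and db[pos] != ")":
            if db[pos] == "(":
                inner, pos = parse(pos + 1)       # pos is now at the matching ")"
                if len(inner) == 1:
                    shapes.append(inner[0])       # stack, bulge or internal loop
                else:
                    shapes.append("[" + "".join(inner) + "]")  # hairpin or multiloop
            pos += 1
        return shapes, pos
    shapes, _ = parse(0)
    return "".join(shapes) or "_"                 # "_" for the open structure

def shape_bpmax_count(x):
    """Model of shapelevel5 *** bpmax *** count, by exhaustive enumeration."""
    best = {}   # shape -> (max base pairs, number of structures achieving it)
    for db in structures(x):
        shape, bp = shape5(db), db.count("(")
        top, n = best.get(shape, (-1, 0))
        if bp > top:
            best[shape] = (bp, 1)
        elif bp == top:
            best[shape] = (top, n + 1)
    return best

print(shape_bpmax_count("GAUC"))   # hypothetical example sequence
```

For the toy input GAUC this gives two shape classes: the open structure with 0 base pairs (one structure), and a single helix with at most 2 base pairs (one structure achieving it).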