RNA structure and shape combinatorics

Algebra explanation


The provided algebras have the following meanings. Let x be an RNA sequence to be analyzed by a call to the grammar with an appropriate algebra parameter. This is what you get:
  • count: The counting algebra returns the number of candidates it sees, i.e. the size of the search space. Use count before you call the grammar on a long sequence with any of the enumerating algebras, as their output may be of exponential size. And: NEVER use count as the left operand in a product.
  • enum: The enumeration algebra returns all structures of x in tree representation. Interpreting the tree operators as functions of another algebra B, you compute the value of this structure under algebra B. This helps with understanding the ADP machinery, and with troubleshooting.
  • pretty and pretty': The pretty-printing algebra returns the candidates it sees as dot-bracket strings. pretty returns them all, pretty' returns just a single example.
  • shapelevel_i for i = 1 .. 5: Shape abstraction algebras compute abstract shapes of the candidates they evaluate. Levels 1 to 5 are consistent with their current use in the RNAshapes package.
  • userdefined: This is a shape abstraction algebra defined by user-specified parameters.
  • expect: The expectation algebra computes the expected number of structures of a sequence that has the same length and base composition as sequence x. The input sequence is inspected only for its length and composition; the arrangement of bases is not used. This algebra, when used as the leftmost one in a product, turns off the base pair checking in the input sequence.
  • bpmax, spmax: These algebras provide base pair maximization and stacking pair maximization, where the latter is a better approximation of realistic energy rules.
  • bpmax(k), spmax(k): Same as above, but returning the k best scores under their scoring scheme.
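
To make the role of an algebra concrete, here is a minimal Python sketch. It is not the grammar or the code used by this page: the toy grammar (first base unpaired, or paired with some later base), the minimum hairpin loop size of 3, the allowed base pairs, and the names run, evaluate, nil, unpaired and pair are all assumptions made for illustration. An algebra is modelled as one function per grammar operator plus a choice function h.

```python
from functools import lru_cache

# Assumed pairing rule: Watson-Crick pairs plus G-U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def run(algebra, x, min_loop=3):
    """Evaluate a toy folding grammar on x under algebra = (nil, unpaired, pair, h)."""
    nil, unpaired, pair, h = algebra

    @lru_cache(maxsize=None)
    def fold(i, j):  # results for the subword x[i:j]
        if i >= j:
            return tuple(h([nil()]))
        # Case 1: base i stays unpaired.
        cands = [unpaired(x[i], r) for r in fold(i + 1, j)]
        # Case 2: base i pairs with base k, enclosing at least min_loop bases.
        for k in range(i + min_loop + 1, j):
            if (x[i], x[k]) in PAIRS:
                cands += [pair(x[i], s, x[k], r)
                          for s in fold(i + 1, k) for r in fold(k + 1, j)]
        return tuple(h(cands))

    return list(fold(0, len(x)))

# count: size of the search space.
count = (lambda: 1, lambda b, r: r, lambda a, s, b, r: s * r,
         lambda xs: [sum(xs)])
# pretty: every candidate as a dot-bracket string.
pretty = (lambda: "", lambda b, r: "." + r,
          lambda a, s, b, r: "(" + s + ")" + r, lambda xs: xs)
# bpmax: maximal number of base pairs.
bpmax = (lambda: 0, lambda b, r: r, lambda a, s, b, r: s + r + 1,
         lambda xs: [max(xs)])
# enum: every candidate as a term (tree) over the grammar operators.
enum = (lambda: ("nil",), lambda b, r: ("unpaired", b, r),
        lambda a, s, b, r: ("pair", a, s, b, r), lambda xs: xs)

def evaluate(tree, algebra):
    """Interpret a tree produced by enum under another algebra."""
    nil, unpaired, pair, _ = algebra
    tag, *args = tree
    if tag == "nil":
        return nil()
    if tag == "unpaired":
        return unpaired(args[0], evaluate(args[1], algebra))
    a, s, b, r = args
    return pair(a, evaluate(s, algebra), b, evaluate(r, algebra))
```

For the sequence GAAAC, run(count, "GAAAC") gives [2], run(pretty, "GAAAC") gives ['.....', '(...)'], and run(bpmax, "GAAAC") gives [1]. Evaluating the trees from run(enum, "GAAAC") with evaluate(tree, pretty) reproduces the two dot-bracket strings.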
Interesting things can be done with algebra products. Check your understanding: What do you expect to obtain from using

shapelevel5 *** bpmax *** count             on sequence x?

(You get all the level 5 shapes observed in the folding space of x, together with the maximum number of base pairs achieved by the structures in that shape class, and the number of such structures that achieve the maximal number of base pairs within their shape class. If this seems like magic to you, you are right. Still, you might want to look up the paper on algebra products when things seem to go crazy.)
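
The answer above can be reproduced on a toy scale. The following Python sketch is only an illustration under stated assumptions: the toy grammar, the minimum hairpin loop size of 3, the pairing rule, the simplified level-5-like shape abstraction (nested helices interrupted only by unpaired bases are merged, unpaired regions are dropped), and the names run, product, shape5 and report are hypothetical and not taken from the actual implementation. The product applies the left choice function first and then applies the right choice function within each group of candidates that share the chosen left value.

```python
from functools import lru_cache

# Assumed pairing rule: Watson-Crick pairs plus G-U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def run(algebra, x, min_loop=3):
    """Evaluate a toy folding grammar on x under algebra = (nil, unpaired, pair, h)."""
    nil, unpaired, pair, h = algebra

    @lru_cache(maxsize=None)
    def fold(i, j):  # results for the subword x[i:j]
        if i >= j:
            return tuple(h([nil()]))
        cands = [unpaired(x[i], r) for r in fold(i + 1, j)]
        for k in range(i + min_loop + 1, j):
            if (x[i], x[k]) in PAIRS:
                cands += [pair(x[i], s, x[k], r)
                          for s in fold(i + 1, k) for r in fold(k + 1, j)]
        return tuple(h(cands))

    return list(fold(0, len(x)))

def product(A, B):
    """Sketch of an algebra product: values are (a, b) pairs."""
    nil_a, un_a, pair_a, h_a = A
    nil_b, un_b, pair_b, h_b = B

    def h(cands):
        out = []
        # Distinct values chosen by A, in order of first appearance.
        for a in dict.fromkeys(h_a([a for a, _ in cands])):
            # B chooses among the candidates that evaluate to a under A.
            out += [(a, b) for b in h_b([b for a2, b in cands if a2 == a])]
        return out

    return (lambda: (nil_a(), nil_b()),
            lambda base, r: (un_a(base, r[0]), un_b(base, r[1])),
            lambda a, s, b, r: (pair_a(a, s[0], b, r[0]), pair_b(a, s[1], b, r[1])),
            h)

def shape5_pair(a, s, b, r):
    # s and r are tuples of shape components (one string per adjacent helix).
    if len(s) == 0:
        comp = "[]"                      # hairpin
    elif len(s) == 1:
        comp = s[0]                      # stack, bulge or internal loop: merge
    else:
        comp = "[" + "".join(s) + "]"    # multiloop
    return (comp,) + r

shape5 = (lambda: (), lambda b, r: r, shape5_pair,
          lambda xs: list(dict.fromkeys(xs)))   # keep each shape once
bpmax = (lambda: 0, lambda b, r: r, lambda a, s, b, r: s + r + 1,
         lambda xs: [max(xs)])
count = (lambda: 1, lambda b, r: r, lambda a, s, b, r: s * r,
         lambda xs: [sum(xs)])

def report(x):
    """Map each shape to (max base pairs, number of structures reaching that max)."""
    algebra = product(product(shape5, bpmax), count)
    return {"".join(shape) or "_": (bp, n) for (shape, bp), n in run(algebra, x)}
```

For example, report("GGAAACC") returns {'_': (0, 1), '[]': (2, 1)}: the open structure has no base pairs, and of the five structures with shape [] exactly one reaches the maximum of two base pairs.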