Decompose Reals
Suppose you are given a DNA fragment of mass 1897.27 Dalton and no other
information. Which nucleotide combinations lead to exactly this mass?
Decomp helps you solve this and similar problems efficiently. Such problems,
referred to as mass decomposition problems, often arise in mass
spectrometry, where the only information left about DNA, protein, or other sample
fragments is their molecular mass.
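To make the problem concrete, the following Python sketch enumerates decompositions of a real-valued mass by brute force. It is only an illustration and not the algorithm Decomp uses; the function name decompose_real is hypothetical, and the masses are the monoisotopic nucleotide masses of the predefined alphabet listed in the alphabet section of this help.

```python
from typing import Dict, List

# Monoisotopic nucleotide masses (Da), as in the predefined alphabet.
NUCLEOTIDES = {
    "A": 313.0576050,
    "C": 289.0463716,
    "G": 329.0525197,
    "T": 304.0460373,
}


def decompose_real(mass: float, alphabet: Dict[str, float],
                   error: float) -> List[Dict[str, int]]:
    """Return all multisets of alphabet characters whose total mass lies
    within `error` of `mass` (hypothetical helper, brute force)."""
    names = sorted(alphabet)
    results: List[Dict[str, int]] = []

    def recurse(i: int, total: float, counts: List[int]) -> None:
        if i == len(names):
            if abs(total - mass) <= error:
                # Report only the characters that actually occur.
                results.append({n: c for n, c in zip(names, counts) if c})
            return
        weight = alphabet[names[i]]
        k = 0
        # Stop as soon as the partial sum exceeds the upper bound.
        while total + k * weight <= mass + error:
            recurse(i + 1, total + k * weight, counts + [k])
            k += 1

    recurse(0, 0.0, [])
    return results
```

Calling decompose_real(1897.27, NUCLEOTIDES, 0.1) would list every nucleotide combination within 0.1 Da of the query mass. The running time of this sketch grows quickly with the mass and the alphabet size, which is why a dedicated tool is useful for realistic inputs.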
Decompose Integers
The Decompose Integers function can be used to solve the money-changing problem (also called the coin change problem).
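As an illustration of the money-changing problem, the sketch below lists every way to pay an integer amount with a given set of coin values. It is a plain recursive enumeration written for this help text, not Decomp's own algorithm; the name decompose_int and the coin names in the usage note are hypothetical.

```python
from typing import Dict, List


def decompose_int(mass: int, alphabet: Dict[str, int]) -> List[Dict[str, int]]:
    """Enumerate all ways to write `mass` as a sum of alphabet masses,
    each used a non-negative number of times (brute force)."""
    names = sorted(alphabet)
    results: List[Dict[str, int]] = []

    def recurse(i: int, remaining: int, counts: Dict[str, int]) -> None:
        if remaining == 0:
            # Every remaining character is used zero times.
            results.append(dict(counts))
            return
        if i == len(names):
            return
        name = names[i]
        weight = alphabet[name]  # must be a positive integer
        for k in range(remaining // weight + 1):
            if k:
                counts[name] = k
            recurse(i + 1, remaining - k * weight, counts)
        counts.pop(name, None)

    recurse(0, mass, {})
    return results
```

For example, decompose_int(10, {"penny": 1, "nickel": 5, "dime": 10}) returns the four possible ways to pay 10 units with these coins.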
In-/Output values
INPUT :: masses (in Dalton)
List of masses (in Dalton) to decompose. If you have more than one mass, put
one per line. Example:
2053.3
2247.5
4525.7
If multiple masses are given, each of them is
decomposed separately. Please note: When using an amino acid alphabet,
only masses up to approximately 1600 Da can be decomposed in reasonable time
(using an allowed mass error of 0.1 Da; with a lower mass error, larger masses
can be decomposed). If the computation takes too long, it will be cancelled.
INPUT :: masses
List of masses (non-negative integers) to decompose. If you have more than one mass, put
one per line; each mass is decomposed separately.
INPUT :: alphabet
The alphabet or, more precisely, the weighted alphabet tells Decomp into
which constituents it should decompose the given masses. Some useful alphabets are
predefined:
- Select Nucleotides to decompose DNA molecules into nucleotides. The
nucleotide alphabet consists of the four nucleotides Adenine, Cytosine,
Guanine, and Thymine.
- Select Amino acids to decompose proteins or peptides into amino
acids. The amino acid alphabet consists of 19 amino acids. Isoleucine, which
has the same mass as leucine, is not included.
- Select Atoms to decompose (organic) molecules into H, C, N, O, P, and
S.
For each of the three predefined alphabets, one version with a
monoisotopic and one version with an average isotopic mass distribution exists.
When writing your own custom alphabet, follow the given schema. More
precisely, an alphabet is a text file in which an alphabet character is defined
in each line like this:
name mass
The name may be any sequence of
letters, digits, and symbols, but it should not start or end with a digit. If it
does, you will not get an error message, but you may get problems reading the
results. The mass must be a positive real number, possibly in scientific
notation. For example, 23, 42.89, 1e-5 and 2.3e7 are valid, but -7.3 and 0.0 are
not. Name and mass must be separated by space or tab. If a line starts with a #,
then the line is interpreted as a comment and ignored. Empty lines are also
ignored. Example (this is one of the predefined alphabets):
# nucleotide masses, monoisotopic distribution
A 313.0576050
C 289.0463716
G 329.0525197
T 304.0460373
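The alphabet file format described above can be read with a few lines of Python. The following is a minimal sketch of a parser that follows the stated rules (one "name mass" pair per line, "#" comments, empty lines ignored, positive masses only); the function name parse_alphabet is hypothetical and this is not Decomp's own parser.

```python
import math
from typing import Dict


def parse_alphabet(text: str) -> Dict[str, float]:
    """Parse a weighted alphabet given as the contents of a text file."""
    alphabet: Dict[str, float] = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip empty lines and comments
        parts = line.split()  # splits on spaces and tabs
        if len(parts) != 2:
            raise ValueError(f"line {lineno}: expected 'name mass'")
        name, mass_text = parts
        mass = float(mass_text)  # accepts 23, 42.89, 1e-5, 2.3e7
        if not (math.isfinite(mass) and mass > 0):
            raise ValueError(f"line {lineno}: mass must be positive")
        alphabet[name] = mass
    return alphabet


EXAMPLE = """\
# nucleotide masses, monoisotopic distribution
A 313.0576050
C 289.0463716
G 329.0525197
T 304.0460373
"""
```

Here parse_alphabet(EXAMPLE) yields a dictionary with the four nucleotide names as keys, while a line such as "X -7.3" or "X 0.0" raises a ValueError.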
INPUT :: alphabet
The alphabet or, more precisely, the weighted alphabet tells Decomp into
which constituents it should decompose the given masses.
An alphabet must be defined as follows:
name mass
The name may be any sequence of
letters, digits, and symbols, but it should not start or end with a digit. If it
does, you will not get an error message, but you may get problems reading the
results. The mass must be a positive integer.
OUTPUT :: decomp output Mass decomposition output.
Parameter
Allowed Mass Error |
The allowed mass error gives a tolerance for the mass of the resulting
decompositions. If, for example, your input mass is 2053.3 Da and the allowed mass
error is set to 0.1 Da (absolute), then all decompositions with mass at least 2053.2
Da and at most 2053.4 Da will be computed. If any of the filtering options are
enabled, not all of them will be output, however. |
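The accepted mass range can be written down explicitly. The sketch below computes the lower and upper bound for an absolute error in Da and, assuming the usual definition of ppm (parts per million of the query mass), for a relative error. The function name mass_window and its unit argument are hypothetical and do not correspond to Decomp parameters.

```python
from typing import Tuple


def mass_window(query: float, error: float, unit: str = "Da") -> Tuple[float, float]:
    """Return (lowest, highest) accepted mass for a query mass."""
    if unit == "Da":
        delta = error                   # absolute tolerance
    elif unit == "ppm":
        delta = query * error / 1e6     # relative tolerance (assumed definition)
    else:
        raise ValueError("unit must be 'Da' or 'ppm'")
    return query - delta, query + delta
```

With the values from the text, mass_window(2053.3, 0.1) gives a range from about 2053.2 Da to about 2053.4 Da.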
computational precision |
The last option in this area is the computational precision, which
determines how Decomp rounds real-valued masses to integers. This is necessary since
the decomposition algorithm that Decomp uses works for integers. For example, if an
input or alphabet mass is 1362.3418 Da, while the precision is set to 0.01 Da, then
the resulting integer mass will be 136234. Because Decomp determines this setting
automatically, you usually do not have to worry about it and can leave it blank. |
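The conversion from the example can be reproduced as follows. This sketch assumes rounding to the nearest integer; the exact rounding scheme Decomp applies internally is not specified here, and the name to_integer_mass is hypothetical.

```python
def to_integer_mass(mass: float, precision: float) -> int:
    """Scale a real-valued mass by 1/precision and round to an integer
    (nearest-integer rounding is an assumption of this sketch)."""
    if precision <= 0:
        raise ValueError("precision must be positive")
    return round(mass / precision)


# Example from the text: 1362.3418 Da at precision 0.01 Da.
print(to_integer_mass(1362.3418, 0.01))  # 136234
```

A finer precision yields larger integers and therefore a more accurate, but more expensive, computation.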
Decomposition must contain at least |
If you know some constraints about the molecules you want to decompose, you
can use this option to show only the relevant results. For example, if you know,
when decomposing DNA, that there must be at least four Adenines, but at most six
Cytosines, type A4 into the text field next to "Decomposition must contain at least"
and type C6 into the field next to "Decomposition must contain at most". Multiple
constraints can also be given, separated by space. |
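The effect of these two fields can be sketched in Python. The code parses a space-separated constraint string such as "A4 C6" and checks a decomposition against lower and upper bounds. The helper names parse_constraints and satisfies are hypothetical, and the parsing rule (the trailing digits of each token are the count) is an assumption based on the examples above.

```python
import re
from typing import Dict


def parse_constraints(text: str) -> Dict[str, int]:
    """Turn 'A4 C6' into {'A': 4, 'C': 6}."""
    constraints: Dict[str, int] = {}
    for token in text.split():
        # Name = everything up to the last non-digit, count = trailing digits.
        match = re.fullmatch(r"(.*\D)(\d+)", token)
        if match is None:
            raise ValueError(f"cannot parse constraint {token!r}")
        constraints[match.group(1)] = int(match.group(2))
    return constraints


def satisfies(decomposition: Dict[str, int],
              at_least: Dict[str, int],
              at_most: Dict[str, int]) -> bool:
    """Check a decomposition (name -> count) against both bounds."""
    low_ok = all(decomposition.get(n, 0) >= c for n, c in at_least.items())
    high_ok = all(decomposition.get(n, 0) <= c for n, c in at_most.items())
    return low_ok and high_ok
```

For instance, a decomposition with five Adenines and two Cytosines passes the constraints "A4" (at least) and "C6" (at most), whereas one with only three Adenines is filtered out.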
Decomposition must contain at most |
See description of 'Decomposition must contain at least' |
chemically plausible decompositions |
If this option is selected, a variant of the SENIOR rule is used to filter
out those decompositions that are chemically implausible. Let S be the
sum of valences of all atoms in the decomposition and n be the number of atoms.
The two conditions are (from Kind and Fiehn):
Decomp knows valences for all of the atoms in the predefined alphabet
(CHNOPS). In addition, it knows the valences of Na, K, Cl, Si, Br, F, Mg, Fe,
and I. All of these elements can therefore be used in a custom alphabet while
the check for chemically plausible decompositions still works.
|
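The two conditions are not reproduced in the text above. Assuming they are the commonly stated SENIOR conditions, namely that S is even and that S is at least 2(n - 1), the check could be sketched as follows. Both the conditions and the valence table are assumptions made for this illustration (one common valence per element); the values Decomp actually uses may differ, and the function name is hypothetical.

```python
from typing import Dict

# Assumed valences for the predefined atom alphabet (illustrative only).
VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2, "P": 3, "S": 2}


def is_chemically_plausible(formula: Dict[str, int]) -> bool:
    """Apply the two assumed conditions to a formula (element -> count)."""
    n = sum(formula.values())                                  # number of atoms
    s = sum(VALENCE[el] * cnt for el, cnt in formula.items())  # sum of valences
    return s % 2 == 0 and s >= 2 * (n - 1)
```

Under these assumptions methane (C 1, H 4) passes, because S = 8 is even and equals 2(n - 1) for n = 5, while a hypothetical CH5 fails because S = 9 is odd.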
actual mass for each decomposition. |
If the option "Show actual mass for each decomposition" is selected, Decomp
will include, for every decomposition, its actual mass. The actual mass may be
different from the query mass when the allowed mass error is set to a value greater
than zero. |
deviation from query mass for each decomposition |
The actual deviation will be shown when "Show deviation from query mass for
each decomposition" is selected. |
best decomposition per input mass |
Show at most the given number of best decompositions per input
mass. |
maximal number of decomposition |
Maximal number of decompositions (only used if 'find all' is selected). |
mass distribution |
|
mass error unit |
Unit of the allowed mass error; can be either Da (absolute) or ppm (relative).
|
modification of the entire molecule |
Select a modification of the entire molecule here. The effect is that,
before decomposition, the modification is 'undone' by either subtracting the
appropriate mass from or adding it to the input mass. |
fixed modification |
If a fixed modification is selected, it is assumed that all amino acids of
the given type have been modified. For example, if "Acetylation (M, +42 Da)" has
been selected, then the decomposition algorithm will assume that all Methionines (M)
have been acetylated. Internally, this is accomplished by changing the alphabet such
that the mass of a Methionine is increased by 42 Da. |
Variable modifications |
If a variable modification is selected, it is assumed that some amino acids
of the given type have been modified. This includes the special cases that none or
all have been modified. For example, if "Methylation (H, +14 Da)" has been selected,
then the decomposition algorithm will assume that all Histidines occurring in the
decomposition may be methylated or not. Internally, this is accomplished by adding
another Histidine (called H') to the alphabet that has a mass which is 14 Da greater
than the mass of the regular Histidine. In the output, regular and modified versions
can be distinguished since modified Histidines have a ' symbol appended whereas
regular ones appear normally. |
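Both kinds of modification amount to simple changes of the alphabet, which can be sketched as follows. The helper names are hypothetical, and the two residue masses are approximate monoisotopic values used only for illustration.

```python
from typing import Dict

# Approximate monoisotopic residue masses (Da), for illustration only.
AMINO_ACIDS = {"M": 131.04049, "H": 137.05891}


def apply_fixed(alphabet: Dict[str, float], name: str,
                delta: float) -> Dict[str, float]:
    """Fixed modification: every occurrence of `name` carries the extra mass."""
    modified = dict(alphabet)
    modified[name] = alphabet[name] + delta
    return modified


def apply_variable(alphabet: Dict[str, float], name: str,
                   delta: float) -> Dict[str, float]:
    """Variable modification: add a second character (name plus ') so that
    each occurrence may or may not be modified."""
    modified = dict(alphabet)
    modified[name + "'"] = alphabet[name] + delta
    return modified


acetylated = apply_fixed(AMINO_ACIDS, "M", 42.0)      # M becomes 42 Da heavier
methylated = apply_variable(AMINO_ACIDS, "H", 14.0)   # adds H' next to H
```

Note that a variable modification enlarges the alphabet by one character, so it can increase both the number of decompositions and the running time.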
mass decomposition problem |
|
Overview
Decomp's submission form is split into five areas. They are:
- Masses: What are the masses of the molecules you want to
decompose?
- Alphabet: Into what components do you want to decompose them? We call this
the weighted alphabet (and just say alphabet from here on).
- Constraints: Is there anything known about the molecules?
- Modification: What posttranslational modifications are there?
- Output and Filtering: How should the output be presented?
Decomp needs to know at least the masses and the alphabet (corresponding to
the first two areas of the submission form). The three remaining areas are optional.
Masses
|
|