
Decompose Reals

Suppose you are given a DNA fragment of mass 1897.27 Dalton and no other information. What nucleotide combinations are there that lead to exactly this mass? Decomp helps you solve this and similar problems efficiently.
Problems like this (referred to as mass decomposition problems) often arise in mass spectrometry, where the only information left about DNA, protein, or other sample fragments is their molecular mass.
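To make the problem concrete, here is a minimal brute-force sketch in Python. It is not the algorithm Decomp uses; it simply tries every combination of counts. The function name decompose_real and the error parameter are hypothetical names chosen for this illustration. The masses are the monoisotopic nucleotide masses of the predefined alphabet shown further down this page.

```python
from itertools import product

# Monoisotopic nucleotide masses, copied from the predefined alphabet on this page.
ALPHABET = {"A": 313.0576050, "C": 289.0463716, "G": 329.0525197, "T": 304.0460373}


def decompose_real(mass, alphabet, error=0.1):
    """Return every count combination whose total mass lies within mass +/- error.

    Brute force for illustration only (hypothetical helper, not Decomp's method).
    """
    names = list(alphabet)
    weights = [alphabet[name] for name in names]
    # A character cannot occur more often than (mass + error) / weight times.
    limits = [int((mass + error) // weight) for weight in weights]
    results = []
    for counts in product(*(range(limit + 1) for limit in limits)):
        total = sum(count * weight for count, weight in zip(counts, weights))
        if abs(total - mass) <= error:
            results.append(dict(zip(names, counts)))
    return results


# Build a query mass from a known composition (2 A, 1 C, 2 G, 1 T) and decompose it again.
query = 2 * ALPHABET["A"] + ALPHABET["C"] + 2 * ALPHABET["G"] + ALPHABET["T"]
for hit in decompose_real(query, ALPHABET, error=0.01):
    print(hit)
```

The number of combinations grows quickly with the mass and the alphabet size, which is why an efficient method is needed for realistic inputs.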

Decompose Integers

The Decompose Integers function can be used to solve the money-changing problem (also called the coin change problem).
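As an illustration of the money-changing problem, the following Python sketch counts in how many ways an integer mass can be written as a sum of integer weights, where each weight may be used any number of times. It uses a textbook dynamic programming approach, which is not necessarily what Decomp does internally. The function name count_decompositions is hypothetical, and the weights are assumed to be distinct positive integers.

```python
def count_decompositions(mass, weights):
    """Count the ways to write `mass` as a sum of the given positive integer weights.

    Assumes distinct weights. The order of the summands does not matter,
    so 2 + 3 and 3 + 2 count as one decomposition.
    """
    ways = [1] + [0] * mass  # ways[0] = 1: the empty decomposition
    for weight in weights:
        for m in range(weight, mass + 1):
            ways[m] += ways[m - weight]
    return ways[mass]


print(count_decompositions(10, [2, 3, 5]))  # 4: 5+5, 5+3+2, 3+3+2+2, 2+2+2+2+2
```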

In-/Output values

INPUT :: masses (in Dalton)

List of masses (Dalton) to decompose. If you have more than one mass, put one per line. Example:

2053.3
2247.5
4525.7

If multiple masses are given, each of them is decomposed separately.

Please note: When using an amino acid alphabet, only masses up to approximately 1600 Da can be decomposed in reasonable time (assuming an allowed mass error of 0.1 Da; with a lower mass error, larger masses can be decomposed). If the computation takes too long, it will be cancelled.

INPUT :: masses

List of masses (non-negative integers) to decompose. If you have more than one mass, put one per line; each mass is decomposed separately.

INPUT :: alphabet

The alphabet or, more precisely, the weighted alphabet tells Decomp into which constituents it should decompose the given masses. Some useful alphabets are predefined:
  • Select Nucleotides to decompose DNA molecules into nucleotides. The nucleotide alphabet consists of the four nucleotides Adenine, Cytosine, Guanine, and Thymine.
  • Select Amino acids to decompose proteins or peptides into amino acids. The amino acid alphabet consists of 19 amino acids. Isoleucine, which has the same mass as leucine, is not included.
  • Select Atoms to decompose (organic) molecules into H, C, N, O, P, and S.
Each of the three predefined alphabets exists in two versions: one with a monoisotopic and one with an average isotopic mass distribution.

When writing your own custom alphabet, follow the given schema. More precisely, an alphabet is a text file in which an alphabet character is defined in each line like this:
name mass
The name may be any sequence of letters, digits, and symbols, but it should not start or end with a digit. If it does, you will not get an error message, but you may get problems reading the results. The mass must be a positive real number, possibly in scientific notation. For example, 23, 42.89, 1e-5 and 2.3e7 are valid, but -7.3 and 0.0 are not. Name and mass must be separated by space or tab. If a line starts with a #, then the line is interpreted as a comment and ignored. Empty lines are also ignored.
Example (this is one of the predefined alphabets):
# nucleotide masses, monoisotopic distribution
A 313.0576050
C 289.0463716
G 329.0525197
T 304.0460373
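
The format rules above can be sketched as a small Python parser. This is an illustration only and not part of Decomp: the function name parse_alphabet and its error messages are hypothetical. It follows the rules as stated on this page: one "name mass" pair per line separated by space or tab, lines starting with # and empty lines ignored, and masses required to be positive.

```python
import math


def parse_alphabet(text):
    """Parse 'name mass' lines into a dict mapping name to mass (hypothetical helper)."""
    alphabet = {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        # Comment lines start with '#'; empty lines are ignored as well.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()  # splits on spaces and tabs
        if len(fields) != 2:
            raise ValueError(f"line {line_no}: expected 'name mass'")
        name, mass_text = fields
        mass = float(mass_text)  # also accepts scientific notation such as 1e-5
        if not (math.isfinite(mass) and mass > 0):
            raise ValueError(f"line {line_no}: mass must be a positive real number")
        alphabet[name] = mass
    return alphabet


example = """# nucleotide masses, monoisotopic distribution
A 313.0576050
C 289.0463716
G 329.0525197
T 304.0460373
"""
print(parse_alphabet(example))
```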

INPUT :: alphabet

The alphabet or, more precisely, the weighted alphabet tells Decomp into which constituents it should decompose the given masses.
An alphabet must be defined as follows:
name mass
The name may be any sequence of letters, digits, and symbols, but it should not start or end with a digit. If it does, you will not get an error message, but you may get problems reading the results. The mass must be a positive integer.

OUTPUT :: decomp output

Mass decomposition output.

Parameter

Name Description
Allowed Mass Error The allowed mass error gives a tolerance for the mass of the resulting decompositions. If, for example, your input mass is 2053.3 Da and the allowed mass error is set to 0.1 Da (absolute), then all decompositions with mass at least 2053.2 Da and at most 2053.4 Da will be computed. If any of the filtering options is enabled, however, not all of these decompositions will be shown.
computational precision The last option in this area is the computational precision, which determines how Decomp rounds real-valued masses to integers. This is necessary since the decomposition algorithm that Decomp uses works on integers. For example, if an input or alphabet mass is 1362.3418 Da and the precision is set to 0.01 Da, then the resulting integer mass will be 136234. Because Decomp determines this setting automatically, you usually do not have to worry about it and can leave the field blank.
Decomposition must contain at least If you know some constraints about the molecules you want to decompose, you can use this option to show only the relevant results. For example, if you know, when decomposing DNA, that there must be at least four Adenines, but at most six Cytosines, type A4 into the text field next to "Decomposition must contain at least" and type C6 into the field next to "Decomposition must contain at most". Multiple constraints can also be given, separated by spaces.
Decomposition must contain at most See description of 'Decomposition must contain at least'
chemically plausible decompositions If this option is selected, a variant of the SENIOR rule is used to filter out those decompositions that are chemically implausible.

Let S be the sum of valences of all atoms in the decomposition and n be the number of atoms. The two conditions are (from Kind and Fiehn):

  • S must be even.
  • S >= 2n-2

Decomp knows valences for all of the atoms in the predefined alphabet (CHNOPS). In addition, it knows the valences of Na, K, Cl, Si, Br, F, Mg, Fe, and I. All of these elements can therefore be used in a custom alphabet while the check for chemically plausible decompositions still works.
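
The two conditions above can be checked in a few lines of Python. The valence table in this sketch is an assumption made for illustration, since this page does not list the values Decomp uses; the names VALENCE and is_plausible are hypothetical as well.

```python
# Assumed valences for the CHNOPS alphabet (illustrative; Decomp's own values may differ).
VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2, "P": 3, "S": 2}


def is_plausible(counts, valence=VALENCE):
    """Check the two conditions: S must be even, and S >= 2n - 2."""
    n = sum(counts.values())  # number of atoms
    s = sum(valence[element] * count for element, count in counts.items())
    return s % 2 == 0 and s >= 2 * n - 2


print(is_plausible({"C": 1, "H": 4}))  # True: S = 8 is even and 8 >= 2*5 - 2
print(is_plausible({"C": 1, "H": 5}))  # False: S = 9 is odd
```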

actual mass for each decomposition If the option "Show actual mass for each decomposition" is selected, Decomp will include, for every decomposition, its actual mass. The actual mass may be different from the query mass when the allowed mass error is set to a value greater than zero.
deviation from query mass for each decomposition The actual deviation will be shown when "Show deviation from query mass for each decomposition" is selected.
best decomposition per input mass Show at most the given number of best decompositions per input mass.
maximal number of decomposition Maximal number of decompositions (only used if 'find all' is selected).
mass distribution
mass error unit Unit of the allowed mass error; either Da (absolute) or ppm (relative).
modification of the entire molecule Select a modification of the entire molecule here. The effect is that, before decomposition, the modification is 'undone' by subtracting the appropriate mass from, or adding it to, the input mass.
fixed modification If a fixed modification is selected, it is assumed that all amino acids of the given type have been modified. For example, if "Acetylation (M, +42 Da)" has been selected, then the decomposition algorithm will assume that all Methionines (M) have been acetylated. Internally, this is accomplished by changing the alphabet such that the mass of a Methionine is increased by 42 Da.
Variable modifications If a variable modification is selected, it is assumed that some amino acids of the given type have been modified. This includes the special cases that none or all have been modified. For example, if "Methylation (H, +14 Da)" has been selected, then the decomposition algorithm will assume that each Histidine occurring in the decomposition may or may not be methylated. Internally, this is accomplished by adding another Histidine (called H') to the alphabet whose mass is 14 Da greater than the mass of the regular Histidine. In the output, regular and modified versions can be distinguished since modified Histidines have a ' symbol appended whereas regular ones appear normally.
mass decomposition problem
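
The fixed and variable modifications described in the parameter list above both amount to changes of the alphabet. The following Python sketch shows the idea under stated assumptions: the function names apply_fixed and apply_variable are hypothetical, the residue masses are approximate monoisotopic values used only for illustration, and the nominal shifts of 42 Da and 14 Da are taken from the examples in the text.

```python
def apply_fixed(alphabet, name, delta):
    """Fixed modification: every occurrence of `name` carries the extra mass."""
    modified = dict(alphabet)
    modified[name] = alphabet[name] + delta
    return modified


def apply_variable(alphabet, name, delta):
    """Variable modification: add a primed copy of `name` with the extra mass."""
    modified = dict(alphabet)
    modified[name + "'"] = alphabet[name] + delta
    return modified


# Illustrative residue masses for Methionine and Histidine (approximate values).
amino_acids = {"M": 131.04049, "H": 137.05891}

acetylated = apply_fixed(amino_acids, "M", 42.0)      # M is replaced by a heavier M
methylated = apply_variable(amino_acids, "H", 14.0)   # H stays, H' is added
print(sorted(acetylated))
print(sorted(methylated))
```

With the variable version, a decomposition may mix H and H', which covers the cases where none, some, or all Histidines are modified.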

Overview

Decomp's submission form is split into five areas. They are:
  • Masses: What are the masses of the molecules you want to decompose?
  • Alphabet: Into what components do you want to decompose them? We call this the weighted alphabet (and just say alphabet from here on).
  • Constraints: Is there anything known about the molecules?
  • Modification: What posttranslational modifications are there?
  • Output and Filtering: How should the output be presented?
Decomp needs to know at least the masses and the alphabet (corresponding to the first two areas of the submission form). The three remaining areas are optional.

Masses