New Statistical Algorithms for the Analysis of Mass - FU Berlin, FB MI ...
3.4. HIGHLY SENSITIVE PEAK DETECTION
isotopes⁴, the more complex a molecule is, the more varieties with respect to
mass exist. For example, oxygen occurs in nature as three different (stable)
isotopes: ¹⁶O (99.765%; 8 protons, 8 neutrons), followed by the rare isotope
¹⁸O (0.1995%; 8 protons, 10 neutrons) and the even rarer isotope ¹⁷O
(0.0355%; 8 protons, 9 neutrons). The more complex the molecule, the more
combinations of isotopes (and hence masses) are possible. Because some
combinations are more likely than others, the isotopes are independent of each
other, and a sample usually contains a large number of copies of a particular
molecule, we observe a Gaussian-like shape (central limit theorem). Depending
on the type of machine used, this shape can be resolved into its isotopic
components (see Figure 3.4.9).
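The growth in the number of isotope combinations can be illustrated with a minimal sketch. It assumes that the atoms are independent, so the pattern for several atoms is the repeated convolution of the single-atom abundances. Only oxygen is modelled, with the abundances quoted above; the function names `convolve` and `isotope_pattern` are illustrative and not taken from the original work.

```python
# Abundances of the stable oxygen isotopes as quoted in the text,
# indexed by the number of extra neutrons relative to 16O:
# index 0 -> 16O, index 1 -> 17O, index 2 -> 18O.
OXYGEN = [0.99765, 0.000355, 0.001995]


def convolve(p, q):
    """Distribution of the sum of two independent discrete variables."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def isotope_pattern(element, n_atoms):
    """Probability of each total number of extra neutrons for n_atoms atoms."""
    pattern = [1.0]  # zero atoms: no extra neutrons with certainty
    for _ in range(n_atoms):
        pattern = convolve(pattern, element)
    return pattern


if __name__ == "__main__":
    for n in (1, 10, 100):
        pattern = isotope_pattern(OXYGEN, n)
        # Show the five lightest variants; heavier ones are increasingly rare.
        print(n, [round(p, 4) for p in pattern[:5]])
```

A real molecule contains several elements, so its pattern would be the convolution of the per-element patterns; that step is omitted here.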
Knowledge of the isotope distributions enables us to calculate exactly the
shape and position that a molecule's peak should have. If we find a
peak-like shape at a certain position, we can therefore determine its
similarity to the calculated shape and either accept the peak or perform
further analysis.
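The text does not say which similarity measure is used for the shape comparison. As a minimal sketch, assuming cosine similarity between intensity vectors sampled at the same positions and a hypothetical acceptance threshold of 0.9, the accept-or-analyse-further decision could look like this:

```python
import math


def cosine_similarity(observed, calculated):
    """Cosine of the angle between two equally long intensity vectors."""
    if len(observed) != len(calculated):
        raise ValueError("shapes must be sampled at the same positions")
    dot = sum(o * c for o, c in zip(observed, calculated))
    norm_o = math.sqrt(sum(o * o for o in observed))
    norm_c = math.sqrt(sum(c * c for c in calculated))
    if norm_o == 0.0 or norm_c == 0.0:
        return 0.0  # an all-zero shape matches nothing
    return dot / (norm_o * norm_c)


def accept_peak(observed, calculated, min_similarity=0.9):
    """True if the observed shape is close enough to the calculated one.

    min_similarity is a hypothetical cut-off chosen for illustration.
    """
    return cosine_similarity(observed, calculated) >= min_similarity
```

Cosine similarity ignores the overall intensity scale, which suits a comparison of shapes; other measures, such as a correlation coefficient, would serve equally well in this sketch.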
3.4.2 Common Approaches
Almost all peak detection algorithms rely on the shape comparison technique
described above. They usually differ in how they detect candidate peaks.
What they have in common is the use of threshold-driven detection
techniques: each candidate peak must exceed a predetermined
signal-to-noise threshold that depends on the calculated noise level
(see, e.g., McDonough and Whale, 1995).
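Threshold-driven detection of this kind can be sketched in a few lines. The representation of a candidate as a `(position, height)` pair and the function name `threshold_filter` are assumptions made for illustration; how the noise level is estimated is left open here.

```python
def threshold_filter(candidates, noise_level, snr_threshold=3.0):
    """Split candidate peaks by a signal-to-noise cut-off.

    candidates: list of (position, height) pairs (hypothetical layout).
    A candidate is kept only if height / noise_level > snr_threshold.
    """
    cutoff = snr_threshold * noise_level
    kept = [c for c in candidates if c[1] > cutoff]
    discarded = [c for c in candidates if c[1] <= cutoff]
    return kept, discarded


if __name__ == "__main__":
    # Made-up example data, not taken from any real spectrum.
    peaks = [(100.0, 40.0), (200.0, 150.0), (300.0, 500.0)]
    kept, discarded = threshold_filter(peaks, noise_level=50.0)
    print("kept:", kept)
    print("discarded:", discarded)
```

Everything in `discarded` is lost to later processing steps, regardless of how well its shape might match a calculated isotope pattern.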
Drawbacks of Common Approaches
As the example in Fig. 3.4.10 shows, assuming a noise level of 50⁵ and using
a signal-to-noise ratio of 3⁶, about 85% of the 1332 potential (candidate) peaks
in this particular spectrum would be discarded and the information they carry
lost. Although most of these peaks are essentially noise, some might carry
important information. These artificially introduced barriers would thus
prevent the detection of small signals at a very early pre-processing stage.
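The numbers in this example work out roughly as follows. Since the 85% figure is itself approximate, the counts are approximate too, and the second cutoff simply applies the same ratio to the highest noise estimate mentioned in the footnote:

```latex
\begin{align*}
\text{cutoff} &= \text{S/N} \times \text{noise level} = 3 \times 50 = 150,\\
\text{discarded} &\approx 0.85 \times 1332 \approx 1132 \text{ candidate peaks},\\
\text{retained} &\approx 1332 - 1132 = 200 \text{ candidate peaks},\\
\text{cutoff at noise level } 150 &= 3 \times 150 = 450.
\end{align*}
```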
The subsequent sections describe our new approaches to overcoming this
signal-to-noise barrier, that is, to increasing sensitivity without decreasing
specificity.
3.4.3 Our Approach
To avoid losing potentially important information by ignoring (small)
peaks during preprocessing, we take the simplest solution and regard as a
candidate peak everything that has a start point $P_{i,s} \in S$ and an end
point $P_{i,e} \in S$, where $S = \{s_1, \dots, s_n\}$ is the set of $n$ points
defining a spectrum. The tuple $(P_{i,s}, P_{i,e})$ then defines the $i$th
candidate peak, ranging from $S_s$ to $S_e$. The requirements that these
points must meet are:
⁴ Atoms with the same number of protons but different numbers of neutrons
are called isotopes. Different isotopes belong to the same element because they have the same
number of protons; since neutral atoms then also have the same number of electrons, they all
behave almost the same in chemical reactions. Isotopes of the same element can be
separated by physical and chemical methods, which were developed on a large scale during
the Second World War.
⁵ Different noise estimators compute values ranging from 50 to 150.
⁶ A commonly used value for obtaining reliable results.
Figure 3.4.9: Sample spectrum. The inset shows a comparison of (a) experimental and (b) calculated isotope distribution patterns for the peak at m/z 811.