08.02.2013 Views

New Statistical Algorithms for the Analysis of Mass - FU Berlin, FB MI ...


3.4. HIGHLY SENSITIVE PEAK DETECTION

isotopes (see footnote 4), the more complex a molecule is, the more varieties with respect to mass exist. For example, oxygen occurs in nature as three different (stable) isotopes: ¹⁶O (99.765%; 8 protons, 8 neutrons), followed by the rare isotope ¹⁸O (0.1995%; 8 protons, 10 neutrons) and the even rarer isotope ¹⁷O (0.0355%; 8 protons, 9 neutrons). Obviously, the more complex the molecule, the more combinations of isotopes (and hence masses) are possible. Since some combinations are more likely than others, the isotopes are independent of each other, and there is usually a large number of copies of a particular molecule, we see a Gaussian-like shape (central limit theorem). Depending on the type of machine used, this shape can be resolved into its isotopic components (see Figure 3.4.9).
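To make the combinatorial argument concrete, the following minimal sketch builds the mass distribution of a group of independent atoms by repeated convolution, using the oxygen abundances quoted above. It works with nominal (integer) mass numbers only, which is a simplifying assumption, and the names `OXYGEN`, `convolve` and `isotope_pattern` are illustrative rather than taken from this work.

```python
from collections import defaultdict

# Abundances quoted in the text, keyed by nominal mass number (assumption:
# integer masses are used instead of exact isotope masses).
OXYGEN = {16: 0.99765, 17: 0.000355, 18: 0.001995}


def convolve(dist_a, dist_b):
    """Combine the mass distributions of two independent parts."""
    combined = defaultdict(float)
    for mass_a, prob_a in dist_a.items():
        for mass_b, prob_b in dist_b.items():
            # Independence: probabilities multiply, masses add.
            combined[mass_a + mass_b] += prob_a * prob_b
    return dict(combined)


def isotope_pattern(element, count):
    """Nominal-mass distribution for `count` atoms of a single element."""
    pattern = {0: 1.0}  # zero atoms: mass 0 with certainty
    for _ in range(count):
        pattern = convolve(pattern, element)
    return pattern


if __name__ == "__main__":
    pattern = isotope_pattern(OXYGEN, 2)
    for mass in sorted(pattern):
        print(mass, f"{pattern[mass]:.6f}")
```

With more atoms (and more elements, each convolved in the same way) the number of possible masses grows and the envelope of the pattern becomes increasingly bell-shaped, which is the behaviour described above.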

The knowledge of the isotope distributions enables us to calculate exactly the shape and position a molecule's peak will (should) have. So if we find a peak-like shape at a certain position, we can determine its similarity to the calculated shape and either accept this shape or perform further analysis.
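One possible way to quantify the similarity between an observed and a calculated shape is sketched below. The text does not specify the similarity measure, so the cosine similarity used here, the function names and the acceptance threshold of 0.9 are all assumptions made purely for illustration.

```python
import math


def cosine_similarity(observed, calculated):
    """Cosine of the angle between two equally long intensity vectors."""
    if len(observed) != len(calculated):
        raise ValueError("shapes must be sampled at the same positions")
    dot = sum(o * c for o, c in zip(observed, calculated))
    norm_o = math.sqrt(sum(o * o for o in observed))
    norm_c = math.sqrt(sum(c * c for c in calculated))
    if norm_o == 0.0 or norm_c == 0.0:
        return 0.0  # an all-zero shape is treated as dissimilar to everything
    return dot / (norm_o * norm_c)


def accept_peak(observed, calculated, min_similarity=0.9):
    """Accept the candidate if it is close enough to the calculated shape.

    `min_similarity` is a hypothetical cut-off, not a value from the text.
    """
    return cosine_similarity(observed, calculated) >= min_similarity
```

Cosine similarity ignores the absolute scale of the intensities, so a weak peak with the right isotope pattern scores as highly as a strong one; candidates that fail the check would be passed on to further analysis rather than discarded.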

3.4.2 Common Approaches

Almost all peak detection algorithms rely on the shape comparison technique described above. They usually differ in how they detect candidate peaks. What they have in common is the use of threshold-driven detection techniques. That is, each candidate peak must be higher than a predetermined signal-to-noise threshold that depends on the calculated noise level (see e.g. McDonough and Whale, 1995).
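A threshold-driven filter of this kind can be sketched as follows. We assume here that the cut-off is the product of the signal-to-noise threshold and the estimated noise level; the function name and the intensity values are made up for illustration, and the noise estimate is taken as given rather than computed.

```python
def threshold_filter(candidate_heights, noise_level, snr_threshold=3.0):
    """Split candidate peak heights into kept and discarded lists.

    Assumption: a candidate is kept only if its height exceeds
    snr_threshold * noise_level.
    """
    cutoff = snr_threshold * noise_level
    kept = [h for h in candidate_heights if h > cutoff]
    discarded = [h for h in candidate_heights if h <= cutoff]
    return kept, discarded


# Made-up candidate heights, in the same arbitrary units as the noise level.
heights = [40.0, 120.0, 151.0, 900.0, 75.0]
kept, discarded = threshold_filter(heights, noise_level=50.0)
print(f"kept {len(kept)} of {len(heights)} candidates")
```

Every candidate in `discarded` is removed before any shape comparison takes place, regardless of how well its shape would have matched.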

Drawbacks of Common Approaches

As shown by way of example in Fig. 3.4.10, assuming a noise level of 50 (see footnote 5) and using a signal-to-noise ratio of 3 (see footnote 6), about 85% of the 1332 potential (candidate) peaks in this particular spectrum would be discarded and their associated information lost. Although most of these peaks are essentially noise, some might carry important information. This means that these artificially introduced barriers would prevent the detection of small signals at a very early preprocessing stage.
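As a rough worked example of the numbers quoted above, and assuming that the height cut-off is simply the signal-to-noise ratio multiplied by the noise level:

```latex
\[
\text{cut-off} = 3 \times 50 = 150,
\qquad
0.85 \times 1332 \approx 1132 \ \text{candidates discarded},
\qquad
1332 - 1132 = 200 \ \text{candidates retained}.
\]
```

Under the same assumption, a noise estimate at the upper end of the range given in footnote 5 would raise the cut-off to 3 × 150 = 450.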

The subsequent sections describe our new approaches to overcoming this signal-to-noise barrier, that is, to increasing sensitivity without decreasing specificity.

3.4.3 Our Approach

To avoid losing potentially important information by not considering (small) peaks in the preprocessing, we take the simplest solution and regard everything as a candidate peak that has a start point $P_{i,s} \in S$ and an end point $P_{i,e} \in S$, where $S = s_1 \dots s_n$ is the set of $n$ points defining a spectrum. The tuple $(P_{i,s}, P_{i,e})$ then defines the $i$th candidate peak, ranging from $S_s$ to $S_e$. The requirements for these points to meet are:

4 Atoms with the same number of electrons and protons, but different numbers of neutrons, are called isotopes. Different isotopes belong to the same element because they have the same number of electrons, which means that they all behave almost the same in chemical reactions. It was discovered during the Second World War that isotopes of the same element can be separated by physical and chemical methods.

5 Different noise estimators compute values ranging from 50 to 150.

6 A commonly used value to get reliable results.

Figure 3.4.9: Sample Spectrum. The inset shows a comparison of (a) experimental and (b) calculated isotope distribution patterns for the peak at m/z 811.
