New Statistical Algorithms for the Analysis of Mass - FU Berlin, FB MI ...
CHAPTER 3. MATHEMATICAL MODELING AND ALGORITHMS

3.7 Identifying Potential Features
This step identifies masterpeaks (potential features) that can be used to discriminate between two groups based on their respective properties (e.g., differences in average height, see Fig. 3.6.13). That is, we consider a masterpeak a feature if it can be used to discriminate two sets of spectra. For example, if a masterpeak at position x occurs in only one of these spectra sets, it is a feature, since the detection or absence of this peak clearly assigns a spectrum to one of the groups.
3.7.1 Our Approach
After the preprocessing steps we now have information about the masterpeaks of the two patient (spectra) groups under scrutiny. To enable the creation of fingerprints (see Section 3.8) we first need to create a set of potential differences between these two groups of spectra. We define two spectra to be different (with respect to one particular property) if

a) a masterpeak existent in one group does not occur in the other group, or

b) a masterpeak exists in both groups but differs significantly in some property between the two groups.

In other words, the feature detection step identifies a set of masterpeaks that differ significantly in particular properties (e.g., height, width) between two groups of spectra with respect to some metric. With this information we can subsequently search for subsets / patterns (fingerprints) by detecting and then selecting the most significant combinations of features.
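Criteria (a) and (b) above can be sketched as a small classification routine. The function name, the peak representation, and the significance threshold below are illustrative assumptions, not part of the original method:

```python
def is_potential_feature(in_group_a, in_group_b, divergence=None, threshold=0.1):
    """Classify a masterpeak as a potential feature.

    in_group_a / in_group_b: whether the masterpeak occurs in each spectra group.
    divergence: distance between its property distributions in the two groups
    (only meaningful when the peak occurs in both groups).
    threshold: illustrative significance cut-off, not a value from the text.
    """
    # Criterion (a): the masterpeak occurs in only one of the two groups.
    if in_group_a != in_group_b:
        return True
    # Criterion (b): it occurs in both groups, but its property
    # distributions differ significantly between the groups.
    if in_group_a and in_group_b:
        return divergence is not None and divergence > threshold
    # Occurs in neither group: nothing to discriminate on.
    return False
```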
Choosing <strong>the</strong> Metric<br />
A metric or distance function defines a distance between two elements of a set. The elements of our set are masterpeaks, which are defined by the property distributions of their assigned single peaks, such as m/z values, heights, or areas. What we want is a distance function that evaluates to some very large number (or infinity) if it does not make sense to compare two masterpeaks (that is, if their respective m/z values are too different), and that otherwise incorporates the (dis-)similarity of their property distributions.
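A minimal sketch of such a distance function follows. The m/z tolerance, the dictionary representation of a masterpeak, and the total-variation stand-in dissimilarity are assumptions for illustration (the text settles on the Jensen-Shannon divergence below):

```python
def total_variation(p, q):
    # Simple stand-in (dis-)similarity between two discrete distributions.
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

def masterpeak_distance(peak_a, peak_b, dissimilarity=total_variation,
                        mz_tolerance=1.0):
    """Distance between two masterpeaks, each represented (illustratively)
    as a dict with an 'mz' position and a property distribution 'dist'."""
    # Not comparable: the m/z positions are too far apart.
    if abs(peak_a["mz"] - peak_b["mz"]) > mz_tolerance:
        return float("inf")
    # Otherwise: the dissimilarity of the property distributions.
    return dissimilarity(peak_a["dist"], peak_b["dist"])
```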
Therefore, we need a (symmetric) method of measuring the similarity between two probability distributions. We found it in the Jensen-Shannon (JS) divergence (see, e.g., (Gómez-Lopera et al., 2000) and references therein), because it can be computed quickly, has shown good results in similar applications, and does not assume strong properties of the data, such as normality. For probability distributions P and Q of a discrete variable, the JS divergence is defined in terms of the Kullback-Leibler (KL) divergence as follows:
Definition 3.7.1. Kullback-Leibler (KL) divergence (S. Kullback, 1951):
\[
D_{KL}(P \,\|\, Q) = \sum_i P(i) \log \frac{P(i)}{Q(i)}.
\]

Definition 3.7.2. Jensen-Shannon (JS) divergence (Lin, 1991):
\[
D_{JS} = \frac{1}{2}\, D_{KL}\!\left(P \,\middle\|\, \frac{P+Q}{2}\right) + \frac{1}{2}\, D_{KL}\!\left(Q \,\middle\|\, \frac{P+Q}{2}\right).
\]
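The two definitions above translate directly into code. The following is a minimal sketch for discrete distributions given as equal-length probability vectors, using the usual convention 0 · log 0 = 0 for empty bins:

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) = sum_i P(i) * log(P(i) / Q(i)); terms with P(i) = 0
    contribute nothing by the 0 * log 0 = 0 convention."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """D_JS = 1/2 * D_KL(P || M) + 1/2 * D_KL(Q || M), with M = (P + Q) / 2."""
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

Unlike the KL divergence, the JS divergence is symmetric in P and Q and bounded (by log 2 for the natural logarithm), which is what makes it usable as the basis of a distance between masterpeaks.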
Of course there are other probability distance measures, for example histogram intersection (Jia et al., 2006), the Kolmogorov-Smirnov distance (Fasano and Franceschini, 1987), or the earth mover's distance (Rubner et al., 2000), but these usually impose quite strong requirements on the data.