08.02.2013 Views

New Statistical Algorithms for the Analysis of Mass - FU Berlin, FB MI ...


32 CHAPTER 3. MATHEMATICAL MODELING AND ALGORITHMS

in some of our test cases they alter the relative peak intensities on which
subsequent analyses rely (e.g. to obtain the relative compound concentrations
correctly). This has also been found by others (e.g. Li et al., 2007).
Our preliminary experiments have shown that non-parametric techniques are
more appropriate. Among these, a wavelet shrinkage approach (for a review see
Taswell, 2000; a nice introduction can be found in Louis et al., 1998)
performed best on artificial data and on data from spiked experiments where we
knew the peak positions and intensities. Wavelet shrinkage has also been
reported to be very well suited for denoising mass spectrometry data (see e.g.
Ojanen et al., 2004; Liu, Sera, Matsubara, Otsuka and Terabe, 2003; Coombes
et al., 2005).

In contrast to other denoising algorithms, such as moving-average or low-pass
filters (e.g. Savitzky-Golay), this approach exploits the multi-scale nature
of the signal and therefore has better energy conservation properties, that is,
the amplitude of the signal decreases less through denoising.
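
The amplitude loss caused by a linear smoother can be illustrated with a small sketch. The test signal (a single Gaussian peak), the window width and the helper name `moving_average` are assumptions made for illustration only; they are not taken from the experiments described here.

```python
import numpy as np

def moving_average(x, width):
    """Smooth x with a centered boxcar window of odd length `width`."""
    kernel = np.ones(width) / width
    return np.convolve(x, kernel, mode="same")

# Hypothetical test signal: one narrow Gaussian peak of height 1.0.
t = np.arange(201)
peak = np.exp(-0.5 * ((t - 100) / 3.0) ** 2)

smoothed = moving_average(peak, width=11)

# The area under the peak is preserved, but its height is reduced.
print(f"peak height before smoothing: {peak.max():.3f}")
print(f"peak height after smoothing:  {smoothed.max():.3f}")
print(f"area before / after: {peak.sum():.3f} / {smoothed.sum():.3f}")
```

The smoother spreads the peak over its window, so the maximum drops although the area stays the same. This is the kind of intensity distortion that matters when relative peak heights are compared later.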

The multi-scale method used here is based on a time-invariant discrete
orthogonal wavelet transformation (Nason and Silverman, 1995) because
orthogonal wavelets can give the most compact representation of a signal.
During the wavelet transform, high- and low-pass filters are applied to the
signal at each scale. The output of the high-pass filter is recorded as the
wavelet coefficients and represents the details of the signal. The low-pass
filter extracts the low-frequency components, which are passed to the next
stage, where another set of high- and low-pass filters is employed.
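
The filter cascade described above can be sketched with the Haar wavelet, the simplest orthogonal wavelet. Note the assumptions: this sketch uses the ordinary decimated transform, not the time-invariant variant of Nason and Silverman used in this work, and the function names `haar_step` and `haar_decompose` as well as the sample values are hypothetical.

```python
import numpy as np

def haar_step(signal):
    """One analysis stage: return (approximation, detail), each half as long."""
    pairs = signal.reshape(-1, 2)
    approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)  # low-pass output
    detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)  # high-pass output
    return approx, detail

def haar_decompose(signal, levels):
    """Apply haar_step repeatedly to the low-pass output.

    The signal length must be divisible by 2**levels.
    Returns the final approximation and one detail array per level.
    """
    approx = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)  # wavelet coefficients of this scale
    return approx, details

x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
approx, details = haar_decompose(x, levels=3)

for level, d in enumerate(details, start=1):
    print(f"level {level} details: {d}")
print(f"final approximation: {approx}")

# Orthogonality means the transform preserves the signal energy.
energy_in = np.sum(x ** 2)
energy_out = np.sum(approx ** 2) + sum(np.sum(d ** 2) for d in details)
print(f"energy before / after: {energy_in:.3f} / {energy_out:.3f}")
```

Each stage halves the number of low-pass samples, so the detail arrays describe progressively coarser scales of the same signal.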

Wavelet shrinkage denoising involves non-linear soft thresholding (shrinking)
in the wavelet transform domain (Donoho, 1995). It is based on the assumption
that the wavelet coefficients of the true signal have high amplitude, whereas
the coefficients of lowest magnitude represent the noise (see Figure 3.3.5 for
an example). Thus, by eliminating coefficients that are smaller than a
predetermined threshold, this noise can be removed. In summary, it is a
three-step process: a linear forward wavelet transform, a non-linear shrinkage
denoising of the resulting coefficients and a linear inverse wavelet transform.
None of these steps needs a priori parameterization of a particular model.
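
The three steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: it uses a decimated Haar transform instead of the time-invariant transform, and it replaces the data-adaptive per-level threshold with the universal threshold of Donoho and Johnstone, with the noise level estimated from the median absolute deviation of the finest details. All function names and the synthetic signal are hypothetical.

```python
import numpy as np

def forward(x, levels):
    """Step 1: linear forward Haar transform (length divisible by 2**levels)."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        pairs = approx.reshape(-1, 2)
        details.append((pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0))
        approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)
    return approx, details

def soft_threshold(coeffs, lam):
    """Step 2: zero coefficients below lam, shrink the others towards zero."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - lam, 0.0)

def inverse(approx, details):
    """Step 3: linear inverse Haar transform."""
    for detail in reversed(details):
        out = np.empty(2 * approx.size)
        out[0::2] = (approx + detail) / np.sqrt(2.0)
        out[1::2] = (approx - detail) / np.sqrt(2.0)
        approx = out
    return approx

def denoise(x, levels, thresholds):
    """Run all three steps with one threshold per decomposition level."""
    approx, details = forward(x, levels)
    shrunk = [soft_threshold(d, lam) for d, lam in zip(details, thresholds)]
    return inverse(approx, shrunk)

# Hypothetical data: a broad Gaussian peak with additive Gaussian noise.
rng = np.random.default_rng(0)
t = np.arange(512)
clean = 10.0 * np.exp(-0.5 * ((t - 256) / 20.0) ** 2)
noisy = clean + rng.normal(scale=0.5, size=t.size)

levels = 4
_, noisy_details = forward(noisy, levels)
# Assumption: universal threshold with a MAD noise estimate (stand-in only).
sigma = np.median(np.abs(noisy_details[0])) / 0.6745
lam = sigma * np.sqrt(2.0 * np.log(noisy.size))
denoised = denoise(noisy, levels, [lam] * levels)

print(f"mean squared error, noisy:    {np.mean((noisy - clean) ** 2):.4f}")
print(f"mean squared error, denoised: {np.mean((denoised - clean) ** 2):.4f}")
```

Only the detail coefficients are thresholded; the final low-pass output is left untouched, so the coarse shape of the signal passes through unchanged.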

We now give a more formal definition. We assume we are given a noisy
signal (the raw data) $X$ consisting of $m$ (noisy) samples

$$X(t) = S(t) + \varepsilon(t), \quad t = 1, \dots, m.$$

This raw signal $X$ is assumed to contain the real signal $S$ with additive
noise $\varepsilon$. Let $W(\cdot)$ denote the forward and $W^{-1}(\cdot)$ the
inverse wavelet transform operator, and let $D_s(\cdot, \lambda)$ be the
denoising operator with a data-adaptive soft threshold $\lambda$. Our aim is to
denoise $X$ to obtain an estimate $\hat{S}$ that is as close as possible to the
original signal $S$. The three-step process mentioned above is then:

• Linear forward wavelet transform: $Y = W(X)$. This results in $m$ noisy
  wavelet coefficients $y_{j,k}$.

• Non-linear shrinkage denoising, that is, thresholding the wavelet
  coefficients, which includes estimating the threshold
  $\lambda = \{\lambda_1, \dots, \lambda_j\}$ for each level $j$ (see below):

  – $\lambda = d(Y)$

  – $Z = D_s(Y, \lambda)$
