08.02.2013 Views

New Statistical Algorithms for the Analysis of Mass - FU Berlin, FB MI ...



52 CHAPTER 3. MATHEMATICAL MODELING AND ALGORITHMS

that a masterpeak is a set of peaks where each single peak stems from a distinct spectrum and represents a molecule (or peptide) of a certain weight. If we now find a masterpeak in two different groups at the same m/z position, we have most likely found the same molecule (peptide) in the two groups. If these masterpeaks differ significantly in height, we can use them to differentiate between the groups. However, if the two masterpeaks have similar height, we call them noninformative and want them to be tagged as such.

Following this, we first have to match masterpeaks of the two groups occurring at the same m/z position. This process is called Masterpeak Alignment. Obviously, sometimes a masterpeak from group S1 cannot be matched with a masterpeak from group S2 because this molecule (peptide) is just not present in S2. We have implemented two different approaches to solve this problem: a naive version, and an implementation as a Minimum Weight Maximum Cardinality Bipartite Matching formulated as a Linear Program and solved by the Munkres (Hungarian) Algorithm. The two approaches are compared in section 3.7.3.

Approach 1: Naive Solution

Algorithm 5 Masterpeak Assignment - Naive

Require: Lists of masterpeaks MP1, MP2 of groups 1 & 2 respectively
while MP1 has more elements do
    MP1cur ← next element from MP1
    MP2candidates ← { s ∈ MP2 : |m/z(s) − m/z(MP1cur)| ≤ 2 }
    if |MP2candidates| = 0 then
        insert (MP1cur, ∅) into MP_PAIRS
    else
        for all p ∈ MP2candidates do
            insert (MP1cur, p) into MP_PAIRS
            mark p (in list MP2) as processed
        end for
    end if
end while
for all p ∈ MP2 do
    if p is not marked as processed then
        insert (∅, p) into MP_PAIRS
    end if
end for
return MP_PAIRS: tuples of aligned pairs of masterpeaks.
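The naive assignment can be sketched in Python as follows. This is a minimal illustration, not the implementation used in this work: it assumes that each masterpeak is represented only by its m/z position (a float), that the tolerance of 2 is given in the same unit as the m/z values, and that an empty partner (∅) is encoded as None. The function name align_naive and the parameter tol are hypothetical.

```python
from typing import List, Optional, Tuple

# A pair (peak from group 1, peak from group 2); None stands for the empty partner.
Pair = Tuple[Optional[float], Optional[float]]


def align_naive(mp1: List[float], mp2: List[float], tol: float = 2.0) -> List[Pair]:
    """Pair every masterpeak of group 1 with all group 2 masterpeaks within tol."""
    pairs: List[Pair] = []
    processed = set()  # indices of group 2 masterpeaks that found a partner

    for cur in mp1:
        # All group 2 masterpeaks whose m/z lies within the tolerance window.
        candidates = [j for j, mz in enumerate(mp2) if abs(mz - cur) <= tol]
        if not candidates:
            pairs.append((cur, None))
        else:
            for j in candidates:
                pairs.append((cur, mp2[j]))
                processed.add(j)

    # Group 2 masterpeaks that were never matched are reported on their own.
    for j, mz in enumerate(mp2):
        if j not in processed:
            pairs.append((None, mz))
    return pairs


if __name__ == "__main__":
    # Made-up m/z values, chosen so that 1001.0 falls into two tolerance windows.
    for pair in align_naive([1000.0, 1001.5, 1500.0], [1001.0, 1003.0, 2000.0]):
        print(pair)
```

With these made-up values the group 2 masterpeak at 1001.0 is paired with both 1000.0 and 1001.5, and 1001.5 is in turn paired with two group 2 masterpeaks, so the result is not a one-to-one assignment.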

In this approach we simply check for masterpeaks in both groups that have similar m/z values. An obvious problem here is that peaks can be assigned more than once and that no similarity measure is used to increase the quality of the assignments.

Approach 2: Bipartite Graph Matching

In the second approach we reformulate the problem as a Minimum Weight Maximum Cardinality Bipartite Matching (assignment) problem. We are given the two sets of masterpeaks (MP1, MP2), which can be seen as vertices of a graph. This graph G = (V, E) is bipartite because the vertex set V is subdivided into two sets.
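To make the matching formulation concrete, the following Python sketch builds such a bipartite graph and searches for a minimum weight maximum cardinality matching. Several points are assumptions made only for illustration: masterpeaks are represented by their m/z positions, an edge exists only between masterpeaks whose m/z values differ by at most tol, and the edge weight is the absolute m/z difference. The search is a plain exhaustive recursion that is only practical for very small inputs; it is not the Munkres (Hungarian) Algorithm. All function and parameter names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Edge = Tuple[int, int]  # (index into MP1, index into MP2)


def build_edges(mp1: List[float], mp2: List[float], tol: float = 2.0) -> Dict[Edge, float]:
    """Edges of the bipartite graph; the weight is the m/z distance (an assumption)."""
    return {
        (i, j): abs(a - b)
        for i, a in enumerate(mp1)
        for j, b in enumerate(mp2)
        if abs(a - b) <= tol
    }


def min_weight_max_cardinality_matching(
    mp1: List[float], mp2: List[float], tol: float = 2.0
) -> Tuple[int, float, List[Edge]]:
    """Return (cardinality, total weight, matched index pairs) by exhaustive search."""
    weights = build_edges(mp1, mp2, tol)

    def search(i: int, used: FrozenSet[int]) -> Tuple[int, float, List[Edge]]:
        if i == len(mp1):
            return (0, 0.0, [])
        # Option 1: leave masterpeak i of group 1 unmatched.
        best = search(i + 1, used)
        # Option 2: match it to any free, admissible masterpeak of group 2.
        for j in range(len(mp2)):
            if j in used or (i, j) not in weights:
                continue
            card, total, pairs = search(i + 1, used | {j})
            cand = (card + 1, total + weights[(i, j)], [(i, j)] + pairs)
            # Prefer more matched pairs first, then the smaller total weight.
            if (cand[0], -cand[1]) > (best[0], -best[1]):
                best = cand
        return best

    return search(0, frozenset())


if __name__ == "__main__":
    # Made-up m/z values for the two groups.
    print(min_weight_max_cardinality_matching([1000.0, 1001.5], [1001.0, 1003.0]))
```

In this made-up example the closest single pair is (1001.5, 1001.0), but choosing it would leave 1000.0 without a partner. Because cardinality is maximized first, the matching instead pairs 1000.0 with 1001.0 and 1001.5 with 1003.0, and every masterpeak is used at most once.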
