New Statistical Algorithms for the Analysis of Mass - FU Berlin, FB MI ...
Chapter 7
Related Work
In this chapter we give a brief overview of other projects related to this
thesis. Since related algorithms and concepts have already been discussed
in the relevant chapters, we focus here on whole pipelines or frameworks
for the analysis of protein MS TOF data. These pipelines can be roughly
categorized into three groups:
1. Collections of stand-alone tools for data processing (e.g. peak picking,
identification or quantitation).
2. Integrated software platforms that offer data processing tools and data
management functionality, usually providing a graphical user interface
that allows for the assembly and execution of workflows. Since these
platforms run on a single machine, they are not well suited for the analysis
of very large datasets.
3. Software platforms that offer data processing and data management and
support distributed computation of their algorithms. In contrast to the
previous category, these frameworks are also well suited to handling very
large datasets.
Interestingly, the shift from the development of stand-alone tools to integrated
platforms has only become widespread since the increase in data volume became
an issue. Still, most of these systems address only a part of the pipeline
described in this thesis. High-throughput laboratories such as the Seattle
Proteome Center (Trans-Proteomics Pipeline¹, TPP (Kiebel et al., 2006)), the
Institute of Molecular Systems Biology at ETH Zürich (Superhirn (Mueller
et al., 2007)) or the Institute of Biomedical Engineering at Imperial College
London (ProteomeGRID (Dowsey et al., 2004)) have developed significant
platforms with similar functionality. However, none of them covers the
full discovery process lifecycle, including distributed computing
support, as our platform does.
Other software packages such as mzMine² by the Turku Centre for Biotechnology
(Katajamaa et al., 2006), OpenMS³ by Freie Universität Berlin (Kohlbacher
et al., 2007) or XCMS⁴ by the Scripps Center for Mass Spectrometry (Smith et al.,
2006) are not (yet) complete platforms, since they lack a graphical user interface
as well as workflow and data management functionality.
1 http://tools.proteomecenter.org/wiki/index.php?title=Software:TPP
2 http://mzmine.sourceforge.net/
3 http://www.openms.de
4 http://metlin.scripps.edu/download/