New Statistical Algorithms for the Analysis of Mass - FU Berlin, FB MI ...
Chapter 5
Computer Science Grid Strategies
Contents
5.1 Introduction
5.2 The Quasi Ad-hoc (QAD) Grid
5.3 QAD Grid Platform Server
5.4 QAD Grid Worker
5.5 QAD Grid Platform Services
5.6 QAD Grid Workflows
5.7 Related Work
The previous chapters have introduced new methods for the analysis of
mass spectrometry data. The main advantage of our algorithms was increased
sensitivity, which unfortunately came at the cost of more complex computations
and a larger amount of data. To speed up these calculations we developed a new
framework that can split up the analysis tasks and distribute the resulting
sub-tasks to compute machines organized in an environment we call the quasi
ad-hoc Grid. This framework and its advantages (e.g. over commonly used compute
clusters) are described in the following sections.
5.1 Introduction<br />
Driven by increasingly complex problems, new technologies and machines producing
gigabytes of data each day, today’s science relies on computational
power and data analysis as never before. But even as computing
power and data storage continue to improve exponentially¹, these resources
are failing to keep up with what scientists demand of them. As an example,
scientists back in 1990 were content to assemble small parts of the DNA sequence
information of a chromosome. These days they want to assemble the complete
human genome, and several physics projects, such as CERN’s Large Hadron
Collider, produce multiple petabytes of data per year. Of course, current
desktop computers are much more powerful than supercomputers in the early
1990s and the storage a PC ships with is comparable with an entire 1990
¹ According to Moore’s Law, which states that semiconductor power doubles roughly
every 18 months.