08.02.2013 Views

New Statistical Algorithms for the Analysis of Mass - FU Berlin, FB MI ...


Chapter 5

Computer Science Grid Strategies

Contents

5.1 Introduction
5.2 The Quasi Ad-hoc (QAD) Grid
5.3 QAD Grid Platform Server
5.4 QAD Grid Worker
5.5 QAD Grid Platform Services
5.6 QAD Grid Workflows
5.7 Related Work

The previous chapters have introduced new methods for the analysis of mass spectrometry data. The main advantage of our algorithms is increased sensitivity, which unfortunately comes at the cost of more complex computations and larger amounts of data. To speed up these calculations we developed a new framework that splits the analysis tasks and distributes the resulting sub-tasks to compute machines organized in an environment we call the quasi ad-hoc Grid. This framework and its advantages (e.g. over commonly used compute clusters) are described in the following sections.

5.1 Introduction

Driven by increasingly complex problems, new technologies, and machines producing gigabytes of data each day, today’s science depends on computational power and data analysis as never before. But even as computing power and data storage continue to improve exponentially¹, these resources are failing to keep up with what scientists demand of them. As an example, scientists back in 1990 were happy assembling small parts of the DNA sequence information of a chromosome. These days they want to assemble the complete human genome, and several physics projects, such as CERN’s Large Hadron Collider, produce multiple petabytes of data per year. Of course, current desktop computers are much more powerful than supercomputers in the early 1990s, and the storage a PC ships with is comparable with an entire 1990

¹ According to Moore’s Law, which states that semiconductor power doubles roughly every 18 months.
