08.02.2013 Views

New Statistical Algorithms for the Analysis of Mass - FU Berlin, FB MI ...


CHAPTER 5. COMPUTER SCIENCE GRID STRATEGIES

3. The workflow execution service stores the state of all current variables and the new active node in the database. This allows for resuming from this point after a system crash.

4. The workflow execution service sets the returned node to active and starts over with step (1).
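Steps (3) and (4) can be illustrated with a minimal sketch. It is not the implementation described in this chapter: the table layout, the use of SQLite and JSON, and all function names (`open_store`, `save_checkpoint`, `load_checkpoint`, `run_workflow`) are assumptions made for illustration, and each node is assumed to be a function that updates the variables and returns the name of the next node (or `None` when the workflow ends).

```python
import json
import sqlite3


def open_store(path=":memory:"):
    # Hypothetical checkpoint table: one row per workflow.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoint ("
        "workflow_id TEXT PRIMARY KEY, active_node TEXT, variables TEXT)"
    )
    return conn


def save_checkpoint(conn, workflow_id, active_node, variables):
    # Step 3: persist all variables and the new active node in one transaction.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO checkpoint VALUES (?, ?, ?)",
            (workflow_id, active_node, json.dumps(variables)),
        )


def load_checkpoint(conn, workflow_id):
    row = conn.execute(
        "SELECT active_node, variables FROM checkpoint WHERE workflow_id = ?",
        (workflow_id,),
    ).fetchone()
    if row is None:
        return None
    return row[0], json.loads(row[1])


def run_workflow(conn, workflow_id, nodes, start_node):
    """Run a workflow, resuming from the last checkpoint if one exists.

    `nodes` maps a node name to a function(variables) -> next node name or None.
    """
    restored = load_checkpoint(conn, workflow_id)
    active, variables = restored if restored else (start_node, {})
    while active is not None:
        next_node = nodes[active](variables)  # execute the active node (assumed)
        save_checkpoint(conn, workflow_id, next_node, variables)  # step 3
        active = next_node  # step 4: returned node becomes active, start over
    return variables
```

Because the checkpoint is written before the returned node becomes active, a restart after a crash repeats at most the node that was running when the crash happened.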

Experiments have shown that the workflow execution service is idle most of the time during execution. Therefore, we have implemented this service as a multithreaded service: the main thread handles all database requests, including scanning the database for new workflows to execute. If a new request is found and the overall system load is not too high, a child thread is created that handles this workflow. The main advantage of this approach is that we save connections to the database, since only the main thread needs to have an active database connection. This is beneficial since each open connection slows down the database server (see, e.g., Huddleston et al., 2006).
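The threading scheme can be sketched as follows. This is a hedged illustration under several assumptions, not the service itself: the `workflow` table and its columns, the function names `execute_workflow` and `dispatch`, and the placeholder work are hypothetical, and the number of running child threads stands in for "overall system load". Only the main thread touches the database connection; child threads hand their results back through a queue.

```python
import queue
import sqlite3
import threading


def execute_workflow(workflow_id, payload, results):
    # Child thread: no database access, it only reports its result.
    results.put((workflow_id, payload.upper()))  # placeholder for real work


def dispatch(conn, max_active=2):
    """Main thread: the only code that uses the database connection."""
    results = queue.Queue()
    active = {}  # workflow id -> running child thread
    while True:
        free = max_active - len(active)  # assumed proxy for system load
        rows = []
        if free > 0:
            rows = conn.execute(
                "SELECT id, payload FROM workflow WHERE status = 'new' LIMIT ?",
                (free,),
            ).fetchall()
        for wf_id, payload in rows:
            with conn:
                conn.execute(
                    "UPDATE workflow SET status = 'running' WHERE id = ?",
                    (wf_id,),
                )
            thread = threading.Thread(
                target=execute_workflow, args=(wf_id, payload, results)
            )
            active[wf_id] = thread
            thread.start()
        if not active:
            return  # nothing running and nothing new; a service would sleep and poll
        wf_id, output = results.get()  # block until one child finishes
        active.pop(wf_id).join()
        with conn:
            conn.execute(
                "UPDATE workflow SET status = 'done', output = ? WHERE id = ?",
                (output, wf_id),
            )
```

In this sketch the dispatcher returns once no work is left, which keeps the example finite; a long-running service would wait and scan again instead.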

5.7 Related Work

As mentioned in the previous sections, most systems that are used for building distributed computing systems or computational grids have evolved into quite complex software frameworks. In Table 5.7.1 we list some exemplary Grid projects that are currently (2007/2008) active and have been studied by us [21]. These projects fall into roughly three categories:

1. e-Infrastructure: This refers to a research environment in which a researcher has shared access to scientific facilities (such as data, computing or sensors), regardless of their type and location in the world.

2. Middleware: A middleware connects software applications, enabling the exchange of data. Thus, it organizes and integrates the resources in a Grid. One of its main purposes is to automate the required machine-to-machine negotiations, such as negotiating the exchange of resources on behalf of Grid users and resource providers. It also provides the core foundation (basic services) for grid applications, including areas such as security, resource management, information services and data management.

3. Application: These are projects within the context of specific scientific fields that are devoted to exploring and harnessing grid technology. Depending on their state (fully operational vs. experimental), these projects are called applications or testbeds.

Since there exists neither the exemplary Grid system nor a standard for Grid systems (although systems like Globus seem to be becoming the de facto standard), we will neither describe any system in great detail nor give a detailed comparison to our system [22]. Rather, we use the table to illustrate two things: (a) there is a need for distributed computing systems in many application areas and (b) there is a large variety of systems with different scales, aims, approaches and technological foundations. However, as far as our analyses have shown, none of them manages to provide a powerful and at the same time easy

[21] Another, even more exhaustive list can be found in (GridInfoware, 2008).

[22] For a detailed comparison of four large Grid systems see, e.g., (Asadzadeh et al., 2006) and references therein.
