08.02.2013 Views

New Statistical Algorithms for the Analysis of Mass - FU Berlin, FB MI ...

CHAPTER 5. COMPUTER SCIENCE GRID STRATEGIES

2. Pipeline oriented: A large workflow consisting of many different and dependent tasks is created, and for each step different jobs are submitted to the queue.

According to the definition given by the Workflow Management Coalition, a workflow is: “The automation of a business process, in whole or part, during which documents, information or tasks are passed from one participant to another for action, according to a set of procedural rules” (WfMC, 2007). A business process is the set of procedures required for obtaining a given result; a workflow is therefore the automation of a set of operations that yields a given result, with information exchanged among the involved entities according to defined procedural rules. The operations that make up a workflow are called activities; an activity is a part of the entire work and represents a logical step in the process.

In the data-oriented case the problem reduces to how to split up the data; the pipeline-oriented case is considerably more difficult. Here, the workflow must be defined, implemented, and executed. The issue of executing complex computational tasks in a given order in a Grid environment has even been discussed by the “Grid Computing Environment Working Group” in the Global Grid Forum (Hugh P. Bivens, 2001).
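The contrast between the two cases can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names and the task names in `pipeline` are hypothetical, and the standard-library `graphlib` module (Python 3.9 or later) merely stands in for whatever ordering mechanism a real workflow manager would use. The first function splits data into independent chunks; the second groups dependent tasks into "waves" whose jobs could be submitted to the queue together.

```python
from graphlib import TopologicalSorter


def split_data(items, n_jobs):
    """Data-oriented case: cut the input into n_jobs near-equal, independent chunks."""
    size, extra = divmod(len(items), n_jobs)
    chunks, start = [], 0
    for i in range(n_jobs):
        # The first `extra` chunks receive one additional item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def submission_waves(dependencies):
    """Pipeline-oriented case: group tasks into waves that may be queued together.

    `dependencies` maps each task to the set of tasks it must wait for.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the tasks depend on each other cyclically
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose predecessors are all done
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical task names, for illustration only.
pipeline = {
    "preprocess": set(),
    "detect_peaks": {"preprocess"},
    "align": {"preprocess"},
    "classify": {"detect_peaks", "align"},
}
```

Calling `split_data(list(range(10)), 3)` yields three chunks that can run in any order, whereas `submission_waves(pipeline)` must respect the dependencies: `preprocess` first, then `align` and `detect_peaks` side by side, then `classify`.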

To run a workflow on a computational Grid it is necessary to define a language that can be interpreted and manipulated automatically by a management system. This language should allow the definition of a set of activities, their relations, the involved entities (applications, data resources, etc.) and some criteria for determining the start and end of the processes.
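As a rough illustration of such a description language, the following Python sketch uses a small JSON document. The schema (the keys `activities`, `application`, `inputs` and `after`) and all activity names are assumptions made for this example; they do not correspond to the format of any of the workflow managers mentioned in this chapter. The sketch checks that every relation refers to a known activity and derives the start and end of the process from the relations.

```python
import json

# Hypothetical workflow description; the schema is an assumption for illustration.
WORKFLOW_TEXT = """
{
  "activities": {
    "fetch":   {"application": "fetch_tool",   "inputs": ["raw_data"], "after": []},
    "analyse": {"application": "analyse_tool", "inputs": [],           "after": ["fetch"]},
    "report":  {"application": "report_tool",  "inputs": [],           "after": ["analyse"]}
  }
}
"""


def load_workflow(text):
    """Parse a workflow description and check that every relation is resolvable."""
    activities = json.loads(text)["activities"]
    for name, spec in activities.items():
        for predecessor in spec["after"]:
            if predecessor not in activities:
                raise ValueError(
                    f"{name!r} depends on unknown activity {predecessor!r}"
                )
    return activities


def start_and_end(activities):
    """Start activities wait for nothing; end activities are awaited by nothing."""
    awaited = {p for spec in activities.values() for p in spec["after"]}
    starts = sorted(n for n, spec in activities.items() if not spec["after"])
    ends = sorted(n for n in activities if n not in awaited)
    return starts, ends
```

With the description above, `start_and_end(load_workflow(WORKFLOW_TEXT))` identifies `fetch` as the only start activity and `report` as the only end activity. A real management system would additionally need to resolve the named applications and data resources on the Grid, which this sketch leaves out.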

There are many workflow managers available that fulfill these tasks (such as Nimrod, Triana, Taverna, Pegasus, Proteus), but they need to be installed on top of an existing Grid platform and are usually not part of it (a notable exception is the myGrid project (Stevens et al., 2003)).

5.1.3 Problems of Today's Grid Systems

Nowadays most Grid systems are run by large research organizations, companies or governmental institutions such as NASA (Information Power Grid), the US Department of Energy together with IBM (Science Grid) or the European Union (EGEE). These organizations have dedicated staff who set up, configure, administer and manage these projects. This is still far from trivial, which makes these administrators vital to the task. As in the case of the Internet boom in the late 1990s, what is really needed to fulfill the “everyone can use the Grid” vision is to:

• Significantly reduce the complexity of installing and maintaining a Grid system.

• Provide easy-to-use client software that can compute jobs on Grid resources.

• Allow for easy participation in a Grid system, either as a user who wants to compute a large problem or as a provider of resources; this includes things such as registration and security models.

• Merge the needed components, such as the Grid platform, workflow manager and scheduler, into one consistent product.

• Allow existing algorithms to be Grid-enabled easily.
