New Statistical Algorithms for the Analysis of Mass - FU Berlin, FB MI ...
110 CHAPTER 5. COMPUTER SCIENCE GRID STRATEGIES
2. Pipeline oriented: A large workflow consisting of many different and
dependent tasks is created, and for each step different jobs are submitted
to the queue.
According to the definition given by the Workflow Management Coalition, a
workflow is: “The automation of a business process, in whole or part, during
which documents, information or tasks are passed from one participant to
another for action, according to a set of procedural rules” (WfMC, 2007).
A business process is the set of procedures required to obtain a given
result. A workflow is therefore the automation of a set of operations that
produces a given result, with information exchanged among the involved
entities according to defined procedural rules. The operations that make up
a workflow are called activities; an activity is a part of the entire work and
represents a logical step in the process.
Because the data-oriented case can be reduced to the question of how to split
up the data, the pipeline-oriented case is considerably more difficult. Here, the
workflow must be defined, implemented, and executed. The issue of executing
complex computational tasks in a given order in a Grid environment has
even been discussed by the “Grid Computing Environment Working Group”
within the Global Grid Forum (Hugh P. Bivens, 2001).
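To illustrate why the data-oriented case is the simpler one, the following minimal sketch splits an input list into near-equal chunks, one per job. The function name `split_into_chunks` and the record names are hypothetical and chosen only for illustration; they are not taken from the original work.

```python
def split_into_chunks(items, n_jobs):
    """Split items into at most n_jobs contiguous, near-equal chunks."""
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    size, extra = divmod(len(items), n_jobs)
    chunks = []
    start = 0
    for i in range(n_jobs):
        # The first `extra` chunks receive one additional item.
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when there are few items
            chunks.append(items[start:end])
        start = end
    return chunks


# Hypothetical input: ten independent records to be processed by three jobs.
records = [f"record_{i}" for i in range(10)]
jobs = split_into_chunks(records, 3)
```

Each chunk can be submitted as an independent job because no chunk depends on the result of another. The pipeline-oriented case lacks this independence, which is why it additionally needs an explicit workflow.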
To run a workflow on a Computational Grid it is necessary to define a
language that a management system can interpret and manipulate
automatically. This language should allow the definition of a set of
activities, their relations, the involved entities (e.g. applications, data
resources, etc.) and criteria for determining the start and end of the processes.
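The ingredients listed above (activities, their relations, the involved entities, and start and end criteria) can be sketched as a small data structure. This is only an illustrative assumption of what such a description might look like, not the language of any particular workflow manager; the `Activity` class, the helper functions, and the script and file names are hypothetical. The ordering uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Activity:
    """One logical step of a workflow (hypothetical structure)."""
    name: str
    application: str                                 # entity that performs the step
    inputs: list = field(default_factory=list)       # data resources consumed
    depends_on: list = field(default_factory=list)   # relations to other activities


def execution_order(activities):
    """Return activity names in an order that respects all dependencies."""
    graph = {a.name: set(a.depends_on) for a in activities}
    # static_order() raises graphlib.CycleError if the relations are cyclic.
    return list(TopologicalSorter(graph).static_order())


def start_activities(activities):
    """Start criterion: activities that depend on nothing."""
    return [a.name for a in activities if not a.depends_on]


def end_activities(activities):
    """End criterion: activities that no other activity depends on."""
    needed = {dep for a in activities for dep in a.depends_on}
    return [a.name for a in activities if a.name not in needed]


# Hypothetical three-step pipeline.
workflow = [
    Activity("preprocess", "preprocess.sh", inputs=["raw.dat"]),
    Activity("analyze", "analyze.sh", depends_on=["preprocess"]),
    Activity("report", "report.sh", depends_on=["analyze"]),
]

print(execution_order(workflow))  # ['preprocess', 'analyze', 'report']
```

A management system built on such a description would submit each activity as a job only after all activities it depends on have finished.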
Many workflow managers that fulfill these tasks are available (such as
Nimrod, Triana, Taverna, Pegasus, Proteus), but they need to be
installed on top of an existing Grid platform and are usually not part of it
(a notable exception is the myGrid project (Stevens et al., 2003)).
5.1.3 Problems of Today's Grid Systems
Nowadays most Grid systems are run by large research organizations, companies
or governmental institutions such as NASA (Information Power Grid), the
US Department of Energy together with IBM (Science Grid) or the European
Union (EGEE). These organizations have dedicated staff who set up, configure,
administer and manage these projects. This is still far from trivial,
which makes these administrators vital to the task. As with the Internet
boom in the late 1990s, what is really needed to fulfill the “everyone can
use the Grid” vision is to:
• Significantly reduce the complexity of installing and maintaining a Grid
system.
• Provide easy-to-use client software that can compute jobs on Grid resources.
• Allow for easy participation in a Grid system, either as a user who wants
to compute a large problem or as a provider of resources; this includes
aspects such as registration and security models.
• Merge the needed components, such as the Grid platform, workflow manager
and scheduler, into one consistent product.
• Allow existing algorithms to be Grid-enabled easily.