08.02.2013 Views

New Statistical Algorithms for the Analysis of Mass - FU Berlin, FB MI ...


5.4. QAD GRID WORKER

Figure 5.4.9: Time needed for the OS to cycle through the active thread (yield) vs. number of running threads. This shows that there exists an almost linear relationship. Data was created on a 2.3 GHz single-core machine (2 GB RAM) running Debian Linux and Windows Vista, respectively.

5.4.4 Worker Migration

So far, we have presented a system that collects and manages jobs, which are then requested by workers running on client machines that perform the computation. At the moment a new job becomes available, it is easy for a client machine to decide whether to request or skip it, given that a worker knows the current system state (e.g. CPU utilization) of its host computer. This concept is known as static scheduling, an early approach being the Linda system (Carriero and Gelernter, 1986).
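The request-or-skip decision described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `should_request_job` and the threshold `max_utilization` are hypothetical and are not taken from the QAD Grid implementation.

```python
def should_request_job(cpu_utilization: float, max_utilization: float = 0.5) -> bool:
    """Static scheduling decision, taken once when a job becomes available.

    cpu_utilization is the host's current load as a fraction in [0, 1].
    max_utilization is a hypothetical threshold, not a QAD Grid setting.
    """
    if not 0.0 <= cpu_utilization <= 1.0:
        raise ValueError("cpu_utilization must be a fraction in [0, 1]")
    # The decision uses only the state at this instant and is never revisited.
    return cpu_utilization <= max_utilization
```

Because the check happens only once, at the moment the job is announced, the worker has no way to react if the host becomes busy later on.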

The problem is that these conditions can (and mostly will) change over time. Static scheduling might therefore be a good idea within a dedicated cluster environment, but as soon as users’ workstations are integrated, the following obvious problem arises: A user, say Alice, integrates her desktop workstation into the Grid and goes away for some time. Another user, say Bob, creates a job in the Grid system that is then requested by the worker running on Alice’s machine. The trouble comes if Alice returns to her workstation before Bob’s job has finished. If static scheduling is used, the options are limited: (a) we can allow Bob’s job to finish, which will make Alice unhappy because her system will be slow, or (b) we can terminate Bob’s job, which will make Bob unhappy because he loses the work already done.

In the QAD Grid system it is of primary importance that the owner of a workstation does not pay a penalty for integrating his or her machine into the Grid, that is, for running a worker on it. Workers must therefore be able to vacate a workstation (or at least pause computation) if a user starts working on that machine and/or wants the worker to quit. In the remainder of this section we describe a dynamic scheduling approach that is able to satisfy both Alice and Bob. The key idea is that when Alice returns and starts using her machine, Bob’s job is paused, transferred to another workstation that is idle at that moment, and continued from the point at which it was stopped. This concept also holds for multi-core systems, where the load of one of the cores might get too high.
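The pause, transfer and continue idea can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the job type `SumJob`, the helpers `run`, `freeze` and `thaw`, and the use of JSON as the transfer format are hypothetical and do not describe how QAD Grid actually serializes worker state.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class SumJob:
    """Hypothetical job: sums the integers 0..n-1 one step at a time."""
    n: int
    next_i: int = 0   # next integer to add; this is the resume point
    total: int = 0    # partial result computed so far

    def step(self) -> None:
        self.total += self.next_i
        self.next_i += 1


def run(job: SumJob, owner_is_active) -> bool:
    """Work until the job is done or the machine's owner becomes active.

    Returns True if the job finished, False if it was paused.
    """
    while job.next_i < job.n:
        if owner_is_active():
            return False
        job.step()
    return True


def freeze(job: SumJob) -> str:
    """Serialize the job state so it can be sent to another workstation."""
    return json.dumps(asdict(job))


def thaw(blob: str) -> SumJob:
    """Rebuild the job from its serialized state on the new workstation."""
    return SumJob(**json.loads(blob))
```

In this sketch, a worker on Alice's machine would call `run` with a callback that reports her activity. When `run` returns False, the worker passes `freeze(job)` to an idle workstation, which calls `thaw` and then `run` again, so Bob's job continues from `next_i` instead of starting over.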

Worker migration (also known as process migration) is the process of freezing a worker’s state, sending that state along with the worker to another idle workstation, and starting that worker at that workstation initialized with the restored
