05.08.2014 Views

here - Stefan-Marr.de


2. Context and Motivation

driven to data-driven computation. However, control-driven programming, i.e., imperative programming, remains important.

Fork/Join utilizes the inherent parallelism in data-oriented problems by using recursion to divide the computation into steps that can be processed in parallel. It thereby abstracts from the concrete data dependencies by using recursive problem decomposition and relying on explicit synchronization points when the result of a subproblem is required. While it is itself a control-driven approach, relying on control-flow-based primitives, it is typically used for data-parallel problems. However, it leaves it to the programmer to align the program with its data dependencies.

Cilk [Blumofe et al., 1995] introduced fork/join as a novel combination of the classic recursive divide-and-conquer style of programming with an efficient scheduling technique for parallel execution. Nowadays, it is widely known as fork/join and available, e.g., for Java [Lea, 2000] and for C/C++ with libraries such as Intel's Threading Building Blocks (footnote 13). The primitives of this parallel programming model are spawn, i.e., the fork operation, which results in a sub-computation that possibly executes in parallel, and sync, i.e., the join operation, which blocks until the corresponding sub-computation is finished. Fork/join is a model for parallel programming in shared-memory environments. It enables developers to apply divide-and-conquer in a parallel setting; however, it does not provide mechanisms to handle, for instance, concurrent accesses to global variables. Such mechanisms have been proposed [Frigo et al., 2009], but the original minimal model focuses on the aspect of parallel execution.
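The spawn and sync primitives described above can be illustrated with a minimal sketch that uses Java's `java.util.concurrent` fork/join framework. The class name `ForkJoinSum`, the helper `parallelSum`, and the `THRESHOLD` cutoff are hypothetical names and values chosen for illustration; they do not come from Cilk or from the cited work.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example: sums an array by recursive divide-and-conquer.
class ForkJoinSum extends RecursiveTask<Long> {
    // Assumed cutoff below which a range is summed sequentially.
    private static final int THRESHOLD = 1_000;

    private final long[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    ForkJoinSum(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = from + (to - from) / 2;
        ForkJoinSum left = new ForkJoinSum(data, from, mid);
        ForkJoinSum right = new ForkJoinSum(data, mid, to);
        left.fork();                     // spawn: may run in parallel
        long rightSum = right.compute(); // continue with the other half
        long leftSum = left.join();      // sync: wait for the spawned half
        return leftSum + rightSum;
    }

    static long parallelSum(long[] data) {
        return ForkJoinPool.commonPool()
                           .invoke(new ForkJoinSum(data, 0, data.length));
    }

    public static void main(String[] args) {
        long[] data = new long[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(parallelSum(data)); // prints 50005000
    }
}
```

Each task only reads its own slice of the array and returns its result through `join`, so the sketch needs no additional synchronization. Shared mutable state, such as a global counter, would not be protected by the model itself.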

With work-stealing, Cilk also pioneered an efficient scheduling technique that makes parallel divide-and-conquer algorithms practical for situations in which a static schedule leads to significant load imbalances and thus suboptimal performance.
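As a rough illustration of such an irregular workload, the following sketch computes Fibonacci numbers recursively on a Java `ForkJoinPool`, whose scheduler is based on work-stealing. The two subproblems of each step differ in size, so a static split would be unbalanced. The class name `StealingFib`, the helper `fib`, and the pool size of four workers are assumptions made for this example; the naive recursion is chosen for its shape, not for efficiency.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example: naive Fibonacci with unevenly sized subproblems.
class StealingFib extends RecursiveTask<Long> {
    private final int n;

    StealingFib(int n) {
        this.n = n;
    }

    @Override
    protected Long compute() {
        if (n < 2) {
            return (long) n;
        }
        StealingFib larger = new StealingFib(n - 1);
        larger.fork(); // queued locally; idle workers may steal it
        long smaller = new StealingFib(n - 2).compute();
        return larger.join() + smaller;
    }

    static long fib(ForkJoinPool pool, int n) {
        return pool.invoke(new StealingFib(n));
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool(4); // assumed worker count
        System.out.println(fib(pool, 25));       // prints 75025
        // getStealCount() is only an estimate and varies between runs.
        System.out.println("steals (estimate): " + pool.getStealCount());
        pool.shutdown();
    }
}
```

The reported steal count depends on timing and on the machine, which is why the sketch prints it without stating an expected value.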

MapReduce. Functional programming languages have introduced the notion of mapping a function on a sequence of values to produce a result sequence, which then can be reduced to some result value with another function.
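The notion of mapping and then reducing can be sketched on a single machine with Java streams. This is only an illustration of the functional idea, not of a distributed MapReduce system; the class name `MapThenReduce` and the method `sumOfSquares` are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example: map a function over values, then reduce the results.
class MapThenReduce {
    static int sumOfSquares(List<Integer> values) {
        return values.stream()
                     .map(v -> v * v)         // map: square each value
                     .reduce(0, Integer::sum); // reduce: add up the squares
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(Arrays.asList(1, 2, 3, 4))); // prints 30
    }
}
```

Because the mapped function has no side effects and addition is associative, the elements could be processed in any order or in separate partitions.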

Based on this simple notion, distributed processing of data has become popular [Lämmel, 2008]. For companies like Google, Microsoft, or Yahoo, processing of large amounts of data became a performance challenge that required the use of large clusters and resilient programming models. The model

13 http://threadingbuildingblocks.org/
