here - Stefan-Marr.de
2. Context and Motivation
driven to data-driven computation. However, control-driven programming,
i.e., imperative programming, remains important.

Fork/Join utilizes the inherent parallelism in data-oriented problems by using
recursion to divide the computation into steps that can be processed in
parallel. It thereby abstracts from the concrete data dependencies by
using recursive problem decomposition and relying on explicit synchronization
points when the result of a subproblem is required. While it is itself a
control-driven approach, relying on control-flow-based primitives, it is typically
used for data-parallel problems. However, it leaves it to the programmer
to align the program with its data dependencies.

Cilk [Blumofe et al., 1995] introduced fork/join as a novel combination of
the classic recursive divide-and-conquer style of programming with an efficient
scheduling technique for parallel execution. Nowadays, it is widely
known as fork/join and available, e.g., for Java [Lea, 2000] and for C/C++ with
libraries such as Intel's Threading Building Blocks (see footnote 13). The
primitives of this parallel programming model are spawn, i.e., the fork
operation, which results in a sub-computation that possibly executes in
parallel, and sync, i.e., the join operation, which blocks until the
corresponding sub-computation is finished. Fork/join is a model for parallel
programming in shared-memory environments. It enables developers to apply
divide-and-conquer in a parallel setting; however, it does not provide
mechanisms to handle, for instance, concurrent access to global variables.
Such mechanisms have been proposed [Frigo et al., 2009], but the original
minimal model focuses on the aspect of parallel execution.
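
The spawn and sync primitives described above can be illustrated with a minimal sketch based on the fork/join framework in Java's standard library (java.util.concurrent). The class name ForkJoinSum, the helper sum, and the THRESHOLD value are assumptions chosen for illustration and do not come from the cited works. The sketch sums an array by recursive decomposition: fork() plays the role of spawn and join() the role of sync.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example class: sums a range of an array with fork/join.
class ForkJoinSum extends RecursiveTask<Long> {
    // Assumed cutoff: ranges of this size or smaller are summed sequentially.
    private static final int THRESHOLD = 1_000;

    private final long[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    ForkJoinSum(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        // Base case of the recursion: small enough to process directly.
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        // Divide: split the range into two subproblems.
        int mid = from + (to - from) / 2;
        ForkJoinSum left = new ForkJoinSum(data, from, mid);
        ForkJoinSum right = new ForkJoinSum(data, mid, to);

        left.fork();                     // spawn: may run in parallel
        long rightSum = right.compute(); // process the other half in this thread
        long leftSum = left.join();      // sync: wait for the spawned subproblem
        return leftSum + rightSum;       // conquer: combine the partial results
    }

    // Convenience entry point (hypothetical helper) using the common pool.
    static long sum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new ForkJoinSum(data, 0, data.length));
    }

    public static void main(String[] args) {
        long[] data = new long[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(sum(data)); // prints 50005000
    }
}
```

Note that the subtasks only read disjoint parts of the array and share no mutable state. This matches the minimal model, which offers no mechanism of its own for coordinating access to shared variables.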

With work-stealing, Cilk also pioneered an efficient scheduling technique
that makes parallel divide-and-conquer algorithms practical for situations in
which a static schedule leads to significant load imbalances and thus suboptimal
performance.

MapReduce. Functional programming languages have introduced the notion
of mapping a function over a sequence of values to produce a result sequence,
which can then be reduced to a single result value with another function.
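
The functional notion of map and reduce can be sketched on a single machine with Java streams. This is only an illustration of the two steps, not of a distributed MapReduce system. The class MapReduceSketch and its methods map, reduce, and sumOfSquares are hypothetical names introduced here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical example class illustrating map followed by reduce.
class MapReduceSketch {
    // Map step: apply f to every value, producing a result sequence.
    static <A, B> List<B> map(Function<A, B> f, List<A> values) {
        return values.stream().map(f).collect(Collectors.toList());
    }

    // Reduce step: fold the sequence into one value with a combining function.
    static <B> B reduce(BinaryOperator<B> combine, B identity, List<B> values) {
        return values.stream().reduce(identity, combine);
    }

    // Both steps composed: square each value, then add up the squares.
    static int sumOfSquares(List<Integer> values) {
        List<Integer> squares = map(x -> x * x, values);
        return reduce(Integer::sum, 0, squares);
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4);
        List<Integer> squares = map(x -> x * x, input);
        System.out.println(squares);             // prints [1, 4, 9, 16]
        System.out.println(sumOfSquares(input)); // prints 30
    }
}
```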
Based on this simple notion, distributed processing of data has become
popular [Lämmel, 2008]. For companies like Google, Microsoft, or Yahoo, processing
large amounts of data became a performance challenge that required
the use of large clusters and resilient programming models. The model
13 http://threadingbuildingblocks.org/