13.11.2016 Views

OpenMP Application Programming Interface Examples


CHAPTER 1

Parallel Execution


A single thread, the initial thread, begins sequential execution of an OpenMP-enabled program, as if the whole program were in an implicit parallel region consisting of an implicit task executed by the initial thread.
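As a small illustration of the implicit region described above, the following sketch queries the runtime from sequential code. The struct `seq_info` and the function `query_sequential_part` are hypothetical names introduced here; the serial stand-ins under `#else` are an assumption so that the sketch also builds when OpenMP is not enabled.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins so the sketch also builds without OpenMP enabled. */
static int omp_get_thread_num(void)  { return 0; }
static int omp_get_num_threads(void) { return 1; }
static int omp_in_parallel(void)     { return 0; }
#endif

/* Hypothetical record of what the runtime reports in sequential code. */
struct seq_info {
    int thread_num;   /* number of the calling thread         */
    int num_threads;  /* size of the current team             */
    int in_parallel;  /* nonzero inside an active region      */
};

/* Intended to be called from the sequential part of the program,
   that is, outside any parallel construct. */
struct seq_info query_sequential_part(void)
{
    struct seq_info info;
    info.thread_num  = omp_get_thread_num();
    info.num_threads = omp_get_num_threads();
    info.in_parallel = omp_in_parallel();
    return info;
}
```

Called from `main` before any parallel construct, the function should report thread number 0, a team of one thread, and no active parallel region.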

A parallel construct encloses code, forming a parallel region. An initial thread encountering a parallel region forks (creates) a team of threads at the beginning of the parallel region, and joins them (removes them from execution) at the end of the region. The initial thread becomes the master thread of the team in a parallel region, with a thread number equal to zero; the other threads are numbered from 1 to the number of threads minus 1. A team may consist of just a single thread.

Each thread of a team is assigned an implicit task consisting of code within the parallel region. The task that creates a parallel region is suspended while the tasks of the team are executed. A thread is tied to its task; that is, only the thread assigned to the task can execute that task. After completion of the parallel region, the master thread resumes execution of the generating task.
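The fork-join behavior and thread numbering described above might be sketched as follows. The function `run_team` and the cap `MAX_TEAM` are hypothetical, chosen only for this illustration; the serial stand-ins under `#else` are an assumption so that the sketch also builds when OpenMP is not enabled.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins so the sketch also builds without OpenMP enabled. */
static int omp_get_thread_num(void)  { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

#define MAX_TEAM 256  /* hypothetical cap on the thread numbers recorded */

/* Forks a team, marks seen[i] = 1 for every thread number i that took
   part (up to MAX_TEAM), and returns the team size. */
int run_team(int seen[MAX_TEAM])
{
    int team_size = 0;
    for (int i = 0; i < MAX_TEAM; i++)
        seen[i] = 0;

    #pragma omp parallel            /* fork: a team of threads is created */
    {
        /* Each thread runs this block as its implicit task. */
        int tid = omp_get_thread_num();
        if (tid < MAX_TEAM)
            seen[tid] = 1;          /* each thread writes its own slot */
        if (tid == 0)               /* thread 0 is the master thread */
            team_size = omp_get_num_threads();
    }                               /* join: the team ends here */

    /* Only the thread that encountered the construct continues. */
    return team_size;
}
```

After the call, the entries `seen[0]` through `seen[team_size - 1]` should be set, reflecting the numbering from zero to the number of threads minus 1. The team size itself depends on the implementation and the environment.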

Any task within a parallel region is allowed to encounter another parallel region to form a nested parallel region. The parallelism of a nested parallel region (whether it forks additional threads or is executed serially by the encountering task) can be controlled by the OMP_NESTED environment variable or the omp_set_nested() API routine, with arguments indicating true or false.
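A minimal sketch of the nesting control described above, assuming an implementation that still provides omp_set_nested() (the routine is deprecated as of OpenMP 5.0). The function `count_inner_executions` is a hypothetical name, the team sizes of 2 are arbitrary, and the serial stand-in under `#else` is an assumption so that the sketch also builds when OpenMP is not enabled.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-in so the sketch also builds without OpenMP enabled. */
static void omp_set_nested(int flag) { (void)flag; }
#endif

/* Runs a parallel region nested inside another one and returns how
   many times the innermost block was executed in total. */
int count_inner_executions(int enable_nested)
{
    int executions = 0;

    omp_set_nested(enable_nested);  /* nonzero enables nested parallelism */

    #pragma omp parallel num_threads(2)      /* outer region */
    {
        /* Each outer thread encounters the inner construct. With
           nesting disabled, the inner team has a single thread. */
        #pragma omp parallel num_threads(2)  /* nested region */
        {
            #pragma omp atomic
            executions++;
        }
    }
    return executions;
}
```

With nesting disabled, the count cannot exceed the size of the outer team (at most 2 here). With nesting enabled, it can reach 4, although the runtime may still provide fewer threads than requested.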

The number of threads of a parallel region can be set by the OMP_NUM_THREADS environment variable, the omp_set_num_threads() routine, or the num_threads clause on the parallel directive. The routine overrides the environment variable, and the clause overrides both. Use the OMP_DYNAMIC environment variable or the omp_set_dynamic() routine to specify whether the OpenMP implementation may dynamically adjust the number of threads for parallel regions. The default setting for dynamic adjustment is implementation defined. When dynamic adjustment is on and the number of threads is specified, that number becomes an upper limit on the number of threads provided by the OpenMP runtime.
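The precedence between the routine and the clause, and the effect of dynamic adjustment, might be explored with a sketch like the one below. The functions `team_size_from_routine` and `team_size_from_clause` are hypothetical names introduced for illustration; the serial stand-ins under `#else` are an assumption so that the sketch also builds when OpenMP is not enabled.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins so the sketch also builds without OpenMP enabled. */
static void omp_set_dynamic(int flag)  { (void)flag; }
static void omp_set_num_threads(int n) { (void)n; }
static int  omp_get_num_threads(void)  { return 1; }
#endif

/* Requests a team size through the routine only and returns the size
   of the team that the runtime actually provided. */
int team_size_from_routine(int dynamic, int routine_request)
{
    int size = 0;
    omp_set_dynamic(dynamic);              /* nonzero allows adjustment */
    omp_set_num_threads(routine_request);  /* overrides OMP_NUM_THREADS */

    #pragma omp parallel
    {
        #pragma omp single
        size = omp_get_num_threads();
    }
    return size;
}

/* Same as above, but the num_threads clause overrides the routine. */
int team_size_from_clause(int dynamic, int routine_request,
                          int clause_request)
{
    int size = 0;
    omp_set_dynamic(dynamic);
    omp_set_num_threads(routine_request);

    #pragma omp parallel num_threads(clause_request)
    {
        #pragma omp single
        size = omp_get_num_threads();
    }
    return size;
}
```

For example, `team_size_from_clause(0, 2, 3)` requests 3 threads through the clause despite the routine asking for 2. With dynamic adjustment on, as in `team_size_from_clause(1, 4, 2)`, the clause value acts as an upper limit, so the result may be 1 or 2. In every case the runtime may provide fewer threads than requested, for instance when a thread limit is reached.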
