are difficult (if not impossible) to statically predict—in part because they often depend on run-time inputs and behavior—that makes it extremely difficult for the compiler to statically prove whether or not potential threads are independent. Given the size and complexity of real non-numeric programs, parallelization appears to be an unrealistic goal if we stick to the parallel threads model. For such applications, we can use a different thread control flow model called the sequential threads model. This model is closer to sequential control flow and envisions a strict sequential ordering among the threads. That is, threads are extracted from sequential code and run in parallel without violating the sequential program semantics. The control flow of the sequential code imposes an order on the threads and, therefore, we can use the terms predecessor and successor to qualify the relation between any given pair of threads. This means that inter-thread communication between any two threads (if any) is strictly in one direction, as dictated by the sequential thread ordering. Thus, no explicit synchronization operations are necessary, as the sequential semantics of the threads guarantee proper synchronization. This relaxation allows us to "parallelize" non-numeric applications into threads without explicit synchronization, even if there is a potential inter-thread data dependence. Program correctness will not be violated if at run time there is a true data dependence between two threads. The purpose of identifying threads in such a model is to indicate that those threads are good candidates for parallel execution.
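
The following is a minimal software sketch of this ordering discipline, assuming C11 atomics over pthreads; all names here (`totals`, `ready`, the accumulation loop) are illustrative, and in a real SpMT machine the hardware itself performs the value forwarding and stalls the successor, so no such code appears in the program. A sequential accumulation loop is split into ordered threads; each thread receives the running total from its predecessor, extends it over its own chunk, and forwards it to its successor, so communication flows strictly from predecessor to successor.

```c
/* Hypothetical software analogy of the sequential threads model.
 * Real SpMT hardware performs this forwarding and ordering itself. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define NTHREADS 4
#define CHUNK    1000L

static long totals[NTHREADS];        /* value forwarded to the successor */
static atomic_int ready[NTHREADS];   /* "predecessor has produced" flags */

static void *thread_body(void *arg) {
    int id = (int)(long)arg;
    long sum = 0;
    if (id > 0) {
        /* One-directional communication: wait on the predecessor only. */
        while (!atomic_load(&ready[id - 1]))
            ;                        /* spin; hardware would simply stall */
        sum = totals[id - 1];
    }
    for (long i = id * CHUNK; i < (id + 1) * CHUNK; i++)
        sum += i;                    /* this thread's slice of the loop */
    totals[id] = sum;
    atomic_store(&ready[id], 1);     /* forward the value to the successor */
    return NULL;
}

int main(void) {
    pthread_t t[NTHREADS];
    for (long id = 0; id < NTHREADS; id++)
        pthread_create(&t[id], NULL, thread_body, (void *)id);
    for (int id = 0; id < NTHREADS; id++)
        pthread_join(t[id], NULL);
    /* Matches the purely sequential loop, as the thread order dictates. */
    printf("total = %ld\n", totals[NTHREADS - 1]);
    return 0;
}
```

The point is not the parallelism of this particular loop (the forwarding serializes the final additions) but the discipline it illustrates: each thread has exactly one predecessor and one successor, so the sequential ordering alone provides all the synchronization that is needed.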

Examples of multithreading proposals using sequential threads are the multiscalar model [8,9,30], the superthreading model [35], the trace processing model [28,36], and the dynamic multithreading model [1]. When using the sequential threads model, we can have threads that are nonspeculative from the control point of view, as well as threads that are speculative from the control point of view. The latter model is often called speculative multithreading (SpMT). This model is particularly important for dealing with the complex control flow present in typical non-numeric programs. The multiscalar architecture [8,9,30] provided a complete design and evaluation of an SpMT architecture. Since then, many other proposals have extended the basic idea of SpMT [5,19,22,28,31,35,36]. One such extension is threaded multipath execution (TME) [38], in which the speculative threads are the alternate paths of hard-to-predict branches. A simple form of the SpMT model uses loop-based threads only [15,22].

Inter-Thread Communication

Inter-thread communication refers to passing data values between two or more threads. One of the key issues in a parallel programming model is the name level at which sharing takes place between threads. Communication can take place at the level of register space, memory address space, and I/O space, with the registers being the level closest to the processor. If sharing can happen at a particular level, it can also happen at a more distant level. Parallel programming models can be classified into three categories, based on the sharing level that is closest to the processor:

• Shared register model
• Shared memory model
• Message passing model

In the shared register model, multiple threads share the same register space (or a portion of it). Inter-thread communication happens implicitly through reads and writes to the shared registers (and to shared memory locations). This model typically uses fine-grain threads: because it is difficult to have long threads that communicate at the low level of registers, the thread granularity is necessarily small. This class of parallel processors is fairly new and has evolved as an extension of single-threaded ILP processors. Examples are the multiscalar execution model [8,9,30], the trace execution model [28,36], and the dynamic multithreading model (DMT) [1].

In the shared memory model, multiple threads share a common memory address space (or a portion of it). Inter-thread communication occurs implicitly as a result of conventional memory access instructions to shared memory locations. That is, writes to a logically shared address by one thread are visible to reads of the other threads, provided there are no other prior writes to that address, as per the memory consistency/synchronization model.
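
A minimal sketch of this implicit communication, again assuming C11 atomics over pthreads (the variable names are illustrative, not from the source): the producer's ordinary store to a shared location becomes visible to the consumer's load, and the release/acquire pair on the flag plays the role the text assigns to the memory consistency/synchronization model.

```c
/* Hypothetical sketch: implicit inter-thread communication through a
 * logically shared memory address, ordered by a release/acquire pair. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static int shared_value;     /* logically shared address */
static atomic_int flag;      /* stands in for the synchronization model */

static void *producer(void *arg) {
    (void)arg;
    shared_value = 42;       /* conventional store to a shared location */
    atomic_store_explicit(&flag, 1, memory_order_release);
    return NULL;
}

static void *consumer(void *arg) {
    (void)arg;
    while (!atomic_load_explicit(&flag, memory_order_acquire))
        ;                    /* wait until the prior write is visible */
    printf("read %d\n", shared_value);  /* guaranteed to observe 42 */
    return NULL;
}

int main(void) {
    pthread_t p, c;
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}
```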
