15.01.2013 Views

U. Glaeser


architecture, the implementation, and the compilers be specifically designed to support the packaging of multiple independent operations into long instruction words.

Proponents of the VLIW approach rightly contend that VLIW design reduces the control complexity within a processor; however, the corresponding drawbacks are a loss of program portability at the binary level and a lack of flexibility. With regard to portability, the control logic added by a superscalar processor is used to dynamically determine opportunities for parallel execution within a conventional instruction stream. Thus, superscalar processors dynamically schedule parallel execution of the instructions of existing executable program files, whereas recompilation into a static representation of parallel execution is a requirement for programs to run on VLIW processors. With regard to flexibility, superscalar processors can easily respond to dynamic events, such as cache misses. Dynamic events present a difficulty for VLIW designs. For example, early VLIW designs avoided data caches so that memory access time would be a known quantity for use in compiler scheduling.
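The static packing that a VLIW compiler performs can be illustrated with a minimal sketch. It assumes a deliberately simplified model that is not taken from the text: each operation is a hypothetical `(dest, srcs)` pair of register names, every operation takes one cycle, and an operation may never share a word with, or precede, an earlier operation whose registers it reads or writes. The names `conflicts` and `pack` are illustrative only, and real compilers use far more elaborate scheduling.

```python
def conflicts(earlier, later):
    """True if `later` touches a register in a way that orders it after `earlier`."""
    e_dest, e_srcs = earlier
    l_dest, l_srcs = later
    return (
        e_dest in l_srcs      # later reads what earlier writes
        or e_dest == l_dest   # both write the same register
        or l_dest in e_srcs   # later overwrites what earlier reads
    )


def pack(ops, width):
    """Greedily pack ops (in program order) into long words of at most `width` ops.

    Returns a list of words; each word is a list of indices into `ops`.
    """
    words = []    # words[k] holds the op indices issued together in word k
    placed = {}   # op index -> index of the word it was placed in
    for i, op in enumerate(ops):
        # The op must land strictly after every earlier op it conflicts with.
        earliest = 0
        for j in range(i):
            if conflicts(ops[j], op):
                earliest = max(earliest, placed[j] + 1)
        # Take the first word at or after `earliest` that still has a free slot.
        w = earliest
        while w < len(words) and len(words[w]) >= width:
            w += 1
        if w == len(words):
            words.append([])
        words[w].append(i)
        placed[i] = w
    return words


# Hypothetical four-operation program on registers r1..r9.
program = [
    ("r1", ("r2", "r3")),
    ("r4", ("r5", "r6")),
    ("r7", ("r1", "r4")),  # needs the results of the first two operations
    ("r8", ("r9",)),
]
schedule = pack(program, width=2)
```

Because the schedule is fixed before the program runs, any latency the compiler could not predict, such as a cache miss, is not reflected in it, which is the inflexibility described above.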

The recent introduction of EPIC (Explicitly Parallel Instruction Computing) architectures, such as the IA-64 architecture of Intel and Hewlett-Packard, is an attempt to gain the best of both approaches. Explicit dependence information is incorporated into the instruction formats to reduce the control logic complexity, and some scheduling of dynamic behavior is incorporated to provide flexibility.

Instruction-Level Parallelism<br />

Superscalar processors attempt to identify and exploit parallelism in the instruction stream. That is, instructions that are independent should be executed in parallel. We briefly review the concept of dependencies. More details can be found in Mike Johnson’s text on superscalar microprocessor design [1].

Dependencies<br />

Dependencies limit the parallelism between instructions because they must be enforced so that the results of program execution will be correct. Indeed, much of the control logic in a superscalar processor is devoted to identifying dependencies, so that execution will produce the same results as if the instruction stream were being executed on a purely sequential computer. Dependencies fall into three categories: data, control, and structural.

Data Dependencies<br />

Data dependencies exist between two instructions when the order between the two instructions must be maintained for execution to be correct. The most obvious data dependency is the true data dependency (or RAW: read-after-write dependency) in which the result of one instruction is used as an input operand for the second instruction. To preserve correctness, the first instruction must be executed prior to the second. The storage that is used first as a result and then as a source can be either a memory location or a CPU register.

Two other cases arise when the second instruction writes to a storage location. An output dependency (or WAW: write-after-write dependency) occurs when both instructions write to the same storage. To preserve correctness, the result of the second instruction must be the final value of the storage. An antidependency (or WAR: write-after-read dependency) occurs when the first instruction reads an input operand from the storage location that will be written with the result of the second instruction. To preserve correctness, the first instruction must obtain its input operand before that value is overwritten by a new value from the second instruction. Both of these cases are called false data dependencies because they arise from the reuse of storage locations.
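The three data dependency cases can be checked mechanically by comparing the storage each instruction reads and writes. The following is a minimal sketch under assumptions that are not part of the text: instructions are modeled by a hypothetical `Instr` record with at most one destination and a set of sources, and the register names and mnemonics are invented for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Set


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction model: one optional destination, several sources."""
    text: str
    dest: Optional[str]
    srcs: FrozenSet[str]


def data_dependencies(first: Instr, second: Instr) -> Set[str]:
    """Return the data dependencies of `second` on the earlier instruction `first`."""
    deps = set()
    if first.dest is not None and first.dest in second.srcs:
        deps.add("RAW")  # true dependency: second reads first's result
    if first.dest is not None and first.dest == second.dest:
        deps.add("WAW")  # output dependency: both write the same storage
    if second.dest is not None and second.dest in first.srcs:
        deps.add("WAR")  # antidependency: second overwrites first's input
    return deps


# Invented three-instruction sequence, in program order.
i1 = Instr("add r1, r2, r3", "r1", frozenset({"r2", "r3"}))
i2 = Instr("sub r4, r1, r5", "r4", frozenset({"r1", "r5"}))
i3 = Instr("mul r1, r6, r7", "r1", frozenset({"r6", "r7"}))

raw = data_dependencies(i1, i2)  # i2 reads r1, which i1 writes
waw = data_dependencies(i1, i3)  # i1 and i3 both write r1
war = data_dependencies(i2, i3)  # i3 overwrites r1, which i2 reads
```

In this sequence only the dependency between `i1` and `i2` carries a value from one instruction to the other; the other two exist solely because `i3` reuses `r1` as its destination.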

Control Dependencies<br />

A control dependency occurs when an instruction depends on a conditional branch instruction. It is not known whether the instruction is to be executed or not until the branch is resolved. Thus, the branch must be executed prior to the instruction.

Structural Dependencies<br />

A structural dependency occurs when two instructions need the same resource. If the resource is not duplicated, the instructions must execute sequentially, one after the other, rather than in parallel. The resource for which the instructions contend might be an adder, a bus, a register file port, or some other component.
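A structural dependency check amounts to comparing resource demand against the number of copies of each resource. The sketch below assumes, purely for illustration, that each instruction needs exactly one named resource for one cycle; the function name `can_issue_together` and the resource names are hypothetical.

```python
from collections import Counter


def can_issue_together(needs, available):
    """Return True if all instructions can use their resources in the same cycle.

    needs:     list with one resource name per instruction
    available: dict mapping resource name -> number of copies in the processor
    """
    demand = Counter(needs)
    return all(count <= available.get(res, 0) for res, count in demand.items())


# Two additions contend for a single adder, so they must execute sequentially.
one_adder = can_issue_together(["adder", "adder"], {"adder": 1})

# Duplicating the adder removes the structural dependency.
two_adders = can_issue_together(["adder", "adder"], {"adder": 2})

# Instructions that need different resources do not contend.
mixed = can_issue_together(["adder", "multiplier"], {"adder": 1, "multiplier": 1})
```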

© 2002 by CRC Press LLC
