15.01.2013 Views

U. Glaeser

U. Glaeser

U. Glaeser

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

In addition, if rename buffers are implemented separately from the architectural registers, the rename<br />

buffers need to be able to forward in each cycle as many result values to the architectural registers as the<br />

completion rate (retire rate) of the processor. As recent processors usually complete up to four instructions<br />

per cycle, this task increases the required number of read ports in the rename buffers typically by four.<br />

We note here that too many read ports in a register file may unduly increase the physical size of the<br />

datapath and as a result the cycle time. To avoid this problem, a few high performance processors (such<br />

as the Power2, Power3, and the Alpha 21264) implement two copies of particular register files. The Power2<br />

duplicates the FX-architectural register file, the Power3 doubles both the FX-rename and the FXarchitectural<br />

files, and the Alpha 21264 has two copies of the FX-merged architectural and rename<br />

register file. As a result fewer read ports are needed in each of the copies. For example, with two copies<br />

of the FX-merged register file, the Power3 needs only ten read ports in each file, instead of 16 read ports<br />

in one FX-register file. A drawback of this approach is, however, that a scheme is also required to keep<br />

both copies coherent.<br />

Now let us turn to the required number of write ports (input ports). Since rename buffers need to accept<br />

in each cycle all results produced by the execution units, they need to provide as many write ports as<br />

many results the execution units may produce per cycle. The FX-rename buffers receive results from the<br />

available integer-execution units and from the load/store units (fetched FX-data), whereas the FP-rename<br />

buffers hold the results of the FP-execution units and of the load/store units (fetched FP-data). Most<br />

results are single data items requiring one write port. However, there are a few exceptions. When execution<br />

units generate two data items they require two write ports as well; like the load/store units of PowerPC<br />

processors. After execution of the LOAD-WITH-UPDATE instruction, these units return both the fetched<br />

data value and the updated address value.<br />

Layout of the Register Mapping<br />

Overview<br />

Register mapping includes three tasks, as depicted in Fig. 6.11: (1) The processor needs to allocate rename<br />

buffers to the destination registers of the issued instructions; (2) it also must keep track of the actually<br />

valid mappings; and (3) it needs to deallocate no longer used rename buffers.<br />

Allocation Scheme of Rename Buffers<br />

As far as the allocation scheme of rename buffers is considered, processors usually allocate rename buffers<br />

to every issued instruction rather than only to those including a destination register in order to simplify<br />

logic. Although rename buffers are not needed until the results become available in the last execution<br />

cycle, rename buffers are typically allocated to the instructions as early as during instruction issue. This<br />

kind of register allocation leads to wasted rename register space. Delaying the allocation of rename buffers<br />

to the instructions until instructions finish 39 saves rename register space. Various schemes have been<br />

proposed for this, such as virtual renaming 39–42 and others. 43 In fact, a virtual allocation scheme has<br />

already been introduced into the Power3. 39<br />

Method of Keeping Track of Actual Mapping<br />

Two possibilities are available for keeping track of the actual mapping of particular architectural registers<br />

to allocated rename buffers. The processor can use a mapping table for this or can track the actual register<br />

FIGURE 6.11 Layout of the register mapping.<br />

© 2002 by CRC Press LLC<br />

Allocation scheme of<br />

the rename buffers<br />

Layout of the register mapping<br />

Method of keeping track<br />

of actual mappings<br />

Deallocation scheme<br />

of rename buffers

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!