15.01.2013 Views

U. Glaeser


FIGURE 6.17 A schematic for a bimodal predictor. “Baddr” is the branch address or PC, which is used to index the PHT (pattern history table), select the corresponding two-bit counter, and make a prediction of taken or not-taken.

the fetch block after the branch are still valid and can still be passed on to decode. The second solution is for the branch predictor to just predict fetch-block successors instead of specific branches. In this case, the predictor simply predicts whether the next fetch block will be sequential (not-taken) or non-sequential (taken, in which case the target supplied by the BTAC is used). This is slightly better than the first choice, because it eliminates the need for pre-decode bits and can fetch past more than one not-taken branch in a fetch block. It does require the decode stage to identify how each branch in a fetch block was implicitly predicted. The third solution is for the BTAC and branch predictor to be indexed with the address of every instruction in the fetch block. Hits in the BTAC indicate which instructions are branches, and only the corresponding direction predictions are then used. The problem with this approach is that it requires as many ports into the BTAC and branch-prediction structures as there are instructions in the fetch block. These are the basic choices, although many variations and improvements have been proposed, e.g., [24,28–30].

Bimodal Prediction

The simplest dynamic technique, introduced by Smith [17], is to maintain a small, on-chip memory with a table of saturating counters that is indexed by branch address. The saturating counters, typically two bits each, simply remember the predominant direction of previous outcomes for that branch. A schematic for a bimodal predictor appears in Fig. 6.17. As mentioned, the table (usually called the pattern history table or PHT), although logically a distinct entity, might actually be implemented as a unified structure with the BTAC. This prediction scheme goes by different names, often simply “two-bit prediction,” but recent literature has often referred to it as “bimodal” prediction to distinguish it from other more sophisticated schemes that also use two-bit saturating counters.

Each time a branch resolves, its corresponding counter is incremented if the branch was taken, and decremented if not. Incrementing or decrementing has no effect if the counter is already at its maximum or minimum value, hence the term “saturating” counter and the name “bimodal.” In the simplest case of a one-bit counter, the only possibilities are values of 0 and 1 and the predictor simply remembers the last outcome for each branch. In the case of two-bit counters, values of 00 and 01 correspond to strongly not-taken and weakly not-taken, and values of 10 and 11 correspond to weakly taken and strongly taken. Two-bit counters give better performance because they exhibit some hysteresis that makes them less sensitive to infrequent occurrences of outcomes in the non-dominant direction. A state-transition diagram for the most common two-bit counter configuration appears in Fig. 6.18. Other configurations [18,31] are possible, however; for example, regardless of its current state, the counter might reset to 00 on a not-taken branch.
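The predict-and-update cycle described above can be sketched in a few lines of Python. This is only an illustrative model, not a description of any particular hardware: the class name BimodalPredictor, the index_bits parameter, the choice to drop the two low address bits (which assumes 4-byte aligned instructions), and the initial counter value of 01 are all assumptions made for the example.

```python
class BimodalPredictor:
    """Table (PHT) of two-bit saturating counters indexed by branch address."""

    def __init__(self, index_bits=10):
        self.mask = (1 << index_bits) - 1
        # Assumption: every counter starts at 01 (weakly not-taken).
        self.pht = [1] * (1 << index_bits)

    def _index(self, baddr):
        # Assumption: instructions are 4-byte aligned, so the two low bits
        # carry no information; keep the next index_bits bits.
        return (baddr >> 2) & self.mask

    def predict(self, baddr):
        # 10 and 11 predict taken; 00 and 01 predict not-taken.
        return self.pht[self._index(baddr)] >= 2

    def update(self, baddr, taken):
        # Increment on taken, decrement on not-taken, saturating at 3 and 0.
        i = self._index(baddr)
        if taken:
            self.pht[i] = min(self.pht[i] + 1, 3)
        else:
            self.pht[i] = max(self.pht[i] - 1, 0)
```

Because the table is indexed with only a few address bits, two different branches can map to the same counter; the sketch makes no attempt to detect or avoid that aliasing.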

As an example of how two-bit counters improve over one-bit counters, recall that a loop branch will normally be taken. When the loop exits, a one-bit counter will only remember that most recent direction (not taken), even though the predominant direction is “taken.” When this same loop is encountered again, the loop branch will once again be taken until the loop exits, yet the first prediction with a one-bit counter will be “not taken.” A two-bit counter, on the other hand, only changes its state from 11 to 10 upon loop exit, and still predicts taken when it returns to the loop, thus eliminating a misprediction compared to the one-bit counter.
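The loop example can be checked with a small simulation. The sketch below is a hypothetical helper, not taken from the text: the function name mispredictions, the choice of nine iterations per loop execution, the three repetitions, and the assumption that the counter starts in its strongest taken state are all made up for illustration.

```python
def mispredictions(outcomes, bits):
    """Count mispredictions of a single saturating counter of the given width."""
    top = (1 << bits) - 1        # saturation ceiling: 1 for one-bit, 3 for two-bit
    threshold = 1 << (bits - 1)  # counter values at or above this predict taken
    counter = top                # assumption: start in the strongest taken state
    misses = 0
    for taken in outcomes:
        if (counter >= threshold) != taken:
            misses += 1
        # Saturating update: increment on taken, decrement on not-taken.
        counter = min(counter + 1, top) if taken else max(counter - 1, 0)
    return misses


# A loop branch that is taken 9 times and then falls through at loop exit,
# with the whole loop executed 3 times.
loop = ([True] * 9 + [False]) * 3
one_bit = mispredictions(loop, bits=1)  # 5: every exit, plus every re-entry
two_bit = mispredictions(loop, bits=2)  # 3: only the exits
```

Under these assumptions the one-bit counter mispredicts both the exit and the first iteration of each later execution, while the two-bit counter mispredicts only the exits.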

© 2002 by CRC Press LLC

[Figure 6.17 diagram labels: baddr (input), PHT, T/NT (output)]
