15.01.2013 Views

U. Glaeser


FIGURE 6.23 Branch prediction accuracies for 8-kbit predictors for the SPECint95 benchmarks. “Bim” is the bimodal predictor, and “Hyb” is the hybrid predictor. Specific configurations appear in Table 6.3.

FIGURE 6.24 Branch prediction accuracies for 64-kbit predictors for the SPECint95 benchmarks. Specific configurations appear in Table 6.4.

for five of the eight benchmarks, but performs terribly for m88ksim. (But keep in mind that BTFNT would look better for floating-point codes, which are heavily loop-oriented.) Always-taken is the best static scheme for m88ksim and ijpeg, but is the worst for three other benchmarks. This variability of the best scheme among different benchmarks makes it difficult to choose one static scheme to implement.
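
The three static schemes compared above can be sketched in a few lines. This is a minimal illustration, not the simulator behind the figures: the `Branch` record, the `accuracy` helper, and the tiny hand-made trace are all assumptions introduced here. BTFNT is taken in its usual sense (backward taken, forward not taken), with "backward" meaning the target address is lower than the branch address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    """One dynamic branch instance (hypothetical record type)."""
    pc: int       # address of the branch instruction
    target: int   # address the branch jumps to when taken
    taken: bool   # actual outcome


def predict_always_taken(branch):
    return True


def predict_never_taken(branch):
    return False


def predict_btfnt(branch):
    # Backward branches (typically loop-closing) are predicted taken,
    # forward branches are predicted not taken.
    return branch.target < branch.pc


def accuracy(predict, trace):
    """Fraction of branches in the trace whose outcome was predicted."""
    correct = sum(predict(b) == b.taken for b in trace)
    return correct / len(trace)


# Made-up trace: a loop-closing branch taken 9 times out of 10,
# followed by a forward branch that happens to be taken every time.
loop_part = [Branch(0x40, 0x10, True)] * 9 + [Branch(0x40, 0x10, False)]
forward_part = [Branch(0x20, 0x60, True)] * 10
trace = loop_part + forward_part

acc_taken = accuracy(predict_always_taken, trace)   # 19 of 20 correct
acc_not_taken = accuracy(predict_never_taken, trace)  # 1 of 20 correct
acc_btfnt = accuracy(predict_btfnt, trace)          # 9 of 20 correct
```

On this artificial trace BTFNT handles the loop well but loses every instance of the frequently taken forward branch, which is one way a static scheme can do well on one program and poorly on another.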

Among dynamic schemes, bimodal is worse than the more sophisticated dynamic schemes for all except go and vortex. Unlike the other schemes, however, bimodal is less sensitive to predictor size, with a mean difference between the 8-kbit and 64-kbit bimodal predictors of only 0.55%. The reason for this is that the bimodal predictor allocates only one entry to each branch site (i.e., a static branch location in the program), no matter how often that branch executes or how varied its behavior. Most of the programs have a fairly small number of branch sites, and of course the property of locality means that only a subset of these are active at any one time. A 4K-entry (8-kbit) table is, therefore, sufficient to capture most of the static branch locations, and making the table larger has little effect.
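
To make the "one entry per branch site" point concrete, here is a minimal sketch of a bimodal predictor. It assumes 2-bit saturating counters (consistent with 4K entries occupying 8 kbit), indexing by the low-order bits of the word-aligned branch address, and counters that start in the weakly not-taken state. The class name, the indexing function, and the initial state are illustrative assumptions, not details taken from the text.

```python
class BimodalPredictor:
    """Table of 2-bit saturating counters, one per branch-address index."""

    def __init__(self, num_entries=4096):
        # Power-of-two size so the index can be formed with a bit mask.
        assert num_entries > 0 and num_entries & (num_entries - 1) == 0
        self.mask = num_entries - 1
        # Counter values: 0, 1 predict not taken; 2, 3 predict taken.
        self.counters = [1] * num_entries

    def _index(self, pc):
        # Assumption: 4-byte instructions, so the two low bits are dropped.
        return (pc >> 2) & self.mask

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)

    def size_bits(self):
        return 2 * len(self.counters)


small = BimodalPredictor(4096)    # 4K entries * 2 bits = 8 kbit
large = BimodalPredictor(32768)   # 32K entries * 2 bits = 64 kbit

# Every execution of the branch at one address maps to the same counter,
# regardless of how often it runs or how its outcomes vary.
first_guess = small.predict(0x1000)
small.update(0x1000, True)
small.update(0x1000, True)
after_two_taken = small.predict(0x1000)
small.update(0x1000, False)
after_one_miss = small.predict(0x1000)
```

Because each branch site uses exactly one counter, growing the table from 4K to 32K entries only helps when distinct active branch sites were colliding in the smaller table.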

Among two-level predictors, GAs is better than PAs about as often as PAs is better than GAs. As with the static schemes, this variability of the best scheme among benchmarks makes it difficult to choose which scheme to implement. This is strong motivation for use of the hybrid predictor, especially given the observation [34,37] that many programs have some branches that are much better predicted using global history, while other branches are much better predicted using local history. GAs, PAs, and hybrid, however, are all more sensitive to predictor size than bimodal is. Go and gcc are particularly sensitive

© 2002 by CRC Press LLC

[Figures 6.23 and 6.24: bar charts. Vertical axis: Prediction Accuracy (0.2 to 1). Horizontal axis: benchmarks go, m88, gcc, compress, xlisp, ijpeg, perl, vortex. Series: T, NT, BTFNT, Bim, GAs, PAs, Hyb.]
