CS510: Advanced Computer Architecture<br />
Programming Assignment 1<br />
<strong>Branch</strong> <strong>Prediction</strong><br />
Due: Oct. 31, 2007 (24:00)<br />
1 <strong>Introduction</strong><br />
In this assignment, you will use SimpleScalar (sim-outorder) to evaluate several branch prediction algorithms.<br />
• Simulator: sim-outorder<br />
• Benchmarks: basicmath, lame, stringsearch, pgp, FFT (use small size input)<br />
1.1 Parameter Description<br />
You can select the branch predictor with the -bpred options, which are listed in the help message<br />
(sim-outorder -h). The following options are needed for this assignment.<br />
• -bpred string (default: bimod)<br />
branch predictor type {nottaken|taken|perfect|bimod|2lev|comb}<br />
• -bpred:bimod int (default: 2048)<br />
bimodal predictor config (table size)<br />
• -bpred:2lev int int int {0|1} (default: 1 1024 8 0)<br />
2-level predictor config (level1 table size, level2 table size, history register width, xor)<br />
1.2 Sample Predictors<br />
These are example configurations for the predictors in the lecture slides “Dynamic ILP(II)”:<br />
• 2-bit scheme for BHT with 1K entries<br />
-bpred bimod<br />
-bpred:bimod 1024<br />
• (2,2) predictor using lower 4 bits of branch address<br />
-bpred 2lev<br />
-bpred:2lev 1 64 2 0 (i.e., 1 2^(4+2) 2 0)<br />
• gshare predictor using lower 8 bits of branch address<br />
-bpred 2lev<br />
-bpred:2lev 1 256 8 1 (i.e., 1 2^8 8 1)<br />
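The level-2 table sizes in the two examples above follow from how many bits index the table. The sketch below reproduces that arithmetic; `two_level_args` and `flags` are hypothetical helpers written for illustration (they are not part of SimpleScalar), and the sketch assumes a single global history register that is either concatenated with the address bits (xor = 0) or XORed with them (xor = 1).

```python
def two_level_args(addr_bits, hist_bits, xor):
    """Return the four values passed to -bpred:2lev.

    Hypothetical helper, not part of SimpleScalar. Assumes one global
    history register (level-1 table size of 1).
    """
    l1_size = 1
    if xor:
        # gshare style: history is XORed with the address bits, so the
        # index is only as wide as the wider of the two fields.
        index_bits = max(addr_bits, hist_bits)
    else:
        # (m,n) style: history bits are concatenated with address bits.
        index_bits = addr_bits + hist_bits
    l2_size = 2 ** index_bits
    return [l1_size, l2_size, hist_bits, 1 if xor else 0]


def flags(args):
    """Format the values as sim-outorder command-line options."""
    return "-bpred 2lev -bpred:2lev " + " ".join(str(a) for a in args)


corr = two_level_args(addr_bits=4, hist_bits=2, xor=False)
gshare = two_level_args(addr_bits=8, hist_bits=8, xor=True)
print(flags(corr))    # -bpred 2lev -bpred:2lev 1 64 2 0
print(flags(gshare))  # -bpred 2lev -bpred:2lev 1 256 8 1
```

Both printed lines match the sample configurations listed above, which is a quick way to check a new configuration before running a long simulation.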
2 <strong>Questions</strong><br />
1. Program profile<br />
• Profile the ratio of conditional and unconditional branches to total instructions (by dynamic count).<br />
• Profile the breakdown of taken/not-taken branches into the following categories. You may need to<br />
customize sim-outorder.<br />
Outcome | B - forward | B - backward | BL<br />
taken | | |<br />
not-taken | | |<br />
2. Bimodal predictor<br />
• Show how performance varies with the table size.<br />
(Draw a graph with table size on the x-axis and IPC on the y-axis.)<br />
• Compare the results.<br />
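As a reminder of why table size matters for a bimodal predictor, here is a minimal sketch of a table of 2-bit saturating counters. It is a toy model, not SimpleScalar's implementation: the index function (`pc >> 2` masked to the table size), the initial counter value, and the synthetic trace are all assumptions chosen for illustration.

```python
class BimodalPredictor:
    """Toy bimodal predictor: a table of 2-bit saturating counters."""

    def __init__(self, table_size):
        # Table size must be a power of two so a mask can index it.
        assert table_size > 0 and table_size & (table_size - 1) == 0
        self.mask = table_size - 1
        self.counters = [2] * table_size  # start weakly taken (assumption)

    def _index(self, pc):
        # Drop the byte offset of 4-byte instructions (assumption).
        return (pc >> 2) & self.mask

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def accuracy(table_size, trace):
    """Fraction of branches in the trace whose direction is predicted."""
    bp = BimodalPredictor(table_size)
    hits = 0
    for pc, taken in trace:
        hits += bp.predict(pc) == taken
        bp.update(pc, taken)
    return hits / len(trace)


# Synthetic trace: one always-taken and one never-taken branch.
trace = [(0x1000, True), (0x1004, False)] * 100
small = accuracy(1, trace)     # both branches share one counter
large = accuracy(1024, trace)  # each branch gets its own counter
print(small, large)            # 0.5 0.995
```

With a single entry the two branches keep overwriting each other's state (aliasing), while the larger table separates them. Real benchmarks show the same effect less sharply, which is what the IPC-versus-table-size graph should reveal.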
3. <strong>Branch</strong> Target Buffer<br />
If a branch predictor predicts that a branch instruction will be taken, it looks up the target address of the<br />
branch in its branch target buffer (BTB). The size and associativity of the BTB also affect IPC. You can<br />
configure the BTB with [-bpred:btb set-size associativity (default: 512 4)].<br />
• Show how performance varies with the BTB size. Use a direct-mapped BTB and a 512-entry<br />
bimodal predictor.<br />
• Complete the table below for each benchmark. Use a 512-entry bimodal predictor.<br />
Sets Assoc. IPC Addr-rate Dir-rate<br />
512 1<br />
256 2<br />
128 4<br />
32 16<br />
8 64<br />
1 512<br />
• Explain the results.<br />
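Every row of the table above keeps the total number of BTB entries at 512 and varies only the associativity. The following sketch derives the set counts and prints one command line per row, using only the options named in this handout. The binary path `./basicmath_small` is a hypothetical placeholder, and `btb_configs` and `command` are illustrative helpers, not part of SimpleScalar.

```python
TOTAL_ENTRIES = 512
ASSOCIATIVITIES = [1, 2, 4, 16, 64, 512]


def btb_configs(total=TOTAL_ENTRIES, assocs=ASSOCIATIVITIES):
    """Return (sets, associativity) pairs with sets * assoc == total."""
    configs = []
    for assoc in assocs:
        sets, remainder = divmod(total, assoc)
        assert remainder == 0, "associativity must divide the total size"
        configs.append((sets, assoc))
    return configs


def command(sets, assoc, binary="./basicmath_small"):
    """Build a sim-outorder command line; the binary path is hypothetical."""
    argv = ["sim-outorder",
            "-bpred", "bimod",
            "-bpred:bimod", "512",
            "-bpred:btb", str(sets), str(assoc),
            binary]
    return " ".join(argv)


CONFIGS = btb_configs()
for sets, assoc in CONFIGS:
    print(command(sets, assoc))
```

The first pair is the direct-mapped case (512 sets, 1 way) and the last is fully associative (1 set, 512 ways), so differences in IPC across rows come from conflict behavior rather than capacity.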
4. 2-level predictor: gshare<br />
• Evaluate the IPC of the system with a gshare (5-bit) predictor.<br />
• Find the GAg configuration (e.g., -bpred:2lev 1 2^k k 0) that yields a similar IPC.<br />
• Estimate and compare the resource costs of the two predictors above.<br />
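One simple way to approach the cost estimate is to count storage bits. The sketch below is a first-order model under stated assumptions: each level-2 entry is a 2-bit counter, each level-1 entry is one history register, and everything else (XOR logic, the BTB, wiring) is ignored. The gshare (5-bit) configuration is assumed to be `1 32 5 1`, by analogy with the 8-bit sample configuration; `two_level_bits` and `gag_bits` are hypothetical helper names.

```python
def two_level_bits(l1_size, l2_size, hist_width, counter_bits=2):
    """First-order storage estimate (in bits) for a 2-level predictor.

    Assumes counter_bits per level-2 entry and hist_width bits per
    level-1 entry; ignores indexing logic and the BTB.
    """
    history_storage = l1_size * hist_width
    counter_storage = l2_size * counter_bits
    return history_storage + counter_storage


def gag_bits(k):
    """Storage for a GAg predictor configured as -bpred:2lev 1 2^k k 0."""
    return two_level_bits(1, 2 ** k, k)


# gshare with 5 bits, assumed to be configured as -bpred:2lev 1 32 5 1.
gshare_5 = two_level_bits(1, 2 ** 5, 5)
print("gshare(5-bit):", gshare_5, "bits")

# Storage grows roughly exponentially with the GAg history length k.
for k in range(5, 13):
    print("GAg k=%d: %d bits" % (k, gag_bits(k)))
```

Under this model, a GAg predictor that needs a longer history than gshare to reach a similar IPC pays for it with a level-2 table that doubles for each extra history bit.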
3 Hand in<br />
When you finish this assignment, submit the report and source files to seonggun.kim[at]arcs.kaist.ac.kr.<br />
The quality of your discussion of each task is the most important factor in grading. Every result must be<br />
accompanied by a proper explanation.<br />
Appendix A.<br />
1. Installing SimpleScalar-ARM<br />
• Download the SimpleScalar-ARM source tarball (simplesim-arm-0.2.tar.gz) from the following web<br />
page: http://www.simplescalar.com/v4test.html<br />
• Install SimpleScalar-ARM on your workstation or server. For further information, please refer<br />
to the installation guide available at http://www.simplescalar.com/docs.html<br />
2. Installing ARM cross compiler<br />
• You will have to create your own ARM binaries for this assignment. To do so,<br />
you must download and install an ARM cross compiler on your server.<br />
• For your convenience, the following web page provides a handy script that automates this<br />
tedious process: http://www.kegel.com/crosstool/<br />
3. Using the SimpleScalar simulator<br />
• Download the source code of MiBench from http://www.eecs.umich.edu/mibench/. It is an<br />
embedded benchmark suite made up of several applications from six different program groups.<br />
In this assignment, use the following benchmarks with the small input size:<br />
⊲ basicmath (Automotive and Industrial group)<br />
⊲ lame (Consumer group)<br />
⊲ pgp (Security group)<br />
⊲ stringsearch (Office group)<br />
⊲ FFT (Telecomm group)<br />
• Build ARM executable binaries of MiBench with optimization level 2 (-O2). You must pass the<br />
-static (static linking) option so that the applications can run on SimpleScalar.<br />
Notice: If you have trouble installing an ARM cross compiler, you may use<br />
the ARM binaries provided by the distributors of MiBench or by the TA.