29.10.2014 Views

Branch Prediction 1 Introduction 2 Questions


CS510: Advanced Computer Architecture

Programming Assignment 1

Branch Prediction

Due: Oct. 31, 2007 (24:00)

1 Introduction

In this assignment, you will use SimpleScalar (sim-outorder) to evaluate several branch prediction algorithms.

• Simulator: sim-outorder

• Benchmarks: basicmath, lame, stringsearch, pgp, FFT (use small size input)

1.1 Parameter Description

You can select the branch predictor with the -bpred options, which are listed in the help message (sim-outorder -h). The following options are the ones you need for this assignment.

• -bpred string (default: bimod)

branch predictor type {nottaken|taken|perfect|bimod|2lev|comb}

• -bpred:bimod int (default: 2048)

bimodal predictor config (table size)

• -bpred:2lev int int int {0|1} (default: 1 1024 8 0)

2-level predictor config (level1 table size, level2 table size, history register width, xor)
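
As an illustration of how these options combine on a command line, the following Python sketch assembles the -bpred arguments for one configuration. The helper name bpred_args, the simulator path, and the benchmark file name are hypothetical; only the -bpred flags and their argument order are taken from the list above.

```python
def bpred_args(kind, bimod_size=None, twolev=None):
    """Return the -bpred arguments for one predictor configuration."""
    allowed = {"nottaken", "taken", "perfect", "bimod", "2lev", "comb"}
    if kind not in allowed:
        raise ValueError(f"unknown predictor type: {kind}")
    args = ["-bpred", kind]
    if bimod_size is not None:
        # bimodal predictor config: table size
        args += ["-bpred:bimod", str(bimod_size)]
    if twolev is not None:
        # 2-level config: level1 size, level2 size, history width, xor flag
        l1_size, l2_size, hist_width, xor = twolev
        args += ["-bpred:2lev", str(l1_size), str(l2_size),
                 str(hist_width), str(xor)]
    return args


if __name__ == "__main__":
    # Hypothetical simulator path and benchmark binary name.
    cmd = ["./sim-outorder"] + bpred_args("bimod", bimod_size=1024) + ["basicmath_small"]
    print(" ".join(cmd))
```

The list form can be passed to a process launcher unchanged, which avoids quoting problems when sweeping over many table sizes.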

1.2 Sample Predictors

These are configuration examples for the predictors in the lecture slide “Dynamic ILP(II)”.

• 2-bit scheme for BHT with 1K entries

-bpred bimod

-bpred:bimod 1024

• (2,2) predictor using lower 4 bits of branch address

-bpred 2lev

-bpred:2lev 1 64 2 0 (1 2^(4+2) 2 0)

• gshare predictor using lower 8 bits of branch address

-bpred 2lev

-bpred:2lev 1 256 8 1 (1 2^8 8 1)
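
The relationship between the level-2 table size, the history width, and the xor flag in the examples above can be sketched as follows. This is an illustrative model of how a 2-level predictor could form its level-2 index, not the simulator's source code; the function name l2_index and the pc_shift parameter (which assumes 4-byte instructions) are assumptions.

```python
def l2_index(pc, history, l2_size, hist_width, use_xor, pc_shift=2):
    """Form a level-2 table index from a branch address and a history register."""
    addr = pc >> pc_shift                 # drop alignment bits (assumed 4-byte instructions)
    hist_mask = (1 << hist_width) - 1
    history &= hist_mask
    if use_xor:
        # gshare style: history bits are XORed with the low address bits
        low = (history ^ addr) & hist_mask
    else:
        # (m,n) style: history bits are concatenated with address bits
        low = history
    index = low | (addr << hist_width)
    return index & (l2_size - 1)          # l2_size is assumed to be a power of two


if __name__ == "__main__":
    # (2,2) predictor: 2 history bits + 4 address bits -> 2^(4+2) = 64 entries
    print(l2_index(0x1C, 0b10, l2_size=64, hist_width=2, use_xor=False))
    # gshare: 8 history bits XORed with 8 address bits -> 2^8 = 256 entries
    print(l2_index(0x3FC, 0x0F, l2_size=256, hist_width=8, use_xor=True))
```

In the (2,2) case the index uses 4 address bits plus 2 history bits, which is why the level-2 table has 2^(4+2) = 64 entries; in the gshare case the XOR keeps the index at 8 bits, so 2^8 = 256 entries suffice.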

2 Questions

1. Program profile

• Profile the ratio of conditional/unconditional branches to total instructions (by dynamic count).


• Profile the breakdown of taken/not-taken branches by the following categories. You may need to customize sim-outorder.

             B - forward    B - backward    BL
taken
not-taken

2. Bimodal predictor

• Show how performance varies with the table size.

(Draw a graph with table size on the x-axis and IPC on the y-axis.)

• Compare the results.
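
To make the role of the table size concrete, here is a minimal Python sketch of a bimodal predictor built from 2-bit saturating counters. It is a toy model for reasoning about aliasing, not a replacement for the sim-outorder measurements; the class and function names, the initial counter value, and the 4-byte instruction alignment are assumptions.

```python
class BimodalPredictor:
    """Table of 2-bit saturating counters indexed by low branch-address bits."""

    def __init__(self, table_size):
        if table_size <= 0 or table_size & (table_size - 1):
            raise ValueError("table_size must be a power of two")
        self.mask = table_size - 1
        # 0,1 = predict not-taken; 2,3 = predict taken.
        # Starting at 1 (weakly not-taken) is an assumed initial state.
        self.counters = [1] * table_size

    def _index(self, pc):
        return (pc >> 2) & self.mask      # assumes 4-byte instructions

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def accuracy(predictor, trace):
    """Fraction of correct predictions over a list of (pc, taken) pairs."""
    correct = 0
    for pc, taken in trace:
        correct += predictor.predict(pc) == taken
        predictor.update(pc, taken)
    return correct / len(trace)
```

With a smaller table, more branches map to the same counter and can disturb each other's state, which is one effect to look for when comparing table sizes.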

3. Branch Target Buffer

If the branch predictor predicts that a branch will be taken, it looks up the target address of the branch in its branch target buffer (BTB). The size and associativity of the BTB also affect IPC. You can specify the BTB with [-bpred:btb set-size associativity (default: 512 4)].

• Show how performance varies with the BTB size. Use a direct-mapped BTB and a 512-entry bimodal predictor.

• Complete the table below for each benchmark. Use a 512-entry bimodal predictor.

Sets    Assoc.    IPC    Addr-rate    Dir-rate
512     1
256     2
128     4
32      16
8       64
1       512

• Explain the result.
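
Every row of the table above holds 512 entries in total; only the split between sets and ways changes. The following Python sketch of a set-associative BTB shows why that split can matter. It is a simplified model with an assumed LRU replacement policy, full-address tags, and 4-byte instructions; the class name is hypothetical and the behavior is not claimed to match sim-outorder.

```python
from collections import OrderedDict


class BranchTargetBuffer:
    """Set-associative BTB sketch with LRU replacement (assumed policy)."""

    def __init__(self, num_sets, assoc):
        self.num_sets = num_sets
        self.assoc = assoc
        # One ordered map per set: least recently used entry comes first.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def _set_for(self, pc):
        return self.sets[(pc >> 2) % self.num_sets]   # assumes 4-byte instructions

    def lookup(self, pc):
        """Return the stored target for pc, or None on a miss."""
        entries = self._set_for(pc)
        if pc in entries:
            entries.move_to_end(pc)                   # mark as most recently used
            return entries[pc]
        return None

    def insert(self, pc, target):
        entries = self._set_for(pc)
        if pc in entries:
            entries.move_to_end(pc)
        elif len(entries) >= self.assoc:
            entries.popitem(last=False)               # evict least recently used
        entries[pc] = target
```

For example, two branches at addresses 0x0 and 0x800 map to the same set in a 512-set direct-mapped BTB and evict each other, while a 256-set, 2-way BTB can hold both.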

4. 2-level predictor: gshare

• Evaluate the IPC of the system with a gshare (5-bit) predictor.

• Find the GAg configuration (e.g., -bpred:2lev 1 2^k k 0) that yields a similar IPC.

• Estimate and compare the resource cost of the above two predictors.
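
One possible starting point for the cost estimate is to count storage bits only. The sketch below assumes that the cost is the history registers plus 2-bit counters in the level-2 table, and that the 5-bit gshare corresponds to -bpred:2lev 1 32 5 1; both are assumptions, and the function name is hypothetical. Index logic such as the XOR gates and the BTB are ignored.

```python
def two_level_storage_bits(l1_size, l2_size, hist_width, counter_bits=2):
    """Storage estimate: history registers plus level-2 counter table."""
    return l1_size * hist_width + l2_size * counter_bits


if __name__ == "__main__":
    # Assumed mapping: gshare (5-bit) -> -bpred:2lev 1 32 5 1
    print("gshare(5):", two_level_storage_bits(1, 2 ** 5, 5), "bits")
    # GAg with k history bits -> -bpred:2lev 1 2^k k 0
    for k in range(5, 11):
        print(f"GAg(k={k}):", two_level_storage_bits(1, 2 ** k, k), "bits")
```

Under this model the GAg cost roughly doubles with each extra history bit, so the value of k needed to match the gshare IPC largely determines the comparison.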

3 Hand in

When you finish this assignment, submit your report and source files to seonggun.kim[at]arcs.kaist.ac.kr. The quality of your discussion of each task is the most important factor in grading. Every result requires a proper explanation.

Appendix A.

1. Installing SimpleScalar-ARM

• Download the SimpleScalar-ARM source tarball (simplesim-arm-0.2.tar.gz) from the following web page: http://www.simplescalar.com/v4test.html

• Install SimpleScalar-ARM on your workstation or server. For further information, please refer to the installation guide available at http://www.simplescalar.com/docs.html


2. Installing ARM cross compiler

• You will have to create your own ARM binary files during this assignment. To do this, you have to download and install an ARM cross compiler on your server.

• For your convenience, visit the following web page and download a handy script that performs the tedious process automatically: http://www.kegel.com/crosstool/

3. Using the SimpleScalar simulator

• Download the source code of MiBench from http://www.eecs.umich.edu/mibench/. It is an embedded benchmark suite made up of several applications from six different program groups. In this assignment, use the following benchmarks with small size input:

⊲ basicmath (Automotive and Industrial group)

⊲ lame (Consumer group)

⊲ pgp (Security group)

⊲ stringsearch (Office group)

⊲ FFT (Telecomm group)

• Build ARM executable binaries of MiBench with optimization level 2 (-O2). You must also pass the -static (static linking) option so that the applications can run on SimpleScalar.

Notice: if you have trouble installing an ARM cross compiler, you may use the ARM binaries provided by the distributors of MiBench or by the TA.
