17.11.2012 Views

Soft-Core Processor Design - CiteSeer

underflow and overflow exceptions always occur in pairs in normal program operation, because the procedure whose call caused the window underflow eventually returns and causes the window overflow exception. In the rest of this section, we refer to a register window underflow/overflow pair as a register window exception.
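The pairing of underflow and overflow exceptions can be illustrated with a small model. The sketch below is not the CWP Manager itself. It assumes a simplified policy in which a call that finds no free window spills the oldest resident window (one underflow), and a return into a spilled window reloads it (one overflow). The function name and the trace format are hypothetical and chosen for illustration only.

```python
def count_window_exceptions(trace, windows):
    """Count (underflows, overflows) for a call/return trace.

    trace   -- iterable of "call" / "ret" events, starting in the main program
    windows -- number of register windows available to the program (>= 1)

    Simplified model (an assumption, not the actual CWP Manager policy):
    exactly one window is spilled or reloaded per exception.
    """
    if windows < 1:
        raise ValueError("at least one register window is required")

    depth = 0    # current call depth; the main program runs at depth 0
    oldest = 0   # shallowest depth whose window is still in the register file
    underflows = 0
    overflows = 0

    for event in trace:
        if event == "call":
            depth += 1
            # Resident windows cover depths oldest..depth inclusive.
            if depth - oldest + 1 > windows:
                underflows += 1   # no free window: spill the oldest one
                oldest += 1
        elif event == "ret":
            if depth == 0:
                raise ValueError("return without a matching call")
            depth -= 1
            if depth < oldest:
                overflows += 1    # caller's window was spilled: reload it
                oldest = depth
        else:
            raise ValueError(f"unknown event: {event!r}")

    return underflows, overflows


if __name__ == "__main__":
    single_call = ["call", "ret"]
    nested_calls = ["call"] * 5 + ["ret"] * 5
    for n in (1, 2, 3):
        print(n, count_window_exceptions(single_call, n),
              count_window_exceptions(nested_calls, n))
```

In this model, any trace that returns to the main program yields equal underflow and overflow counts, which mirrors the pairing described above.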

The register file size changes are simulated by writing appropriate values to the WVALID register. Since the program only uses registers between LO_LIMIT and HI_LIMIT, and the CWP Manager saves and reloads only these registers, the run times will be equivalent to those of a system with the corresponding register file size. The physical register file size is 512 registers for all experiments, which allows the use of 31 register windows. The number of register windows available to the program varies from 1 to 29, because one register window is used by the GERMS monitor and one is reserved for the interrupt handler. We also measured the performance of the benchmarks when register windows are not used and only one register window is visible to the program. This is achieved by compiling a program with the mflat compiler option, which instructs the compiler not to use the SAVE and RESTORE instructions, but instead to save the registers altered by a procedure on the stack on entry to the procedure and to restore them from the stack on return [61]. This differs from a system with the CWP Manager and only one register window available: the CWP Manager always saves the whole register window on the stack, while code generated with the mflat option saves only the registers that the procedure actually alters. The CRC32 and Qsort benchmarks could not be compiled with the mflat option because the compiler terminated with an “Internal Compiler Error” message.
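The window counts quoted above can be checked with a little arithmetic. The sketch below assumes a SPARC-style overlapping window layout, which the text does not spell out: 8 global registers, 16 new registers (local and out) per window, and 8 extra registers for the in registers at the end of the chain. The constant and function names are hypothetical.

```python
# Assumed SPARC-style layout; these constants are not stated in the text.
GLOBAL_REGS = 8       # registers shared by every window
NEW_PER_WINDOW = 16   # local + out registers contributed by each window
EDGE_REGS = 8         # in registers of the window at the end of the chain


def physical_windows(regfile_size):
    """Number of overlapping windows that fit in the register file."""
    usable = regfile_size - GLOBAL_REGS - EDGE_REGS
    if usable < NEW_PER_WINDOW:
        raise ValueError("register file too small for a single window")
    return usable // NEW_PER_WINDOW


def program_windows(regfile_size, reserved=2):
    """Windows left for the program after reserving some for system use.

    reserved=2 follows the text: one window for the GERMS monitor and
    one for the interrupt handler.
    """
    return physical_windows(regfile_size) - reserved


if __name__ == "__main__":
    print(physical_windows(512), program_windows(512))
```

Under these assumptions, a 512-register file gives 31 windows, of which 29 remain for the program, which agrees with the figures in the text.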

Figure 5.4 shows how the performance of the programs running on the ONCHIP system depends on the number of register windows available. Only the toy benchmarks and two algorithms from the Bitcount benchmark with large datasets are shown. The other benchmarks are not presented, because they show trends similar to those in the figure. All values are program run times relative to the run time of the same program when 29 register windows are available. Run times for more than 12 register windows are not shown, because they remain close to 1. The point on the horizontal axis with zero available register windows corresponds to the programs compiled with the mflat compiler option.
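The normalization used in Figure 5.4 can be sketched as follows. The function name is hypothetical, and the cycle counts in the example are made-up placeholders, not measurements from the experiments. Following the figure's convention, key 0 stands for the build compiled with the mflat option.

```python
def relative_run_times(run_times, baseline_windows=29):
    """Normalize run times to the run time with `baseline_windows` windows.

    run_times -- dict mapping number of available register windows to a
                 run time (any consistent unit, e.g. clock cycles)
    """
    baseline = run_times[baseline_windows]
    if baseline <= 0:
        raise ValueError("baseline run time must be positive")
    return {w: t / baseline for w, t in sorted(run_times.items())}


if __name__ == "__main__":
    # Placeholder values for illustration only; not data from the thesis.
    made_up_cycles = {0: 1200, 1: 1500, 2: 1000, 29: 1000}
    for windows, ratio in relative_run_times(made_up_cycles).items():
        print(windows, ratio)
```

A value of 1 means the program runs as fast as it does with all 29 windows, and larger values indicate a slowdown.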

Three distinct trends are visible in Figure 5.4. The performance of the Multiply benchmark does not depend on the register file size, because it does not contain calls to procedures that use the SAVE and RESTORE instructions. The Bitcount algorithm Non-recursive by bytes has only one procedure call, from the main program to a bit counting procedure. Therefore, it experiences a slowdown when only one register window is available, because the
