01.12.2012 Views

Architecture of Computing Systems (Lecture Notes in Computer ...


G. Aşılıoğlu, E.M. Kaya, and O. Ergin

holds the previous mapping of the corresponding architectural register in its reorder buffer entry.

4. Checkpointing: The circuitry of the rename table is modified to include some shadow copies of the rename table. Whenever a branch instruction passes the rename stage, a copy of the rename table is created. If the branch is mispredicted, the corresponding checkpoint is copied to the speculative rename table. This scheme enables the recovery of the rename table state in a single cycle. However, circuit-level complexity limits the number of checkpoints that can be implemented at a given clock frequency. The limited number of checkpoints in turn limits the number of branch instructions that can be inside the reorder buffer of the processor, since a branch cannot proceed if no free checkpoint is available when it arrives at the rename stage.
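The checkpointing scheme described above can be illustrated with a small behavioral model. This is a minimal sketch, not the circuit from the paper: the class name `CheckpointedRenameTable`, the identity mapping at reset, and the use of monotonically increasing `branch_id` values to identify younger branches are all assumptions made for illustration.

```python
class CheckpointedRenameTable:
    """Toy model of a rename table with a bounded pool of shadow copies."""

    def __init__(self, num_arch_regs, max_checkpoints):
        # Assumption: architectural register i initially maps to physical register i.
        self.table = list(range(num_arch_regs))
        self.max_checkpoints = max_checkpoints
        self.checkpoints = {}  # branch_id -> shadow copy of the table

    def rename_dest(self, arch_reg, phys_reg):
        """Record a new speculative mapping for a destination register."""
        self.table[arch_reg] = phys_reg

    def rename_branch(self, branch_id):
        """Take a checkpoint for a branch; return False if it must stall."""
        if len(self.checkpoints) >= self.max_checkpoints:
            return False  # no free checkpoint: the branch cannot proceed
        self.checkpoints[branch_id] = list(self.table)
        return True

    def resolve_correct(self, branch_id):
        """Free the checkpoint of a correctly predicted branch."""
        self.checkpoints.pop(branch_id)

    def recover(self, branch_id):
        """Restore the table from the mispredicted branch's checkpoint."""
        self.table = self.checkpoints.pop(branch_id)
        # Assumption: larger ids are younger branches, which are squashed.
        self.checkpoints = {
            b: c for b, c in self.checkpoints.items() if b < branch_id
        }
```

In this model, recovery is a single copy regardless of how many instructions were renamed after the branch, while `rename_branch` returning `False` corresponds to the stall that occurs when all checkpoints are in use.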

It was previously shown that walking backwards on the reorder buffer outperforms the other schemes (except for the checkpointing scheme, which has its own circuit-level difficulties) in terms of instructions per cycle for the SPEC 2000 benchmarks [2].

4 Proposed Rename Table Structure<br />

During processor design, it is desirable to keep the number of cycles required to recover the rename table to a minimum while also minimizing circuit-level complexity. In other words, a processor designer would want the performance of checkpointing with a very simple circuit. For this purpose, we propose a new rename table structure that is capable of storing the history of physical register assignments for each architectural register.

Fig. 2 shows the proposed rename table structure. A separate circular FIFO queue is used for each architectural register to store the physical register assignments. Whenever a new instruction that targets an architectural register as its destination arrives at the rename stage and allocates a free physical register, the tag of the allocated register is inserted into the corresponding FIFO queue using the corresponding tail pointer. The tail pointer always points to the most recent instance of the corresponding architectural register. Subsequent dependent instructions always read their source locations from the registers pointed to by the tail pointers.

Fig. 2. Proposed Rename Table Structure
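The insertion and lookup behavior of the per-register queues can be sketched as follows. This is a minimal model under stated assumptions: the class name `FifoRenameTable`, the queue depth parameter, the identity mapping at reset, and the choice to stall when a queue is full are not taken from the text. Freeing old entries and misprediction recovery are deliberately not modeled.

```python
class FifoRenameTable:
    """Toy model: one circular FIFO of physical register tags per
    architectural register, indexed by a tail pointer."""

    def __init__(self, num_arch_regs, depth):
        self.depth = depth
        # entries[r] is the circular buffer for architectural register r.
        self.entries = [[None] * depth for _ in range(num_arch_regs)]
        for r in range(num_arch_regs):
            self.entries[r][0] = r  # assumption: identity mapping at reset
        self.tail = [0] * num_arch_regs   # index of the most recent mapping
        self.count = [1] * num_arch_regs  # valid entries in each queue

    def rename_dest(self, arch_reg, phys_tag):
        """Append a newly allocated tag; return False if the queue is full."""
        if self.count[arch_reg] == self.depth:
            return False  # assumption: rename stalls on a full queue
        self.tail[arch_reg] = (self.tail[arch_reg] + 1) % self.depth
        self.entries[arch_reg][self.tail[arch_reg]] = phys_tag
        self.count[arch_reg] += 1
        return True

    def lookup_source(self, arch_reg):
        """Source operands always read the entry under the tail pointer."""
        return self.entries[arch_reg][self.tail[arch_reg]]

    def history(self, arch_reg):
        """Return the stored assignments, oldest first."""
        n, t = self.count[arch_reg], self.tail[arch_reg]
        return [
            self.entries[arch_reg][(t - n + 1 + i) % self.depth]
            for i in range(n)
        ]
```

For example, after `rename_dest(2, 40)` and `rename_dest(2, 41)` on a fresh table, `lookup_source(2)` returns 41, while `history(2)` still holds the earlier assignments 2 and 40 behind the tail.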
