01.12.2012 Views

Architecture of Computing Systems (Lecture Notes in Computer ...

Architecture of Computing Systems (Lecture Notes in Computer ...

Architecture of Computing Systems (Lecture Notes in Computer ...

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

16 G. Aşılıoğlu, E.M. Kaya, and O. Erg<strong>in</strong><br />

rename table, walk<strong>in</strong>g forward and backwards on the reorder buffer or wait<strong>in</strong>g until<br />

the commit time <strong>of</strong> the mispredicted branch <strong>in</strong>struction [2][9]. Each <strong>of</strong> these schemes<br />

have various pros and cons <strong>in</strong> terms <strong>of</strong> circuit complexity and misprediction penalty.<br />

In this paper we propose a new implementation <strong>of</strong> the rename table that allows fast<br />

recovery on a branch misprediction. We propose to use a FIFO structure for each<br />

architectural register and recover the tail po<strong>in</strong>ters when a branch misprediction occurs.<br />

Our design is less complex than tak<strong>in</strong>g a full checkpo<strong>in</strong>t <strong>of</strong> the rename table, and<br />

has smaller restoration latency when compared to the schemes that walk through the<br />

reorder buffer.<br />

2 Register Renam<strong>in</strong>g<br />

Register renam<strong>in</strong>g requires a new physical register to be assigned to each and every<br />

result produc<strong>in</strong>g <strong>in</strong>struction. Fig. 1 shows an example <strong>of</strong> how the register renam<strong>in</strong>g is<br />

performed. Note that R1 is overwritten by the second ADD <strong>in</strong>struction which is <strong>in</strong><br />

fact not dependent on the first ADD. This false dependency is removed by assign<strong>in</strong>g<br />

different physical registers to two ADD <strong>in</strong>structions.<br />

ADD R1, R2, R3<br />

SUB R2, R1, R2<br />

ADD R1, R4, R5<br />

ADD P13, P22, P31<br />

SUB P52, P13, P22<br />

ADD P95, P63, P44<br />

Rename Table<br />

R1 P95<br />

R2 P52<br />

R3 P21<br />

R4 P63<br />

R5 R44<br />

R6 P6<br />

R7 P7<br />

R8 P8<br />

Fig. 1. Example <strong>of</strong> Register Renam<strong>in</strong>g<br />

The second ADD <strong>in</strong>struction <strong>in</strong> Fig. 1 removes its write after write (WAW) and<br />

write after read (WAR) dependencies by writ<strong>in</strong>g its result to a separate physical register.<br />

However, this is not enough to ma<strong>in</strong>ta<strong>in</strong> precise execution; any younger <strong>in</strong>struction<br />

that needs to access R1 has to be aware <strong>of</strong> the fact that the value it needs to read<br />

resides <strong>in</strong> P95. The rename table is used for this purpose; the table <strong>in</strong>cludes an entry<br />

for each architectural register that holds the location <strong>of</strong> the most recent <strong>in</strong>stance <strong>of</strong> the<br />

register. Consequently, as the rename table holds the precise state <strong>of</strong> the processor,<br />

any fault on this table will result <strong>in</strong> a system crash.<br />

Ideally the rename table idea works perfectly. Each <strong>in</strong>struction that produces a result<br />

checks the availability <strong>of</strong> a physical register and allocates a register. Afterwards,<br />

the result produc<strong>in</strong>g <strong>in</strong>struction updates the rename table to <strong>in</strong>form the consumers that<br />

will later check this table to obta<strong>in</strong> the location <strong>of</strong> their architectural source registers.<br />

In superscalar mach<strong>in</strong>es, multiple <strong>in</strong>structions need to be renamed each cycle, which

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!