01.12.2012 Views

Architecture of Computing Systems (Lecture Notes in Computer ...


Complexity-Effective Rename Table Design for Rapid Speculation Recovery

Fig. 3. Amount of register usage in the SPEC2000 benchmark suite; the light area represents maximum usage and the dark area represents average usage.

Conditions for vacating entries inside the FIFO queues vary according to how register renaming is implemented in the processor. In the P6 architecture, the reorder buffer entries also serve as physical registers, and a separate architectural register file holds the architectural state. The head pointer is therefore updated when an instruction that targets the corresponding architectural register commits. In architectures that use a unified register file holding both the physical and architectural state, such as Intel's Pentium 4 [3], the Alpha 21264 [5], and the MIPS R10000 [7], a physical register is released only when an instruction that renames the same architectural register commits. In the proposed rename structure, the head pointer of the corresponding architectural register is updated (incremented by one unless the pointer is at the end of the buffer) whenever a physical register that holds an instance of the architectural register is released.
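The head-pointer rule for the proposed structure can be illustrated with a minimal sketch of one per-register circular FIFO. The class name `RenameFifo`, its methods, and the wrap-around to index 0 at the end of the buffer are assumptions made for illustration (the wrap-around is inferred from the buffer being circular); this is not the authors' implementation.

```python
class RenameFifo:
    """Hypothetical circular FIFO of physical-register tags for ONE architectural register."""

    def __init__(self, size):
        self.entries = [None] * size  # physical-register tags
        self.head = 0                 # index of the oldest live instance
        self.tail = 0                 # index of the next free slot
        self.count = 0                # number of live instances

    def is_full(self):
        return self.count == len(self.entries)

    def allocate(self, phys_reg):
        """Record a newly renamed instance; return False if no entry is free."""
        if self.is_full():
            return False
        self.entries[self.tail] = phys_reg
        self.tail = (self.tail + 1) % len(self.entries)
        self.count += 1
        return True

    def release(self):
        """Release the oldest instance and advance the head pointer.

        The head is incremented by one, except at the end of the buffer,
        where it is assumed to wrap back to index 0.
        """
        if self.count == 0:
            raise IndexError("no live instance to release")
        phys_reg = self.entries[self.head]
        self.entries[self.head] = None
        self.head = (self.head + 1) % len(self.entries)
        self.count -= 1
        return phys_reg


# Usage: three instances of one architectural register, released in order.
fifo = RenameFifo(size=3)
for tag in (10, 11, 12):
    fifo.allocate(tag)
released = [fifo.release(), fifo.release(), fifo.release()]
```

After the third release the head pointer has moved past the last slot and returned to index 0, which is the "end of the buffer" case mentioned above.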

Since each architectural register has its own circular FIFO buffer in the proposed structure, the number of entries in each FIFO queue may vary. The number of entries for each architectural register can be determined by observing the common behavior of programs and compilers. If an instruction targets an architectural register whose FIFO queue has no available entries, the pipeline stalls and the frontend waits until an entry becomes available. To minimize the performance degradation caused by the proposed rename structure, the FIFO queues must be sized so that the pipeline does not stall frequently.
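The stall condition can be sketched as follows, assuming a hypothetical `RenameTable` that keeps one bounded queue per architectural register and counts how often the frontend would have to wait. The names, the queue sizes, and the stall counter are illustrative assumptions, not values or interfaces from the paper.

```python
from collections import deque


class RenameTable:
    """Hypothetical rename table with one bounded FIFO per architectural register."""

    def __init__(self, queue_sizes):
        # queue_sizes maps an architectural register name to its FIFO capacity.
        self.capacity = dict(queue_sizes)
        self.queues = {reg: deque() for reg in queue_sizes}
        self.stalls = 0  # number of rename attempts that had to wait

    def try_rename(self, arch_reg, phys_reg):
        """Return True if the instruction was renamed, False if the frontend must stall."""
        queue = self.queues[arch_reg]
        if len(queue) >= self.capacity[arch_reg]:
            self.stalls += 1
            return False
        queue.append(phys_reg)
        return True

    def release_oldest(self, arch_reg):
        """Free the oldest instance of arch_reg, making one entry available again."""
        return self.queues[arch_reg].popleft()


# Usage with made-up capacities: rax gets two entries, rbx one.
table = RenameTable({"rax": 2, "rbx": 1})
first = table.try_rename("rax", 40)
second = table.try_rename("rax", 41)
third = table.try_rename("rax", 42)   # queue for rax is full: stall
table.release_oldest("rax")           # an older instance is released
retry = table.try_rename("rax", 42)   # the stalled instruction can proceed
```

A queue that is too small for a heavily used register shows up here as a growing `stalls` count, which is the cost the sizing is meant to keep low.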

Fig. 3 shows the average and maximum number of renamed instances of each general-purpose architectural register in the simulated x86 architecture. The results reveal that some registers are used more heavily than others. For example, register rax has the most concurrent instances on average, which indicates that rax needs a larger FIFO queue.

The number of entries in each FIFO queue can be at most the number of physical registers minus the number of architectural registers. This is because each architectural register must have a physical location, and the same physical register cannot be assigned to two different architectural registers at the same time.
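As a small worked example of this bound, the sketch below clamps per-register queue sizes to the stated limit. The function names, the register counts, and the observed usage numbers are all hypothetical and are not taken from Fig. 3. Sizing each queue to its observed maximum is only one possible policy, assumed here for illustration.

```python
def max_fifo_entries(num_physical, num_architectural):
    """Upper bound on entries in any one FIFO queue, as stated in the text."""
    if num_physical < num_architectural:
        raise ValueError("need at least one physical register per architectural register")
    return num_physical - num_architectural


def size_queues(observed_max, num_physical):
    """Clamp each register's observed maximum concurrent instances to the bound.

    observed_max maps an architectural register name to a measured maximum
    (hypothetical numbers in the usage below).
    """
    limit = max_fifo_entries(num_physical, len(observed_max))
    return {reg: min(count, limit) for reg, count in observed_max.items()}


# Usage with made-up values: 4 architectural and 32 physical registers.
sizes = size_queues({"rax": 40, "rbx": 12, "rcx": 20, "rdx": 9}, num_physical=32)
```

With these made-up numbers the bound is 32 minus 4, so the request for rax is clamped to 28 entries while the other queues keep their observed sizes.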
