Lecture Notes in Computer Science 4917


P. Akl and A. Moshovos

Instr.  Original Code          Renamed Code           ROB entry  Architectural Destination Register  Previous RAT Map
A       Beq  R1, R4, LABEL     Beq  P1, P7, LABEL     1          none                                none
B       Sub  R2, R2, R3        Sub  P3, P2, P4        2          R2                                  P2
C       Mult R3, R2, R1        Mult P5, P3, P1        3          R3                                  P4
D       Add  R2, R2, R1        Add  P6, P3, P1        4          R2                                  P3
E       Beq  R3, R4, LABEL2    Beq  P5, P7, LABEL2    5          none                                none

Original RAT: R1 → P1, R2 → P2, R3 → P4, R4 → P7

Reorder Buffer: entries 1 (A), 2 (B), 3 (C), 4 (D), 5 (E); entry 6 is unoccupied.

Fig. 1. Given an instruction to recover at, it is not always necessary to process all subsequent instructions recorded in the ROB.

a TROB with one GC performs as well as an implementation that uses four GCs. A single-GC RAT implementation is simpler than one that uses more GCs.

The rest of this paper is organized as follows: Section 2 presents the TROB design. Section 3 reviews related work. Section 4 presents the experimental analysis of the TROB. Finally, Section 5 concludes this work. In this paper we restrict our attention to recovery from control-flow mis-speculation. However, the TROB can also be used to recover from other exceptions such as page faults. In the workloads we study, these exceptions are very infrequent. We also focus on relatively large processors with up to 512-entry instruction windows. However, we do demonstrate that the TROB is beneficial even for smaller-window processors that are more representative of today's processor designs. We note, however, that as architects revisit the design of large-window processors, simplifying the underlying structures, and as multithreading becomes commonplace, it is likely that future processors will use larger windows.

2 Turbo-ROB Recovery

For clarity, we initially restrict our attention to using the TROB to complement a ROB recovery mechanism. In Sections 2.4 and 2.5 we discuss how the TROB can be used without a ROB or with GCs, respectively. The motivation for the Turbo-ROB is that not all instructions inserted in the ROB are needed for every recovery. We motivate the Turbo-ROB design by first reviewing how ROB recovery works. The ROB maintains a log of all changes in program order. Existing ROB designs allocate one entry per instruction in the window. Each ROB entry contains sufficient information to reverse the effects of the corresponding instruction. For the RAT, it is sufficient to keep the architectural register name and the previous physical register it mapped to. On a mis-speculation, all ROB entries for the wrong-path instructions are traversed in reverse program order. While the ROB design allows recovery at any instruction, given a specific instruction to recover at, not all ROB entries need to be traversed. Specifically, for every RAT entry, only the first corresponding ROB entry after the mispredicted branch is needed. In Fig. 1, for example, recovering at branch A requires the entries of B (for R2) and C (for R3) but not that of D, whose previous mapping P3 was itself produced on the wrong path.
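The observation above can be illustrated with a small software model. The following is a minimal sketch, not the hardware mechanism of the paper: the names `RobEntry`, `recover_full_walk`, and `recover_first_writers` are hypothetical, and the data is taken from the example of Fig. 1. It compares a conventional reverse walk over all wrong-path ROB entries with a recovery that uses only the first entry that overwrote each RAT entry.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class RobEntry(NamedTuple):
    """Hypothetical model of the RAT-related fields of one ROB entry."""
    label: str
    arch_dest: Optional[str]  # architectural destination register, None if the instruction writes none
    prev_phys: Optional[str]  # physical register the RAT held for arch_dest before this instruction


# ROB log of Fig. 1 in program order (branches A and E write no register).
ROB: List[RobEntry] = [
    RobEntry("A", None, None),
    RobEntry("B", "R2", "P2"),
    RobEntry("C", "R3", "P4"),
    RobEntry("D", "R2", "P3"),
    RobEntry("E", None, None),
]

# RAT contents after A..E have been renamed (R2 -> P6 from D, R3 -> P5 from C).
RAT_AFTER_E: Dict[str, str] = {"R1": "P1", "R2": "P6", "R3": "P5", "R4": "P7"}


def recover_full_walk(rat: Dict[str, str], rob: List[RobEntry],
                      branch_index: int) -> Tuple[Dict[str, str], int]:
    """Undo every wrong-path entry in reverse program order.

    Returns the restored RAT and the number of ROB entries visited.
    """
    rat = dict(rat)
    visited = 0
    for entry in reversed(rob[branch_index + 1:]):
        visited += 1
        if entry.arch_dest is not None:
            rat[entry.arch_dest] = entry.prev_phys
    return rat, visited


def recover_first_writers(rat: Dict[str, str], rob: List[RobEntry],
                          branch_index: int) -> Tuple[Dict[str, str], int]:
    """Restore each RAT entry from the first wrong-path entry that overwrote it.

    Returns the restored RAT and the number of ROB entries actually needed.
    """
    rat = dict(rat)
    restored = set()
    needed = 0
    for entry in rob[branch_index + 1:]:
        if entry.arch_dest is not None and entry.arch_dest not in restored:
            rat[entry.arch_dest] = entry.prev_phys
            restored.add(entry.arch_dest)
            needed += 1
    return rat, needed


# Recover at the mispredicted branch A (index 0 in the log).
full_rat, full_steps = recover_full_walk(RAT_AFTER_E, ROB, 0)
fast_rat, fast_steps = recover_first_writers(RAT_AFTER_E, ROB, 0)
print(full_rat == fast_rat, full_steps, fast_steps)  # True 4 2
```

Both functions restore the original RAT of Fig. 1 (R1 → P1, R2 → P2, R3 → P4, R4 → P7), but only the entries of B and C contribute to the result; the entry of D restores a mapping that B's entry immediately supersedes, and E has no destination.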

