01.12.2012 Views

Architecture of Computing Systems (Lecture Notes in Computer ...

Architecture of Computing Systems (Lecture Notes in Computer ...

Architecture of Computing Systems (Lecture Notes in Computer ...

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

MLP-Aware Instruction Queue Resiz<strong>in</strong>g: The Key to Power-Efficient Performance 119<br />

4.3 Resiz<strong>in</strong>g Policy<br />

When we start track<strong>in</strong>g a superpath <strong>in</strong> dispatch, we check whether we have stored<br />

<strong>in</strong>formation about its behavior, <strong>in</strong>clud<strong>in</strong>g its MLP-distance <strong>in</strong>formation. If there is<br />

such <strong>in</strong>formation then we have an <strong>in</strong>dication about the m<strong>in</strong>imum IQ size which will<br />

not affect the MLP <strong>of</strong> this superpath. In many cases, however, there is no MLPdistance<br />

<strong>in</strong>formation available or the MLP-distance does not restrict IQ downsiz<strong>in</strong>g.<br />

The question is how much can we downsize the IQ is such cases? As expla<strong>in</strong>ed <strong>in</strong><br />

Section 3, an efficient IQ resiz<strong>in</strong>g scheme has to f<strong>in</strong>d the IQ size that m<strong>in</strong>imizes energy<br />

consumption without hurt<strong>in</strong>g performance. This means that besides not hurt<strong>in</strong>g<br />

MLP we must also protect ILP.<br />

This gap can be filled by any <strong>of</strong> the exist<strong>in</strong>g IQ resiz<strong>in</strong>g techniques. For example,<br />

the ILP-feedback approach <strong>of</strong> [2] can provide a target IQ size that does not hurt ILP<br />

while the MLP-aware approach judges whether this size hurts MLP and if so it overrides<br />

the decision. For the rest <strong>of</strong> this paper the ILP-feedback <strong>in</strong>formation will be provided<br />

by the decision mak<strong>in</strong>g mechanism <strong>in</strong> Folegnani and González [2]. This mechanism<br />

exam<strong>in</strong>es the participation <strong>of</strong> the youngest segment <strong>of</strong> the IQ to the IPC with<strong>in</strong> a<br />

specific number <strong>of</strong> cycles. If the contribution <strong>of</strong> the youngest part is not important,<br />

namely the number <strong>of</strong> <strong>in</strong>structions that issue from this segment is below a threshold,<br />

the IQ size is decreased. In our case, we deactivate the youngest segment when it<br />

empties.<br />

The ma<strong>in</strong> idea <strong>in</strong> the Folegnani and González work is that if a segment contributes<br />

very little to ILP, deactivat<strong>in</strong>g it would not harm performance. However, this holds<br />

only for ILP —not for MLP. In other words, even if the contribution <strong>of</strong> a segment <strong>in</strong><br />

issued <strong>in</strong>structions is very small, it can still have a significant impact on performance,<br />

if any <strong>of</strong> its issued <strong>in</strong>structions is <strong>in</strong>volved <strong>in</strong> MLP. It is exactly for such cases where<br />

MLP-awareness makes all the difference. Further, <strong>in</strong> the Folegnani and González<br />

work, once the IQ is downsized, there is no way to detect whether the situation<br />

changes —all we can see is the contribution to ILP <strong>of</strong> the active portion <strong>of</strong> the IQ.<br />

Thus, periodically, the IQ is brought back to its full size, and the downsiz<strong>in</strong>g process<br />

starts aga<strong>in</strong>. In our case, the existence <strong>of</strong> MLP automatically upsizes the IQ, to a size<br />

that does not harm MLP.<br />

5 Practical Implementation<br />

5.1 IQ Segmentation<br />

To allow dynamic variation <strong>of</strong> the IQ size, we divide the IQ <strong>in</strong> a number <strong>of</strong> <strong>in</strong>dependent<br />

parts referred as segments. For example, we use an 128-entry IQ partitioned <strong>in</strong>to<br />

eight, sixteen-entry segments. Bitl<strong>in</strong>e segmentation is used to implement the resiz<strong>in</strong>g<br />

<strong>of</strong> the structure [10]. The structure <strong>of</strong> the IQ follows the one <strong>in</strong> [2]. The IQ is a circular<br />

FIFO queue, with every new <strong>in</strong>struction <strong>in</strong>serted at the tail; retir<strong>in</strong>g <strong>in</strong>structions are<br />

removed from the head. The difference <strong>in</strong> our case, is that <strong>in</strong>dividual segments can be<br />

deactivated. A segment is deactivated if <strong>in</strong>structions from the youngest segment contribute<br />

less than threshold <strong>in</strong>structions <strong>in</strong> a quantum <strong>of</strong> time (1000 cycles) and a segment<br />

is reactivated every 5 quanta. This <strong>in</strong>evitably leads to constra<strong>in</strong>ts that have to be<br />

met dur<strong>in</strong>g the resiz<strong>in</strong>g process, similarly to those faced by Ponomarev et al. [11]:

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!