U. Glaeser

FIGURE 39.23 In the packed multiply high and accumulate instruction in AltiVec, only the high-order bits of the intermediate products are used in the addition.

packed register by the fractional constant 0.011₂ (= 0.375) can be performed by using just two packed arithmetic shift right and add instructions.

Initial values: C = 0.375 = 0.011₂ and Ra = [1|2|3|4] = [0001|0010|0011|0100]₂

Instruction                                       Operation              Result
Arithmetic shift right 3 bit and add Rb,Ra,0      Rb = (Ra >> 3) + 0     Rb = [0.125|0.25|0.375|0.5]
Arithmetic shift right 2 bit and add Rb,Ra,Rb     Rb = (Ra >> 2) + Rb    Rb = [0.375|0.75|1.125|1.5]
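The two-instruction sequence in the table can be modeled in a few lines of Python. This is a minimal sketch, not the MAX-2 or IA-64 instruction semantics: the helper names (`packed_shift_right_add`, `to_fixed`, `to_real`) and the fixed-point format with 3 fractional bits are assumptions made here so that the fractional results stay exact, and subword overflow is not modeled.

```python
FRAC_BITS = 3  # assumed fixed-point format: 3 fractional bits per subword


def packed_shift_right_add(ra, shift, rb):
    """Model one packed 'arithmetic shift right and add' instruction.

    Each subword of ra is shifted right by `shift` bits (Python's >> on
    ints is an arithmetic shift) and added to the matching subword of rb.
    """
    return [(a >> shift) + b for a, b in zip(ra, rb)]


def to_fixed(values):
    """Convert real values to fixed-point subwords (hypothetical helper)."""
    return [int(v * (1 << FRAC_BITS)) for v in values]


def to_real(subwords):
    """Convert fixed-point subwords back to real values (hypothetical helper)."""
    return [s / (1 << FRAC_BITS) for s in subwords]


ra = to_fixed([1, 2, 3, 4])
zero = [0, 0, 0, 0]

# Instruction 1: Rb = (Ra >> 3) + 0, i.e. Ra * 0.125
rb = packed_shift_right_add(ra, 3, zero)
step1 = to_real(rb)

# Instruction 2: Rb = (Ra >> 2) + Rb, i.e. Ra * 0.25 + Ra * 0.125
rb = packed_shift_right_add(ra, 2, rb)
step2 = to_real(rb)

print(step1)
print(step2)
```

The constant 0.011₂ has set bits at weights 2⁻² and 2⁻³, which is why one shift by 2 and one shift by 3 are enough: 0.25 + 0.125 = 0.375.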

Only two single-cycle instructions are required to perform the multiplication of four subwords by a constant in this example. This is equivalent to an effective rate of two multiplications per cycle. Without subword parallelism, the same operations would take at least four integer multiply instructions. Furthermore, the packed shift and add instructions use a simple ALU with a small preshifter, whereas the integer multiply instructions need a more complex multiplier functional unit. In addition, each multiplication operation takes at least three cycles of latency, compared to one cycle of latency for a preshift and add operation. Hence, for this example, multiplying four subwords by a constant is six times faster (4 × 3/2 = 6) when comparing an implementation with one subword multiplier against one with a partitionable ALU with preshifter.
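The speedup figure can be checked with a short calculation. This sketch uses only the numbers quoted in the text; the variable names are illustrative, and it assumes the four multiplies execute back to back on a single multiplier with no overlap.

```python
subwords = 4                # values to multiply by the constant
multiply_latency = 3        # cycles per integer multiply (lower bound from the text)
shift_add_instructions = 2  # packed shift right and add instructions needed
shift_add_latency = 1       # cycles per packed shift and add

# One multiplier, one multiply per subword, executed sequentially (assumption).
cycles_multiplier = subwords * multiply_latency

# Each packed instruction handles all four subwords at once.
cycles_shift_add = shift_add_instructions * shift_add_latency

speedup = cycles_multiplier / cycles_shift_add
multiplications_per_cycle = subwords / cycles_shift_add

print(cycles_multiplier, cycles_shift_add, speedup, multiplications_per_cycle)
```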

MAX-2 in PA-RISC and IA-64 are the only multimedia ISAs surveyed that have these efficient packed shift left and add instructions and packed shift right and add instructions. The preshift

© 2002 by CRC Press LLC
