14.06.2015 Views

Power ISA™ Version 2.03 - Power.org


<strong>Version</strong> <strong>2.03</strong><br />

tor unit, meaning that it will stall vector instruction execution<br />

until all preceding vector instructions are<br />

complete and have updated the architectural machine<br />

state. This is permitted in order to simplify implementation<br />

of the sticky status bit (SAT) which would otherwise<br />

be difficult to implement in an out-of-order execution<br />

machine. The implication of this is that reading the<br />

VSCR can be much slower than typical Vector instructions,<br />

and therefore care must be taken in reading it, as<br />

advised in Section 5.5.1, to avoid performance problems.<br />

The mtvscr instruction is context synchronizing. This implies that<br />

all Vector instructions logically preceding an mtvscr in<br />

the program flow will execute in the architectural context<br />

(NJ mode) that existed prior to completion of the<br />

mtvscr, and that all instructions logically following the<br />

mtvscr will execute in the new context (NJ mode)<br />

established by the mtvscr.<br />

5.3.3 VR Save Register<br />

The VR Save Register (VRSAVE) is a 32-bit register<br />

provided for application and operating system use.<br />

Figure 59. VR Save Register (VRSAVE, bits 32:63)<br />

Programming Note<br />

The VRSAVE register can be used to indicate<br />

which VRs are currently being used by a program.<br />

If this is done, the operating system could save<br />

only those VRs when an “interrupt” occurs (see<br />

Book III), and could restore only those VRs when<br />

resuming the interrupted program.<br />

If this approach is taken, it must be applied rigorously;<br />

if a program fails to indicate that a given VR<br />

is in use, software errors may occur that will be difficult<br />

to detect and correct because they are timing-dependent.<br />

Some operating systems save and restore<br />

VRSAVE only for programs that also use other vector<br />

registers.<br />
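
The bookkeeping described in this note can be sketched in C as a 32-bit mask with one bit per Vector Register. The bit assignment used here (VRn is tracked by bit n counted from the most significant bit) is an assumption made for illustration, since the text above does not define one. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: VRn is represented by bit n of the 32-bit VRSAVE value,
 * counted from the most significant bit. The text above does not
 * specify the mapping; this is only an illustrative convention. */
uint32_t vrsave_mark(uint32_t vrsave, unsigned vr)
{
    assert(vr < 32);
    return vrsave | (UINT32_C(0x80000000) >> vr);
}

/* Nonzero if the mask records VRn as being in use. */
int vrsave_in_use(uint32_t vrsave, unsigned vr)
{
    assert(vr < 32);
    return (vrsave & (UINT32_C(0x80000000) >> vr)) != 0;
}

/* Number of VRs a (hypothetical) operating system would have to
 * save and restore for a program with this mask. */
unsigned vrsave_count(uint32_t vrsave)
{
    unsigned count = 0;
    for (unsigned vr = 0; vr < 32; vr++) {
        count += (unsigned)vrsave_in_use(vrsave, vr);
    }
    return count;
}
```

A program that uses VR2 and VR20 would record both in the mask before touching them. If either bit were omitted, an operating system that trusts the mask would skip that register on a context switch, which is the timing-dependent failure the note warns about.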

5.4 Vector Storage Access Operations<br />

The Vector Storage Access instructions provide the<br />

means by which data can be copied from storage to a<br />

Vector Register or from a Vector Register to storage.<br />

Instructions are provided that access byte, halfword,<br />

word, and quadword storage operands. These instructions<br />

differ from the fixed-point and floating-point Storage<br />

Access instructions in that vector storage operands<br />

are assumed to be aligned, and vector storage<br />

accesses are performed as if the appropriate number<br />

of low-order bits of the specified effective address (EA)<br />

were zero. For example, the low-order bit of EA is<br />

ignored for halfword Vector Storage Access instructions,<br />

and the low-order four bits of EA are ignored for<br />

quadword Vector Storage Access instructions. The<br />

effect is to load or store the storage operand of the<br />

specified length that contains the byte addressed by<br />

EA.<br />
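
The address truncation described above can be sketched as a small C helper. The function name and the use of a 64-bit integer for EA are assumptions made for illustration. The helper models only the address calculation, not the access itself.

```c
#include <assert.h>
#include <stdint.h>

/* Address actually accessed by a vector storage access whose operand
 * is `size` bytes long (1, 2, 4, or 16): the low-order log2(size)
 * bits of the effective address are treated as zero. */
uint64_t vector_access_address(uint64_t ea, uint64_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0); /* power of two */
    return ea & ~(size - 1);
}
```

With this model, a halfword access to EA 0x1001 touches the halfword at 0x1000, and a quadword access to any EA from 0x1000 through 0x100F touches the quadword at 0x1000, which is the operand that contains the addressed byte.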

If a storage operand is unaligned, additional instructions<br />

must be used to ensure that the operand is correctly<br />

placed in a Vector Register or in storage.<br />

Instructions are provided that shift and merge the contents<br />

of two Vector Registers, such that an unaligned<br />

quadword storage operand can be copied between<br />

storage and the Vector Registers in a relatively efficient<br />

manner.<br />
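
The shift-and-merge idea can be emulated in scalar C as a rough sketch. This is not the instruction sequence itself. The type `vr_t`, the function names, and the modelling of storage as a byte array indexed by EA are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of a Vector Register: b[0] is byte element 0 (high-order). */
typedef struct { uint8_t b[16]; } vr_t;

/* Aligned quadword load: the low-order four bits of ea are ignored. */
vr_t load_quadword(const uint8_t *storage, size_t ea)
{
    vr_t v;
    memcpy(v.b, storage + (ea & ~(size_t)15), 16);
    return v;
}

/* Shift and merge: return the 16 bytes that start at byte `sh`
 * of the 32-byte concatenation hi || lo. */
vr_t shift_merge(vr_t hi, vr_t lo, unsigned sh)
{
    uint8_t cat[32];
    vr_t out;
    assert(sh < 16);
    memcpy(cat, hi.b, 16);
    memcpy(cat + 16, lo.b, 16);
    memcpy(out.b, cat + sh, 16);
    return out;
}

/* Unaligned quadword load built from two aligned loads and a merge.
 * Using ea + 15 for the second load means an already aligned ea
 * reads the same quadword twice instead of reading past the operand. */
vr_t load_unaligned_quadword(const uint8_t *storage, size_t ea)
{
    vr_t hi = load_quadword(storage, ea);
    vr_t lo = load_quadword(storage, ea + 15);
    return shift_merge(hi, lo, (unsigned)(ea & 15));
}
```

The shift amount is the low-order four bits of EA, the same bits that the aligned loads ignore. The storage array must cover both aligned quadwords that the operand spans.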

As shown in Figure 56, the elements in Vector Registers<br />

are numbered; the high-order (or most significant)<br />

byte element is numbered 0 and the low-order (or least<br />

significant) byte element is numbered 15. The numbering<br />

affects the values that must be placed into the permute<br />

control vector for the Vector Permute instruction<br />

in order for that instruction to achieve the desired<br />

effects, as illustrated by the examples in the following<br />

subsections.<br />

A vector quadword Load instruction for which the effective<br />

address (EA) is quadword-aligned places the byte<br />

in storage addressed by EA into byte element 0 of the<br />

target Vector Register, the byte in storage addressed<br />

by EA+1 into byte element 1 of the target Vector Register,<br />

etc. Similarly, a vector quadword Store instruction<br />

for which the EA is quadword-aligned places the contents<br />

of byte element 0 of the source Vector Register<br />

into the byte in storage addressed by EA, the contents<br />

of byte element 1 of the source Vector Register into the<br />

byte in storage addressed by EA+1, etc.<br />

Figure 60 shows an aligned quadword in storage.<br />

Figure 61 shows the result of loading that quadword<br />

into a Vector Register or, equivalently, shows the contents<br />

that must be in a Vector Register if storing that<br />

Vector Register is to produce the storage contents<br />

shown in Figure 60.<br />
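
The correspondence between storage bytes and byte elements for an aligned quadword can be written out as a minimal C model. The `vector_register` type, the function names, and the byte-array view of storage are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of a Vector Register as 16 byte elements; element[0] is the
 * high-order byte element and element[15] the low-order one. */
typedef struct { uint8_t element[16]; } vector_register;

/* Quadword load with a quadword-aligned ea: the byte in storage at
 * ea + i is placed into byte element i. */
vector_register quadword_load(const uint8_t *storage, size_t ea)
{
    vector_register vr;
    assert((ea & 15) == 0);
    for (size_t i = 0; i < 16; i++) {
        vr.element[i] = storage[ea + i];
    }
    return vr;
}

/* Quadword store with a quadword-aligned ea: byte element i is
 * placed into the byte in storage at ea + i. */
void quadword_store(uint8_t *storage, size_t ea, vector_register vr)
{
    assert((ea & 15) == 0);
    for (size_t i = 0; i < 16; i++) {
        storage[ea + i] = vr.element[i];
    }
}
```

Because both directions use the same element-to-address mapping, storing a register that was just loaded reproduces the original storage contents, which is the equivalence the two figures express.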

When an aligned byte, halfword, or word storage operand<br />

is loaded into a Vector Register, the element (byte,<br />

<strong>Power</strong> ISA -- Book I
