13.07.2015 Views

Cortex-A8 R2P2.pdf - ARM Information Center


NEON and VFP Programmer’s Model

13.3 Short vectors

The VFPv3 architecture supports execution of short vector instructions of up to eight operations on single-precision data and up to four operations on double-precision data.

The register file is especially suited for short vector operations. The four single-precision and eight double-precision register banks function as four hardware circular queues.

13.3.1 About register banks

As Figure 13-2 on page 13-7 shows, the register file is divided into four banks with eight registers in each bank for single-precision instructions, and eight banks with four registers per bank for double-precision instructions. CDP instructions access the banks in a circular manner. Load and store multiple instructions do not access the registers in a circular manner but treat the register file as a linearly ordered structure.

The VFPv3 architecture adds 16 double-precision registers, making use of the additional register addressing bits currently used to specify single-precision registers. The first 16 registers, D0 through D15, in the NEON register file provide the same functionality as the register file defined in the VFPv2 architecture. VFPv3 adds 16 new double-precision registers, D16 through D31, which provide a second set of 16 double-precision registers. These registers behave in vector mode in an identical manner to the lower 16 registers, with the banks specified as follows:

bank 4    registers D16-D19
bank 5    registers D20-D23
bank 6    registers D24-D27
bank 7    registers D28-D31

Bank 4 of the second set of registers has the same characteristics when used in short vector instructions as bank 0 of the first set of registers.

Short vector operations on double-precision data support vector lengths of two through four iterations.
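The bank layout and circular access described above can be illustrated with a little index arithmetic. The following C fragment is only a sketch: the names `bank_of` and `vector_reg` are hypothetical helpers invented for this illustration, not part of any ARM API, and the model covers only the wrap-around of register numbers within a bank. It does not model any special handling of particular banks or any other architectural rule.

```c
#include <assert.h>

/* Registers per bank, as given in the text:
   8 per bank for single-precision, 4 per bank for double-precision. */
enum { SP_REGS_PER_BANK = 8, DP_REGS_PER_BANK = 4 };

/* Hypothetical helper: number of the bank that contains register `reg`. */
static unsigned bank_of(unsigned reg, unsigned regs_per_bank)
{
    assert(regs_per_bank != 0);
    return reg / regs_per_bank;
}

/* Hypothetical helper: register touched by iteration `iter` of a short
   vector whose first register is `first`. The index advances by one per
   iteration and wraps to the start of the same bank (circular access). */
static unsigned vector_reg(unsigned first, unsigned iter, unsigned regs_per_bank)
{
    assert(regs_per_bank != 0);
    unsigned base = (first / regs_per_bank) * regs_per_bank; /* first register of the bank */
    return base + (first - base + iter) % regs_per_bank;
}
```

Under these assumptions, a four-iteration double-precision vector starting at D22 stays inside bank 5 (D20-D23): `vector_reg(22, i, DP_REGS_PER_BANK)` for `i` from 0 to 3 gives 22, 23, 20, 21. A load or store multiple, by contrast, would simply continue from D23 to D24.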
The additional registers provide the capability to double-buffer double-precision operations in a similar way as is available for single-precision operations.

See the ARM Architecture Reference Manual for more information on VFP addressing modes.

13-6 Copyright © 2006-2008 ARM Limited. All rights reserved. ARM DDI 0344E
