13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual

OPTIMIZING FOR SIMD FLOATING-POINT APPLICATIONS

[Figure 6-1. Homogeneous Operation on Parallel Data Elements: the same operation OP is applied to each pair of elements of X3 X2 X1 X0 and Y3 Y2 Y1 Y0, producing X3 OP Y3, X2 OP Y2, X1 OP Y1, X0 OP Y0.]

AoS data structures are often used in 3D geometry computations. SIMD technology can be applied to an AoS data structure using a horizontal computation model. This means that the X, Y, Z, and W components of a single vertex structure (that is, of a single vector simultaneously referred to as an XYZ data representation, see Figure 6-2) are computed in parallel, and the array is updated one vertex at a time.

When data structures are organized for the horizontal computation model, sometimes the availability of homogeneous arithmetic operations in SSE/SSE2 may cause inefficiency or require additional intermediate movement between data elements.

[Figure 6-2. Horizontal Computation Model: X Y Z W]

Alternatively, the data structure can be organized in the SoA format. The SoA data structure enables a vertical computation technique, and is recommended over horizontal computation for many applications, for the following reasons:

• When computing on a single vector (XYZ), it is common to use only a subset of the vector components; for example, in 3D graphics the W component is sometimes ignored. This means that for single-vector operations, 1 of 4 computation slots is not being utilized. This typically results in a 25% reduction of peak efficiency.
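The difference between the two layouts can be sketched in C as follows. This is a minimal illustration under stated assumptions, not code from the manual: the type names `VertexAoS` and `VerticesSoA`, the functions `dot_aos` and `dot_soa`, the capacity `NUM_VERTS`, and the choice of a dot product against a fixed direction `(lx, ly, lz)` are all hypothetical. The SoA version uses SSE intrinsics when the compiler defines `__SSE__` and falls back to a scalar loop otherwise.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>   /* SSE intrinsics: _mm_loadu_ps, _mm_mul_ps, ... */
#endif

/* AoS: one structure per vertex; X, Y, Z, W are adjacent in memory. */
typedef struct { float x, y, z, w; } VertexAoS;

/* SoA: one array per component (NUM_VERTS is an illustrative capacity). */
enum { NUM_VERTS = 8 };
typedef struct {
    float x[NUM_VERTS];
    float y[NUM_VERTS];
    float z[NUM_VERTS];
} VerticesSoA;

/* Horizontal model: the array is processed one vertex at a time.
   W is stored and moved with the vertex but never used here. */
void dot_aos(const VertexAoS *v, size_t n,
             float lx, float ly, float lz, float *out)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = v[i].x * lx + v[i].y * ly + v[i].z * lz;
}

/* Vertical model: four vertices per step, one vertex per SIMD lane,
   so every lane does useful work. Requires n to be a multiple of 4. */
void dot_soa(const VerticesSoA *v, size_t n,
             float lx, float ly, float lz, float *out)
{
    assert(n % 4 == 0 && n <= NUM_VERTS);
#if defined(__SSE__)
    const __m128 vlx = _mm_set1_ps(lx);
    const __m128 vly = _mm_set1_ps(ly);
    const __m128 vlz = _mm_set1_ps(lz);
    for (size_t i = 0; i < n; i += 4) {
        __m128 x = _mm_loadu_ps(&v->x[i]);   /* X of vertices i..i+3 */
        __m128 y = _mm_loadu_ps(&v->y[i]);   /* Y of vertices i..i+3 */
        __m128 z = _mm_loadu_ps(&v->z[i]);   /* Z of vertices i..i+3 */
        __m128 d = _mm_add_ps(_mm_add_ps(_mm_mul_ps(x, vlx),
                                         _mm_mul_ps(y, vly)),
                              _mm_mul_ps(z, vlz));
        _mm_storeu_ps(&out[i], d);           /* four results at once */
    }
#else
    /* Scalar fallback with the same per-vertex arithmetic. */
    for (size_t i = 0; i < n; ++i)
        out[i] = v->x[i] * lx + v->y[i] * ly + v->z[i] * lz;
#endif
}
```

In the AoS loop, a register holding one vertex would carry an unused W slot and would need a cross-element sum to finish the dot product. In the SoA loop, each lane holds a different vertex, so the multiplies and adds are purely element-wise.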
