13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual

OPTIMIZING CACHE USAGE

A characteristic feature of both single-pass and multi-pass execution is that a specific trade-off exists depending on an algorithm’s implementation and use of a single-pass or multiple-pass execution. See Figure 9-9.

Multi-pass execution is often easier to use when implementing a general purpose API, where the choice of code paths that can be taken depends on the specific combination of features selected by the application (for example, for 3D graphics, this might include the type of vertex primitives used and the number and type of light sources).

With such a broad range of permutations possible, a single-pass approach would be complicated, in terms of code size and validation. In such cases, each possible permutation would require a separate code sequence. For example, an object with features A, B, C, D can have a subset of features enabled, say, A, B, D. This stage would use one code path; another combination of enabled features would have a different code path. It makes more sense to perform each pipeline stage as a separate pass, with conditional clauses to select different features that are implemented within each stage. By using strip-mining, the number of vertices processed by each stage (for example, the batch size) can be selected to ensure that the batch stays within the processor caches through all passes. An intermediate cached buffer is used to pass the batch of vertices from one stage or pass to the next one.

Single-pass execution can be better suited to applications which limit the number of features that may be used at a given time. A single-pass approach can reduce the amount of data copying that can occur with a multi-pass engine. See Figure 9-9.
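The contrast between the two approaches can be illustrated with a minimal C sketch. It is not code from the manual: the `Vertex` type, the `FEAT_*` flags, the two toy stages, and the `BATCH` value of 256 are all hypothetical and chosen only for illustration. The multi-pass routine strip-mines the vertex array into batches, runs each stage as a separate pass with a conditional clause per feature, and hands the batch from one stage to the next through a small intermediate buffer. The single-pass routine hard-codes one feature combination in a single fused loop and needs no intermediate buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature flags and vertex type; not taken from the manual. */
enum { FEAT_TRANSFORM = 1 << 0, FEAT_LIGHT = 1 << 1 };
typedef struct { float x, y, z; } Vertex;

/* Assumed batch size; in practice it would be tuned so that the batch and
   the intermediate buffer stay within the processor caches. */
#define BATCH 256

static float clamp01(float v) { return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v); }

/* Stage 1: optional uniform scale, otherwise a plain copy. */
static void stage_transform(const Vertex *in, Vertex *out, size_t n,
                            unsigned feats, float scale)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = in[i];
        if (feats & FEAT_TRANSFORM) {
            out[i].x *= scale;
            out[i].y *= scale;
            out[i].z *= scale;
        }
    }
}

/* Stage 2: toy "lighting" that maps z to an intensity in [0, 1];
   with the feature disabled every vertex gets full intensity. */
static void stage_light(const Vertex *in, float *intensity, size_t n,
                        unsigned feats)
{
    for (size_t i = 0; i < n; i++)
        intensity[i] = (feats & FEAT_LIGHT) ? clamp01(in[i].z) : 1.0f;
}

/* Multi-pass, strip-mined: each batch passes through every stage while it
   is still cache-resident, via a reused intermediate buffer. */
void multi_pass(const Vertex *verts, float *intensity, size_t count,
                unsigned feats, float scale)
{
    Vertex batch_buf[BATCH]; /* intermediate buffer between stages */
    assert(count == 0 || (verts != NULL && intensity != NULL));
    for (size_t start = 0; start < count; start += BATCH) {
        size_t n = (count - start < BATCH) ? count - start : BATCH;
        stage_transform(verts + start, batch_buf, n, feats, scale);
        stage_light(batch_buf, intensity + start, n, feats);
    }
}

/* Single-pass: one fused loop for the fixed combination
   FEAT_TRANSFORM | FEAT_LIGHT, with no intermediate copy. */
void single_pass_transform_light(const Vertex *verts, float *intensity,
                                 size_t count, float scale)
{
    for (size_t i = 0; i < count; i++)
        intensity[i] = clamp01(verts[i].z * scale);
}
```

In this sketch, supporting another feature combination in the multi-pass version costs one more conditional inside a stage, whereas the single-pass version would need a separate fused loop for each combination it supports.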
