13.07.2015 Views

Intel® 64 and IA-32 Architectures Optimization Reference Manual



OPTIMIZING CACHE USAGE

9.5.5 CLFLUSH Instruction

The CLFLUSH instruction invalidates the cache line associated with the linear address that contains the byte address of the memory location, in all levels of the processor cache hierarchy (data and instruction). This invalidation is broadcast throughout the coherence domain. If, at any level of the cache hierarchy, a line is inconsistent with memory (dirty), it is written to memory before invalidation.

Other characteristics include:

• The data size affected is the cache coherency size, which is 64 bytes on the Pentium 4 processor.
• The memory attribute of the page containing the affected line has no effect on the behavior of this instruction.
• The CLFLUSH instruction can be used at all privilege levels and is subject to all permission checking and faults associated with a byte load.

CLFLUSH is an unordered operation with respect to other memory traffic, including other CLFLUSH instructions. Software should use a memory fence for cases where ordering is a concern.

As an example, consider a video usage model where a video capture device is using non-coherent AGP accesses to write a capture stream directly to system memory. Since these non-coherent writes are not broadcast on the processor bus, they will not flush copies of the same locations that reside in the processor caches. As a result, before the processor re-reads the capture buffer, it should use CLFLUSH to ensure that stale copies of the capture buffer are flushed from the processor caches. Due to speculative reads that may be generated by the processor, it is important to observe appropriate fencing (using MFENCE).

Example 9-1 provides pseudo-code for CLFLUSH usage.

Example 9-1. Pseudo-code Using CLFLUSH

    while (!buffer_ready) {}
    mfence
    for(i=0;i
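The flush-before-reread sequence described in this section can be sketched in C with the SSE2 intrinsics `_mm_clflush` and `_mm_mfence` from `<emmintrin.h>`. This is an illustrative sketch under stated assumptions, not the manual's own listing: the helper names `lines_spanned` and `flush_buffer` are hypothetical, the 64-byte line size is the figure quoted above for the Pentium 4 processor (production code would query CPUID instead), and the trailing fence is a conservative choice added here so that later loads are ordered after the flushes.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* _mm_clflush, _mm_mfence */
#define HAVE_CLFLUSH 1
#else
#define HAVE_CLFLUSH 0   /* non-x86 build: the flush below compiles to a no-op */
#endif

/* Assumption: 64-byte coherency granule, as stated in the text. */
enum { CACHE_LINE_SIZE = 64 };

/* Number of cache lines overlapped by the byte range [buf, buf + len). */
static size_t lines_spanned(const void *buf, size_t len)
{
    if (len == 0)
        return 0;
    uintptr_t mask  = ~(uintptr_t)(CACHE_LINE_SIZE - 1);
    uintptr_t first = (uintptr_t)buf & mask;
    uintptr_t last  = ((uintptr_t)buf + len - 1) & mask;
    return (size_t)((last - first) / CACHE_LINE_SIZE) + 1;
}

/*
 * Flush every cache line overlapping the buffer so that a later read
 * fetches from memory rather than from a stale cached copy.
 * Returns the number of lines covered by the range.
 */
static size_t flush_buffer(const void *buf, size_t len)
{
    size_t n = lines_spanned(buf, len);
#if HAVE_CLFLUSH
    /* Start from the line-aligned address at or below buf. */
    const char *p =
        (const char *)((uintptr_t)buf & ~(uintptr_t)(CACHE_LINE_SIZE - 1));

    _mm_mfence();                       /* order earlier accesses before the flushes */
    for (size_t i = 0; i < n; i++)
        _mm_clflush(p + i * CACHE_LINE_SIZE);
    _mm_mfence();                       /* order the flushes before later loads */
#endif
    return n;
}
```

A caller would wait until the device signals that the capture buffer is complete, call `flush_buffer(buffer, size)`, and only then read the buffer. Stepping by whole cache lines from the aligned base address avoids both skipping a partially covered final line and flushing the same line twice.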
