13.07.2015 Views

Cortex-A8 R2P2.pdf - ARM Information Center


Instruction Cycle Timing

Table 16-13 Memory system effects on instruction timings (continued)

Replay event: Data TLB miss
Delay: 24 cycles
Description:
1. A table walk because of a miss in the L1 TLB causes a 24-cycle delay, assuming the translation table entries are found in the L2 cache.
2. If the translation table entries are not present in the L2 cache, the number of stall cycles depends on the external system memory timing.

Replay event: Store buffer full
Delay: 8 cycles plus latency to drain fill buffer
Description:
1. A store instruction miss does not result in any stalls unless the store buffer is full.
2. In the case of a full store buffer, the delay is at least eight cycles. The delay can be more if it takes longer to drain some entries from the store buffer.

Replay event: Unaligned load or store request
Delay: 8 cycles
Description:
1. If a load instruction address is unaligned and the full access is not contained within a 128-bit boundary, there is an 8-cycle penalty.
2. If a store instruction address is unaligned and the full access is not contained within a 64-bit boundary, there is an 8-cycle penalty.

16.4.3 Thumb-2 instructions

As a general rule, Thumb-2 instructions are executed with timing constraints identical to their ARM counterparts. However, there are some second order effects to the cycle timing that you must observe. First, the code footprint is smaller, which can reduce the number of instruction cache misses and therefore reduce the cycle count. Second, branch instructions tend to be more densely packed, slightly reducing the branch prediction accuracy that is achieved and therefore increasing the number of branch mispredictions. Neither of these effects can be accurately measured using hand calculating techniques.

Note
The code footprint and densely packed branch instructions can have an impact on the performance of the processor. In most cases, these effects might cancel each other out.

16.4.4 ThumbEE instructions

The majority of the ThumbEE instruction set is identical in both encodings and behavior to the Thumb-2 instruction set and therefore the cycle timings are also identical to the Thumb-2 instruction timings. The behavior of some instructions is different when executed in ThumbEE state instead of in Thumb state. However, the behavior changes for these instructions do not result in any changes to their cycle timing. The only additional cycle timing information for ThumbEE is for the new instructions.

ARM DDI 0344E Copyright © 2006-2008 ARM Limited. All rights reserved.
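The unaligned load or store rule in Table 16-13 can be sketched in C as a rough estimator. This is an illustrative sketch only, not ARM code. The function and constant names are hypothetical, and it assumes that "not contained within a 128-bit (or 64-bit) boundary" means the access spans two naturally aligned 16-byte (or 8-byte) windows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values taken from Table 16-13; the names are hypothetical. */
enum {
    UNALIGNED_PENALTY_CYCLES = 8,
    LOAD_WINDOW_BYTES        = 16, /* 128-bit boundary applies to loads  */
    STORE_WINDOW_BYTES       = 8   /* 64-bit boundary applies to stores  */
};

/* True when addr is not a multiple of the access size.
 * Assumes size_bytes is a power of two (1, 2, 4, 8, ...). */
static bool is_unaligned(uintptr_t addr, size_t size_bytes)
{
    assert(size_bytes > 0 && (size_bytes & (size_bytes - 1)) == 0);
    return (addr & (size_bytes - 1)) != 0;
}

/* True when the bytes [addr, addr + size_bytes) fall into more than one
 * naturally aligned window of window_bytes (assumed interpretation). */
static bool crosses_window(uintptr_t addr, size_t size_bytes, size_t window_bytes)
{
    uintptr_t first = addr / window_bytes;
    uintptr_t last  = (addr + size_bytes - 1) / window_bytes;
    return first != last;
}

/* Estimated replay penalty in cycles for a single load or store. */
static unsigned unaligned_penalty(uintptr_t addr, size_t size_bytes, bool is_store)
{
    size_t window = is_store ? STORE_WINDOW_BYTES : LOAD_WINDOW_BYTES;
    if (is_unaligned(addr, size_bytes) && crosses_window(addr, size_bytes, window))
        return UNALIGNED_PENALTY_CYCLES;
    return 0;
}
```

Under these assumptions, a 4-byte access at address 0x1006 incurs the penalty as a store, because it crosses a 64-bit boundary, but not as a load, because it stays inside one 128-bit window.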
