Fast Models Reference Manual - ARM Information Center

Accuracy and Functionality

Real processors attempt to prefetch instructions ahead of execution and predict branch destinations to keep the prefetch queue full. The instruction prefetch behavior of a processor can be observed by a program that writes into its own prefetch queue (without using explicit barriers). The architecture does not define the results.

The CT engine processes code in blocks. The effect is as if the processor filled its prefetch queue with a block of instructions, then executed the block to completion. As a result, this virtual prefetch queue is sometimes larger and sometimes smaller than the corresponding hardware queue. In the current implementation, the virtual prefetch queue can follow small forward branches.

With an L1 instruction cache turned on, the instruction block size is limited to a single cache line. The processor ensures that a line is present in the cache at the point where it starts executing instructions from that line.

In real hardware, the instruction prefetch queue causes additional fetch transactions, some of which are redundant because of incorrect branch prediction. This causes extra cache and bus pressure.

2.3.4 Out-of-order execution and write-buffers

The current CT implementation always executes instructions sequentially in program order. One instruction is completely retired before the next starts to execute. In a real processor, multiple memory accesses can be outstanding at once, and can complete in a different order from their program order. Writes can also be delayed in a write-buffer.

The programmer-visible effect of these behaviors is defined in the architecture as the Weakly Ordered memory model, which the programmer must be aware of when writing lock-free multiprocessor code.

Within Fast Models, all memory accesses can be observed to happen in program order, effectively as if all memory is Strongly Ordered.

2.3.5 Caches

The effects of caches are programmer-visible because they can cause a single memory location to exist as multiple inconsistent copies. If caches are not correctly maintained, reads can observe stale copies of locations, and flushes/cleans can cause writes to be lost.

There are three ways in which incorrect cache maintenance can be programmer-visible:

From the D-Side interface of a single processor
The only way of detecting the presence of caches is to create aliases in the memory map, so that the same range of physical addresses can be observed as both cached and non-cached memory.

From the D-Side of a single processor to its I-Side
Stale instruction data can be fetched when new instructions have been written by the D-side. This can be due either to deliberate self-modifying code or to incorrect OS demand paging.

Between one processor and another device
For example, another processor in a non-coherent MP system, or an external DMA device.

ARM DUI 0423J Copyright © 2008-2011 ARM. All rights reserved. 2-7
ID051811
Non-Confidential
