Using CoreSight Trace Techniques on Cortex - IAR Systems
By Ryan Sheng, IAR Systems
Using CoreSight Trace Techniques on Cortex-M3/M4
The debug system of the ARM Cortex-M3/M4 is based on the new CoreSight architecture. In addition to
traditional invasive operations, CoreSight-based designs allow memory and peripheral registers to
be examined while the CPU is running. The CoreSight architecture also introduces
powerful trace capabilities, including:
• Data Trace, generating events to record data reads/writes, exceptions/interrupts, and PC
(program counter) sampling information.
• Software Trace, supporting output of debug messages (e.g. printf) to the host.
• Instruction Trace, continuously collecting the sequence of every executed instruction for a
selected portion of your application.
Once generated, trace data can be transferred to the host through a debug probe such as IAR J-Link or
IAR J-Trace. The IAR C-SPY Debugger can then collect and display this information in various windows for
analysis.
Trace data is useful for locating errors that have irregular symptoms and occur sporadically. It is
also helpful for analyzing dynamic system behavior, identifying performance bottlenecks, and gathering
code coverage statistics.
Debug Interface and Trace System
Unlike ARM7/ARM9, the Cortex-M3/M4 core itself does not have a JTAG interface. Instead, the Debug
Port (DP) module is decoupled from the core. Current DPs support the well-known JTAG interface
and the Serial-Wire Debug (SWD) interface.
Up to three components in a Cortex-M3/M4 can act as a trace source: DWT (Data Watchpoint
and Trace, for Data Trace), ITM (Instrumentation Trace Macrocell, for Software Trace), and ETM
(Embedded Trace Macrocell, for full Instruction Trace). DWT, ITM and ETM generate trace data in the
form of packets and transfer them through an Advanced Trace Bus (ATB) to the Trace Port Interface
Unit (TPIU). The TPIU has two operation modes: Clocked mode, using a 1-, 2- or 4-bit parallel data
output, and SWV (Serial-Wire Viewer) mode, using a single-bit SWO (Serial Wire Output) output
format. Instruction Trace (from ETM) must use the parallel trace port, while Data Trace and
Software Trace packets normally use SWO (called SWO Trace) but can also be multiplexed with the ETM trace
stream through the parallel trace port.
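In SWV mode, the SWO bit rate is derived from the clock that feeds the TPIU by an integer prescaler, where the output rate equals the input clock divided by (prescaler + 1). The sketch below only shows that arithmetic. The function names are hypothetical, and which clock feeds the TPIU is device-specific, so treat the input frequency as an assumption to be checked against the device manual. When debugging in C-SPY, the debugger normally sets this up for you.

```c
#include <stdint.h>

/* Hypothetical helper: prescaler value that gives (at most) the
 * requested SWO bit rate from the clock feeding the TPIU.
 * Relationship assumed: swo_hz = trace_clk_hz / (prescaler + 1).
 * Returns 0 (no division) for requests that cannot be divided down. */
uint32_t swo_prescaler(uint32_t trace_clk_hz, uint32_t swo_hz)
{
    if (swo_hz == 0u || swo_hz > trace_clk_hz) {
        return 0u;
    }
    return (trace_clk_hz / swo_hz) - 1u;
}

/* Bit rate that actually results from a given prescaler value.
 * Integer division means it can differ from the requested rate. */
uint32_t swo_actual_hz(uint32_t trace_clk_hz, uint32_t prescaler)
{
    return trace_clk_hz / (prescaler + 1u);
}
```

For example, a 72 MHz input clock and a requested 2 MHz SWO rate give a prescaler of 35. The probe and the target must agree on the resulting rate, which is why the actual rate is worth computing as well.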
The figure below shows the diagram of a Cortex-M3/M4 trace system. JTAG/SWD, SWO and the 4-bit
parallel trace port can be routed to a 19-pin (0.05”) Cortex Debug + ETM connector on the target.
IAR J-Trace for Cortex-M3 can use this connector directly and supports all these trace capabilities
(including ETM).
Note that the ETM component is optional. Not all Cortex-M3/M4 devices have ETM, but nearly all of
them support JTAG/SWD debugging and SWO Trace. SWO and JTAG/SWD signals can be routed
to a 20-pin (0.1”) standard JTAG connector. Both IAR J-Link and IAR J-Trace can use this connector
directly and support the SWO channel. IAR J-Link is a more affordable probe for ARM targets without
ETM, supporting all ARM7/ARM9/ARM11/Cortex-M0/M1/M3/M4/R4 cores.
Note that the TDO signal of JTAG is multiplexed with SWO in both connectors (see the figure below), so
SWO Trace is not accessible when the DP is in a JTAG configuration. Only the SWD interface can
be used together with SWO.
Trace-related Debugger Features of C-SPY
When debugging with IAR J-Link/IAR J-Trace in C-SPY, you can use trace-related windows such as
SWO or ETM Trace, Function Trace, Timeline, Interrupt Log, Interrupt Log Summary, Data Log, Data
Log Summary and Find in Trace. Various types of trace breakpoints and triggers are available to control
the collection of trace data. In addition, several other C-SPY features also use trace data, such as the
Profiler and Code coverage. Note that the SWO channel has limited throughput,
so it is usually not possible to use all of the above features at the same time.
When debugging, two indicators, labeled ETM and SWO, are visible on the IDE main
window toolbar (see the figure below). If either indicator is green, the corresponding
trace hardware is generating trace data. Point at the indicator with your mouse to get detailed tips
about which C-SPY features have requested trace data generation. This is useful, for example, if
the SWO communication channel often overflows because too many C-SPY features are enabled for
trace data usage.
Data Trace
The DWT (Data Watchpoint and Trace) component generates Data Trace information in Cortex-M3/M4.
It has four independent watchpoints (comparators) which can be programmed to compare data
addresses or program counter values. On a match, the watchpoint can halt the
processor, generate Data Trace packets or trigger ETM. DWT also contains a PC sampler, an interrupt
trace module and a set of counters for CPU cycle statistics. It is capable of recording data reads and
writes with timing statistics, logging the execution of exceptions and interrupts, and sampling the PC
(program counter) at regular intervals.
Trace packets generated by DWT are first transferred to ITM through ATB. ITM acts as a merging unit
that combines its own trace data with DWT packets, and it is responsible for sending the merged packet
stream to TPIU.
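One of the DWT counters mentioned above is the free-running 32-bit cycle counter. C-SPY configures DWT from the debugger side, so no target code is needed for the trace features described here. Purely as an illustration of what that counter offers, the sketch below enables it from application code and computes an elapsed cycle count. The struct and function names are hypothetical. The register pointers are passed in so that the logic can also be exercised off-target; on a Cortex-M3/M4 they would normally point at DEMCR (0xE000EDFC), DWT_CTRL (0xE0001000) and DWT_CYCCNT (0xE0001004).

```c
#include <stdint.h>

#define DEMCR_TRCENA        (1u << 24)  /* global enable for DWT and ITM */
#define DWT_CTRL_CYCCNTENA  (1u << 0)   /* enable the cycle counter      */

/* Hypothetical register bundle (names are assumptions for this sketch). */
typedef struct {
    volatile uint32_t *demcr;
    volatile uint32_t *dwt_ctrl;
    volatile uint32_t *dwt_cyccnt;
} cyc_regs_t;

/* Turn on trace support, clear the counter and start it. */
void cyc_enable(const cyc_regs_t *r)
{
    *r->demcr |= DEMCR_TRCENA;
    *r->dwt_cyccnt = 0u;
    *r->dwt_ctrl |= DWT_CTRL_CYCCNTENA;
}

/* Read the current cycle count. */
uint32_t cyc_now(const cyc_regs_t *r)
{
    return *r->dwt_cyccnt;
}

/* Elapsed cycles between two readings. Unsigned subtraction stays
 * correct across a single 32-bit wrap of the counter. */
uint32_t cyc_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}
```

A typical use is to read `cyc_now()` before and after the code of interest and pass both values to `cyc_elapsed()`. Dividing the result by the core clock frequency gives the time, provided the interval is shorter than one full wrap of the counter.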
• Example 1: Monitor the value of static variables
• Example 2: Measure the time consumption of a piece of code
• Example 3: View the graph of interrupts
• Example 4: Statistical code profiling on function level
Software Trace
The ITM (Instrumentation Trace Macrocell) component of a Cortex-M3/M4 device has three
operation modes (trace sources):
• Application software can write console messages directly to ITM stimulus ports and output
them to the host as trace packets. This is normally called Software Trace.
• ITM can merge and forward trace packets generated from DWT to TPIU.
• ITM can generate timestamp packets that are inserted into the trace stream, to help the host
debugger determine the timing of events.
Software Trace is a printf-style debugging method that is driven directly by the application. It is used for
sending data from the target application to the host debugger without stopping program execution.
ITM contains 32 stimulus ports, so different software routines can write to different ports, and the
messages can be collected by the host debugger. Each ITM stimulus port can be enabled or disabled
individually. Unlike traditional breakpoint-based semihosting mechanisms for terminal I/O, using
Software Trace to output debug messages does not cause much delay for the application. The speed
is comparable to using a high-speed UART.
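Writing to a stimulus port from the application can be sketched roughly as follows. The code checks that the chosen port is enabled, waits until the port reports that it can accept data (bit 0 of the port reads as 1), and then writes one byte. The struct and function names and the spin limit are assumptions made for this sketch. It also assumes that the debugger has already enabled ITM and the trace output as a whole. Register pointers are passed in so the logic can be tried off-target; on a Cortex-M3/M4 the stimulus ports start at 0xE0000000 and the trace enable register is at 0xE0000E00.

```c
#include <stdint.h>

#define ITM_SPIN_LIMIT 100000u  /* arbitrary bound so we never hang */

/* Hypothetical register bundle (names are assumptions for this sketch). */
typedef struct {
    volatile uint32_t *stim;  /* first of the 32 stimulus port registers */
    volatile uint32_t *ter;   /* trace enable register, one bit per port */
} itm_regs_t;

/* Send one character through the given stimulus port.
 * Returns 0 on success, -1 if the port number is invalid, the port is
 * disabled, or the port did not become ready within the spin limit. */
int itm_put_char(const itm_regs_t *itm, unsigned port, char ch)
{
    if (port >= 32u) {
        return -1;
    }
    if ((*itm->ter & (1u << port)) == 0u) {
        return -1;  /* nobody is listening on this port */
    }
    for (uint32_t spin = 0u; spin < ITM_SPIN_LIMIT; ++spin) {
        if ((itm->stim[port] & 1u) != 0u) {
            /* A byte-wide write produces a one-byte trace packet. */
            *(volatile uint8_t *)&itm->stim[port] = (uint8_t)ch;
            return 0;
        }
    }
    return -1;
}
```

Returning early when the port is disabled keeps the cost close to zero when no debugger is attached, which is one reason this style of output is less intrusive than breakpoint-based semihosting.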
• Example 5: Printf via SWO
• Example 6: Direct output via ITM stimulus ports
Instruction Trace
Instruction Trace (also known as ETM trace) is a continuously collected sequence of every executed
instruction for a selected portion of the application. Note that the ETM (Embedded Trace Macrocell)
component is optional, so some Cortex-M3/M4 devices might not support Instruction Trace. When
ETM is included and enabled, it generates trace packets and sends them to TPIU. A special hardware
probe (e.g. IAR J-Trace for Cortex-M3) is required to receive these packets and transfer them to the
host debugger. The J-Trace probe contains a 4MB buffer that collects instructions in real time, but the
trace data is not displayed in the ETM Trace window of C-SPY until execution has been
stopped.
Since the (up to) 4-bit trace port does not have enough bandwidth to transfer all executed instructions, ETM does not
actually output every address/instruction that the processor has reached or executed. It usually
generates compressed information about the program flow and outputs full addresses only if needed
(e.g. if a branch has taken place). Since the debugger knows the application code image, it can then
reconstruct the full instruction sequence from the trace data.
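The reconstruction idea can be illustrated with a deliberately simplified toy model. This is not the real ETM packet format. It assumes the trace carries only a start address plus one taken/not-taken flag per branch, and that the debugger has a table describing each instruction in the code image. All type and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy description of one instruction in the code image. */
typedef struct {
    uint32_t addr;       /* address of the instruction            */
    uint32_t size;       /* size in bytes                         */
    int      is_branch;  /* nonzero if this is a branch           */
    uint32_t target;     /* branch destination (if is_branch)     */
} toy_insn_t;

/* Look up an instruction by address; NULL if outside the image. */
static const toy_insn_t *toy_find(const toy_insn_t *image, size_t n,
                                  uint32_t addr)
{
    for (size_t i = 0; i < n; ++i) {
        if (image[i].addr == addr) {
            return &image[i];
        }
    }
    return NULL;
}

/* Rebuild the executed address sequence from the start address and
 * the per-branch flags. Stops when the flags run out at a branch,
 * the PC leaves the image, or the output buffer is full.
 * Returns the number of addresses written to out. */
size_t toy_reconstruct(const toy_insn_t *image, size_t n_insn,
                       uint32_t start,
                       const uint8_t *taken, size_t n_taken,
                       uint32_t *out, size_t out_cap)
{
    size_t n_out = 0, next_flag = 0;
    uint32_t pc = start;

    while (n_out < out_cap) {
        const toy_insn_t *insn = toy_find(image, n_insn, pc);
        if (insn == NULL) {
            break;                      /* left the known image    */
        }
        if (insn->is_branch && next_flag == n_taken) {
            break;                      /* no more trace data      */
        }
        out[n_out++] = pc;
        if (insn->is_branch && taken[next_flag++] != 0u) {
            pc = insn->target;          /* branch was taken        */
        } else {
            pc += insn->size;           /* fall through            */
        }
    }
    return n_out;
}
```

Even in this toy form, three flags are enough to recover a seven-instruction path through a five-instruction image containing a loop, which gives a feel for why the compressed trace stream can be so much smaller than the instruction sequence it represents.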
When supported by a probe such as IAR J-Trace for Cortex-M3, Instruction Trace can be used concurrently with SWO
trace (Data Trace and Software Trace). In this case, trace data from DWT and ITM is also
collected in the ETM trace buffer instead of being streamed immediately via the SWO channel. This
means that DWT and ITM trace data is not displayed until execution has been stopped, rather
than being continuously updated in the C-SPY windows.
• Example 7: Instruction trace and function trace
• Example 8: View the call stack graph
• Example 9: Statistics of code coverage