IBM AIX Continuous Availability Features - IBM Redbooks
This means that traditional <strong>AIX</strong> system trace is the preferred Second Failure Data Capture<br />
(SFDC) tool, because you can more precisely specify the exact trace hooks of interest, given<br />
knowledge gained from the initial failure.<br />
Traditional <strong>AIX</strong> system trace also provides options that allow you to automatically write the<br />
trace information to a disk-based file (such as /var/adm/ras/trcfile). Lightweight memory trace<br />
provides no such option to automatically write the trace entries to disk when the memory<br />
trace buffer fills. When an LMT memory trace buffer fills, it “wraps”, meaning that the oldest<br />
trace record is overwritten.<br />
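The wrapping behavior described above can be illustrated with a minimal sketch. This is not how the AIX kernel implements LMT; the class name `WrappingTraceBuffer`, its methods, and the integer "hook IDs" are all hypothetical and chosen only to show that a full buffer keeps the newest records and silently drops the oldest.

```python
from collections import deque


class WrappingTraceBuffer:
    """Hypothetical fixed-capacity trace buffer that overwrites its oldest record when full."""

    def __init__(self, capacity):
        # A deque with maxlen discards the oldest item on append once it is full.
        self._records = deque(maxlen=capacity)
        self.overwritten = 0  # count of records lost to wrapping

    def record(self, event):
        if len(self._records) == self._records.maxlen:
            self.overwritten += 1
        self._records.append(event)

    def snapshot(self):
        # Oldest surviving record first, newest last.
        return list(self._records)


# Record six events into a buffer that holds only four.
buf = WrappingTraceBuffer(capacity=4)
for hook_id in range(1, 7):
    buf.record(hook_id)

history = buf.snapshot()
print(history, buf.overwritten)
```

Because nothing is written to disk when the buffer wraps, the two oldest events in this sketch are unrecoverable, which is the trade-off against traditional system trace noted above.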
The value of LMT derives from being able to view some history of what the system was doing<br />
prior to reaching the point where a failure is detected. As previously mentioned, each CPU<br />
has a memory trace buffer for common events, and a smaller memory trace buffer for rare<br />
events.<br />
The intent is for the “common” buffer to have a 1- to 2-second retention (in other words, have<br />
enough space to record events occurring during the last 1 to 2 seconds, without wrapping).<br />
The “rare” buffer should have at least an hour's retention. Actual retention depends on workload, and on<br />
where developers place trace hook calls in the <strong>AIX</strong> kernel source and which parameters they<br />
trace.<br />
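To make the dependence on workload concrete, retention can be estimated as buffer size divided by the rate at which trace data is produced. The sketch below is a back-of-the-envelope model only; the buffer sizes, event rates, and the 32-byte record size are invented for illustration and are not figures from AIX.

```python
MB = 1024 * 1024


def retention_seconds(buffer_bytes, events_per_second, bytes_per_record):
    """Estimate how long a buffer holds history before it wraps (simplified model)."""
    if events_per_second <= 0 or bytes_per_record <= 0:
        raise ValueError("event rate and record size must be positive")
    return buffer_bytes / (events_per_second * bytes_per_record)


# Hypothetical "common" buffer: frequent events fill it within seconds.
common = retention_seconds(4 * MB, events_per_second=100_000, bytes_per_record=32)

# Hypothetical "rare" buffer: smaller, but events arrive far less often.
rare = retention_seconds(1 * MB, events_per_second=5, bytes_per_record=32)

print(f"common: {common:.2f} s, rare: {rare / 3600:.2f} h")
```

Under these assumed numbers the common buffer retains a little over one second and the rare buffer well over an hour; a busier workload or larger trace records would shorten both.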
Disabling and enabling lightweight memory trace<br />
You can disable lightweight memory trace by using the following command:<br />
/usr/bin/raso -r -o mtrc_enabled=0<br />
You can enable lightweight memory trace by using the following command:<br />
/usr/bin/raso -r -o mtrc_enabled=1<br />
Note: In either case, the boot image must be rebuilt (the bosboot command needs to be<br />
run), and the change does not take effect until the next reboot.<br />
Lightweight Memory Trace memory consumption<br />
The default amount of memory required for the memory trace buffers is automatically<br />
calculated based on factors that influence software trace record retention, with the target<br />
being sufficiently large buffers to meet the retention goals previously described.<br />
There are several factors that may reduce the amount of memory automatically used. The<br />
behavior differs slightly between the 32-bit (unix_mp) and 64-bit (unix_64) kernels. For the<br />
64-bit kernel, the default calculation is limited such that no more than 1/128 of system<br />
memory can be used by LMT, and no more than 256 MB by a single processor.<br />
The 32-bit kernel uses the same default memory buffer size calculations, but further restricts<br />
the total memory allocated for LMT (all processors combined) to 16 MB. Table 3-1 presents<br />
some examples of LMT memory consumption.<br />
Table 3-1 Lightweight Memory Trace memory consumption<br />
Machine                 Number of CPUs   System memory   Total LMT memory: 64-bit kernel   Total LMT memory: 32-bit kernel
POWER3 (375 MHz CPU)    1                1 GB            8 MB                              8 MB
POWER3 (375 MHz CPU)    2                4 GB            16 MB                             16 MB
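The limits described in this section can be sketched as a simple capping function. The text does not give the underlying default calculation, so the sketch takes a hypothetical uncapped figure as input and applies only the stated ceilings. It assumes that the per-processor limit applies to an even split across CPUs and that the 32-bit kernel inherits the 64-bit limits in addition to its own 16 MB total; both are readings of the text, not confirmed behavior. The function name `capped_lmt_total` is illustrative.

```python
MB = 1024 * 1024
GB = 1024 * MB


def capped_lmt_total(uncapped_bytes, system_memory_bytes, cpus, kernel_bits):
    """Apply the documented LMT ceilings to a hypothetical uncapped default."""
    if kernel_bits not in (32, 64):
        raise ValueError("kernel_bits must be 32 or 64")
    # No more than 1/128 of system memory, and no more than 256 MB per processor
    # (modeled here as cpus * 256 MB, assuming an even split across CPUs).
    limit = min(system_memory_bytes // 128, cpus * 256 * MB)
    if kernel_bits == 32:
        # The 32-bit kernel further restricts the combined total to 16 MB.
        limit = min(limit, 16 * MB)
    return min(uncapped_bytes, limit)


# A hypothetical 64 MB uncapped request on a 1 GB, 1-CPU system.
small_64 = capped_lmt_total(64 * MB, 1 * GB, cpus=1, kernel_bits=64)

# The same request on a 4 GB, 2-CPU system, under each kernel.
large_64 = capped_lmt_total(64 * MB, 4 * GB, cpus=2, kernel_bits=64)
large_32 = capped_lmt_total(64 * MB, 4 * GB, cpus=2, kernel_bits=32)

print(small_64 // MB, large_64 // MB, large_32 // MB)
```

A request already below every ceiling is returned unchanged, so the caps only ever reduce the automatically calculated amount.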