11.01.2013 Views

IBM AIX Continuous Availability Features - IBM Redbooks


This means that traditional AIX system trace is the preferred Second Failure Data Capture
(SFDC) tool, because you can more precisely specify the exact trace hooks of interest, given
knowledge gained from the initial failure.

Traditional AIX system trace also provides options that allow you to automatically write the
trace information to a disk-based file (such as /var/adm/ras/trcfile). Lightweight memory trace
provides no such option to automatically write the trace entries to disk when the memory
trace buffer fills. When an LMT memory trace buffer fills, it “wraps”, meaning that the oldest
trace record is overwritten.
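
The effect of wrapping can be illustrated with a small shell sketch. It models only the outcome (once the buffer is full, each new record displaces the oldest one) and says nothing about how LMT lays out its buffers internally. The function name retained_after_wrap and the record names are hypothetical.

```shell
#!/bin/sh
# Illustrative only: models the *effect* of a wrapping trace buffer,
# not the actual LMT buffer layout.
#
# retained_after_wrap SLOTS RECORD...
#   Prints the records that are still present after every RECORD has been
#   written, in order, to a buffer that holds SLOTS records.
retained_after_wrap() {
    slots=$1
    shift
    # Every record written beyond capacity overwrites the oldest one,
    # so only the newest $slots records survive.
    while [ "$#" -gt "$slots" ]; do
        shift
    done
    echo "$*"
}

# Five events written to a three-slot buffer: the two oldest are lost.
retained_after_wrap 3 e1 e2 e3 e4 e5
```

This is why LMT is useful for seeing what happened just before a failure, and why system trace, which can write to disk, is better suited to capturing a longer history.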

The value of LMT derives from being able to view some history of what the system was doing
prior to reaching the point where a failure is detected. As previously mentioned, each CPU
has a memory trace buffer for common events, and a smaller memory trace buffer for rare
events.

The intent is for the “common” buffer to have a 1- to 2-second retention (in other words, to have
enough space to record the events occurring during the last 1 to 2 seconds without wrapping).
The “rare” buffer should have at least an hour's retention. Actual retention depends on the
workload, on where developers place trace hook calls in the AIX kernel source, and on which
parameters they trace.
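
The dependence of retention on workload can be made concrete with a rough estimate: retention is approximately the buffer size divided by the rate at which trace data is produced. The sketch below is only that arithmetic. The function name retention_seconds is hypothetical, and the buffer sizes, event rates, and record size in the examples are invented for illustration; they are not LMT figures.

```shell
#!/bin/sh
# Rough retention estimate under stated assumptions (not an LMT formula):
#   retention = buffer size / (events per second * bytes per record)
#
# retention_seconds BUFFER_BYTES EVENTS_PER_SEC BYTES_PER_RECORD
#   Prints the estimated retention in whole seconds (rounded down).
retention_seconds() {
    buffer_bytes=$1
    events_per_sec=$2
    bytes_per_record=$3
    bytes_per_sec=$(( events_per_sec * bytes_per_record ))
    if [ "$bytes_per_sec" -le 0 ]; then
        echo "retention_seconds: event rate and record size must be positive" >&2
        return 1
    fi
    echo $(( buffer_bytes / bytes_per_sec ))
}

# Hypothetical "common" buffer: 8 MB, 100000 events/s, 32-byte records.
retention_seconds 8388608 100000 32

# Hypothetical "rare" buffer: 1 MB, 10 events/s, 32-byte records.
retention_seconds 1048576 10 32
```

With these made-up inputs, a busier workload (a higher event rate) or larger records (more traced parameters) shortens retention proportionally.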

Disabling and enabling lightweight memory trace<br />

You can disable lightweight memory trace by using the following command:

/usr/bin/raso -r -o mtrc_enabled=0

You can enable lightweight memory trace by using the following command:

/usr/bin/raso -r -o mtrc_enabled=1

Note: In either case, the boot image must be rebuilt (the bosboot command needs to be
run), and the change does not take effect until the next reboot.
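
The full sequence described in the note can be sketched as a dry-run helper that prints the steps instead of executing them. The function name lmt_plan is hypothetical. The raso command is the one shown above; the bosboot -a invocation (rebuild the boot image on the default boot device) is an assumption about how the rebuild would typically be done, and raso may offer to run bosboot itself, so check the behavior on your system before running anything.

```shell
#!/bin/sh
# Dry-run sketch: prints the steps for changing the LMT state, without
# executing them. Run the printed commands as root only after review.
#
# lmt_plan STATE
#   STATE is 0 to disable lightweight memory trace, 1 to enable it.
lmt_plan() {
    case "$1" in
        0|1) ;;
        *)
            echo "usage: lmt_plan 0|1" >&2
            return 1
            ;;
    esac
    # Step 1: set the tunable for the next boot (command from the text above).
    echo "/usr/bin/raso -r -o mtrc_enabled=$1"
    # Step 2: rebuild the boot image. The -a flag is an assumption.
    echo "bosboot -a"
    # Step 3: the change only takes effect after a reboot.
    echo "# reboot the system for the change to take effect"
}

lmt_plan 0
```

Printing the steps rather than running them keeps the sketch safe to try on any system, including one that is not running AIX.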

Lightweight Memory Trace memory consumption<br />

The default amount of memory required for the memory trace buffers is automatically
calculated based on factors that influence software trace record retention, with the target
being sufficiently large buffers to meet the retention goals previously described.

There are several factors that may reduce the amount of memory automatically used. The
behavior differs slightly between the 32-bit (unix_mp) and 64-bit (unix_64) kernels. For the
64-bit kernel, the default calculation is limited such that no more than 1/128 of system
memory can be used by LMT, and no more than 256 MB by a single processor.

The 32-bit kernel uses the same default memory buffer size calculations, but further restricts
the total memory allocated for LMT (all processors combined) to 16 MB. Table 3-1 presents
some examples of LMT memory consumption.

Table 3-1 Lightweight Memory Trace memory consumption

Machine                 Number of CPUs  System memory  Total LMT memory:  Total LMT memory:
                                                       64-bit kernel      32-bit kernel
POWER3 (375 MHz CPU)    1               1 GB           8 MB               8 MB
POWER3 (375 MHz CPU)    2               4 GB           16 MB              16 MB
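
The limits described in this section can be sketched as an upper bound on total LMT memory. The sketch computes only the cap, not the default buffer size, because the default calculation itself is not given here. The function name lmt_total_cap_mb is hypothetical, and applying the 1/128 and 256 MB limits to the 32-bit kernel as well is an assumption based on the wording that it "further restricts" the total.

```shell
#!/bin/sh
# Upper bound on total LMT memory, from the limits stated in the text.
# This is NOT the default size calculation, only the ceiling applied to it.
#
# lmt_total_cap_mb KERNEL_BITS SYSTEM_MEM_MB NCPUS
#   KERNEL_BITS is 32 or 64. Prints the cap in MB (rounded down).
lmt_total_cap_mb() {
    bits=$1
    mem_mb=$2
    ncpus=$3
    # No more than 1/128 of system memory can be used by LMT.
    cap=$(( mem_mb / 128 ))
    # No more than 256 MB by a single processor.
    per_cpu_cap=$(( ncpus * 256 ))
    if [ "$per_cpu_cap" -lt "$cap" ]; then
        cap=$per_cpu_cap
    fi
    # The 32-bit kernel limits the total for all processors to 16 MB.
    # (Assumption: the limits above apply to the 32-bit kernel as well.)
    if [ "$bits" -eq 32 ] && [ "$cap" -gt 16 ]; then
        cap=16
    fi
    echo "$cap"
}

lmt_total_cap_mb 64 1024 1    # 1 GB, 1 CPU, 64-bit kernel
lmt_total_cap_mb 64 4096 2    # 4 GB, 2 CPUs, 64-bit kernel
lmt_total_cap_mb 32 4096 2    # 4 GB, 2 CPUs, 32-bit kernel
```

For the 1 GB system the cap is 8 MB, which matches the first row of Table 3-1. For the 4 GB system the 64-bit cap is 32 MB, so the 16 MB shown in the table indicates that the default calculation, not the cap, determined the size in that case.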
