QLogic OFED+ Host Software User Guide, Rev. B
4–Running QLogic MPI on QLogic Adapters
To use this feature, the application must be compiled with both OpenMP and MPI code enabled. To do this, use the -mp flag on the mpicc compile line.

As mentioned previously, MPI routines can be called only by the master OpenMP thread. The hybrid executable is executed as usual using mpirun, but typically only one MPI process is run per node, and the OpenMP library creates additional threads to utilize all CPUs on that node. If there are sufficient CPUs on a node, you may want to run multiple MPI processes and multiple OpenMP threads per node.
The number of OpenMP threads is typically controlled by the OMP_NUM_THREADS environment variable in the .mpirunrc file. (OMP_NUM_THREADS is used by other compilers' OpenMP products, but is not a QLogic MPI environment variable.) Use this variable to adjust the split between MPI processes and OpenMP threads. Usually, the number of MPI processes (per node) times the number of OpenMP threads is set to match the number of CPUs per node. An example case would be a node with four CPUs, running one MPI process and four OpenMP threads. In this case, OMP_NUM_THREADS is set to four. OMP_NUM_THREADS applies on a per-node basis.

See "Environment for Node Programs" on page 4-19 for information on setting environment variables.
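For the four-CPU example above, the setting could look like this (assuming, per "Environment for Node Programs", that per-node environment variables are placed in ~/.mpirunrc):

```sh
# ~/.mpirunrc: give each MPI process four OpenMP threads,
# matching the four CPUs on each node (one MPI process per node).
export OMP_NUM_THREADS=4
```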
At the time of publication, the MPI_THREAD_SERIALIZED and MPI_THREAD_MULTIPLE models are not supported.

NOTE: When there are more threads than CPUs, both MPI and OpenMP performance can be significantly degraded due to over-subscription of the CPUs.

Debugging MPI Programs

Debugging parallel programs is substantially more difficult than debugging serial programs. Thoroughly debugging the serial parts of your code before parallelizing is good programming practice.

MPI Errors

Almost all MPI routines (except MPI_Wtime and MPI_Wtick) return an error code, either as the function return value in C or as the last argument in a Fortran subroutine call. Before the value is returned, the current MPI error handler is called. By default, this error handler aborts the MPI job. Therefore, to get information about MPI exceptions in your code, install the predefined MPI_ERRORS_RETURN handler (or a handler of your own) so that errors are returned to the caller instead. See the MPI_Errhandler_set man page for details.
4-26 D000046-005 B