06.08.2013 Views

Abstract


CHAPTER 3. BIFURCATION ANALYSIS

To handle the nonlinearity in computing W(f), each processor computed the electron density n(x) (1.11) at each x-point it owned. The processors then sent their parts of n(x) to a single processor, designated the main processor. The main processor assembled all of n(x), performed the Poisson solve to compute the potential energy U(x) (1.13), and sent U(x) back to all of the processors. Once a processor had U(x), it could compute P(f) at each of its x-points. Because a second-order upwind differencing scheme was used, computing the derivative term in K(f) required each processor to know the f values on the 2 x-grid points before its smallest x-point and the 2 x-grid points ahead of its largest x-point. These values were passed between the processors. Finally, each processor added these terms to obtain W(f) on the part of the domain it owned.
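
The ghost-point exchange described above can be illustrated with a small serial sketch. It is not the program used in this work: the processors are simulated by slicing one NumPy array, f is taken as a function of x only (a single k value), the names `HALO`, `upwind2`, and `distributed_derivative` are hypothetical, and the `edge` padding is only a stand-in for the actual boundary conditions. The point is that each block of x-points needs 2 extra values on each side before the second-order upwind difference can be evaluated locally.

```python
import numpy as np

HALO = 2  # ghost points per side required by a second-order upwind stencil


def upwind2(g, dx, sign):
    """Second-order upwind d/dx on the owned points of g.

    g holds the owned points plus HALO ghost points on each side.
    """
    c = g[HALO:-HALO]
    if sign > 0:
        # Upwind side is behind: use the two points with smaller x.
        return (3 * c - 4 * g[1:-3] + g[:-4]) / (2 * dx)
    # Upwind side is ahead: use the two points with larger x.
    return (-3 * c + 4 * g[3:-1] - g[4:]) / (2 * dx)


def distributed_derivative(f, dx, sign, nprocs):
    """Differentiate f block by block, as if each block lived on its own processor."""
    # Assumption: 'edge' padding replaces the real boundary treatment.
    padded = np.pad(f, HALO, mode="edge")
    pieces = []
    for owned in np.array_split(np.arange(f.size), nprocs):
        lo, hi = owned[0], owned[-1] + 1
        # Owned points plus the ghost values a neighbor would have sent.
        local = padded[lo:hi + 2 * HALO]
        pieces.append(upwind2(local, dx, sign))
    return np.concatenate(pieces)


x = np.linspace(0.0, 1.0, 512)
dx = x[1] - x[0]
f = x**2

serial = upwind2(np.pad(f, HALO, mode="edge"), dx, +1)
parallel = distributed_derivative(f, dx, +1, nprocs=8)
```

With the ghost values in place, the block-by-block result matches the single-array result, which is the property the message passing between processors is meant to guarantee.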

To demonstrate the parallel efficiency of our program, the simulation with Nx = 512 and Nk = 2048 was run on 2 to 80 processors. The runs reported in this section were performed on a Linux cluster at Sandia National Laboratories with a total of 236 compute nodes. The nodes have dual 3.06 GHz Xeon processors, each with 2 GB of RAM. The table below compares the run times for taking 5 continuation steps, from V = 0.2093 to V = 0.2293. Because the nodes used in the efficiency study are dual-processor, we decided that a fair evaluation of the efficiency required a base case of 2 processors instead of the usual 1 processor. Communication between 2 processors on the same node would be more efficient than communication between processors on distinct nodes.
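
One common way to express efficiency relative to a base case other than 1 processor is sketched below. The formula is an assumption for illustration and is not quoted from this chapter; the function name `relative_efficiency` is hypothetical, and the timings in the example call are made-up placeholders, not measured run times.

```python
def relative_efficiency(p, t_p, p_base, t_base):
    """Parallel efficiency of a p-processor run relative to a p_base-processor run.

    Equals 1.0 when the run time scales inversely with the processor count.
    """
    return (p_base * t_base) / (p * t_p)


# Placeholder timings only (seconds); these are NOT results from the study.
example = relative_efficiency(p=8, t_p=30.0, p_base=2, t_base=100.0)
```

With a 2-processor base case, the base run itself has efficiency 1 by construction, and every other run is judged against the same on-node communication cost.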
