
Salman Habib - Los Alamos National Laboratory



• Use P^3M
‣ Grid lives on the Opteron layer with FFT Poisson solves of up to 10,000^3; uses digital filtering and Super-Lanczos differentiation to reduce particle-grid interaction
‣ Particles live in the Cell BE and interact on “subgrid scales” by fast hand-coded routines
‣ Only simple grid info flows between Cells and Opterons, so the thin pipe problem is solved
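The grid half of P^3M rests on solving the Poisson equation in Fourier space. The following is a minimal 1D sketch of that idea only, under several assumptions not taken from the slide: it uses a naive O(N^2) DFT in place of an FFT, the continuum Green's function -1/k^2, and no digital filtering or Super-Lanczos differentiation. The function names `dft` and `solve_poisson_periodic` are hypothetical.

```python
import cmath
import math


def dft(values, sign):
    """Naive O(N^2) DFT. sign=-1 is the forward transform, sign=+1 the
    unnormalised inverse (the caller divides by N)."""
    n = len(values)
    return [
        sum(v * cmath.exp(sign * 2j * math.pi * k * j / n)
            for j, v in enumerate(values))
        for k in range(n)
    ]


def solve_poisson_periodic(rho, box_length):
    """Solve d^2(phi)/dx^2 = rho on a periodic 1D grid.

    The k = 0 mode is set to zero, which fixes the mean of phi to zero.
    """
    n = len(rho)
    rho_k = dft(rho, -1)
    phi_k = [0j] * n
    for k in range(1, n):
        mode = k if k <= n // 2 else k - n          # signed mode number
        wavenumber = 2.0 * math.pi * mode / box_length
        phi_k[k] = -rho_k[k] / wavenumber ** 2       # continuum Green's function
    return [z.real / n for z in dft(phi_k, +1)]


# Single sine mode as a source: the exact answer is phi = -rho / (2*pi/L)^2.
box_length = 1.0
n_grid = 16
xs = [box_length * i / n_grid for i in range(n_grid)]
rho = [math.sin(2.0 * math.pi * x / box_length) for x in xs]
phi = solve_poisson_periodic(rho, box_length)
```

In a production code the transform would be a distributed 3D FFT and the Green's function and gradient would be replaced by the filtered, higher-order forms the slide mentions; only the overall structure (transform, multiply per mode, transform back) carries over.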

• Avoid Particle Communication
‣ Particle communication between Cells at every short-range step would be too slow; avoid this using particle caching
‣ Intermittent nearest-neighbor refresh (fast)

• On the Fly Analysis
‣ Avoid I/O as much as possible; analysis routines must run as a part of the code

Overload Zone (particle “cache”)

The nodal domain contains all particles that will ever cross into its reference volume (the 30% overhead can be reduced to a few %). Particles inside the reference volume are “active”, i.e., used to compute the self-consistent force; others are “passive”, i.e., used only as tracers (their “active” self belongs to a different compute node). As they move, particles change their state based on local information. For the PM piece, overloading is “exact”; for the N^2 part, error at the domain edge propagates inwards, but it is (i) small and (ii) can be controlled by increasing the boundary layer thickness or via periodic refreshes.

Katrin Heitmann, Los Alamos National Laboratory

LBNL, March 12, 2009
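The overload-zone bookkeeping described above can be sketched in a few lines. This is an illustration under stated assumptions, not the code's actual implementation: the reference volume is taken to be an axis-aligned box, the overload zone a shell of uniform width around it, periodic wrap-around is ignored, and the names `classify` and `overload_overhead` are hypothetical.

```python
def overload_overhead(domain_side, overload_width):
    """Fractional extra volume when a cubic reference volume of side
    domain_side is padded by overload_width on all six faces."""
    padded_side = domain_side + 2.0 * overload_width
    return (padded_side / domain_side) ** 3 - 1.0


def classify(position, lo, hi, overload_width):
    """Classify a 3D position relative to one node's reference volume [lo, hi).

    'active'  : inside the reference volume, contributes to the force
    'passive' : inside the overload shell, carried along as a tracer copy
    'outside' : not held by this node at all
    Periodic wrap-around is ignored in this sketch.
    """
    if all(l <= x < h for x, l, h in zip(position, lo, hi)):
        return "active"
    if all(l - overload_width <= x < h + overload_width
           for x, l, h in zip(position, lo, hi)):
        return "passive"
    return "outside"


lo, hi, width = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0), 0.05
states = [
    classify((0.50, 0.5, 0.5), lo, hi, width),  # well inside the reference volume
    classify((1.03, 0.5, 0.5), lo, hi, width),  # just past the +x face
    classify((1.20, 0.5, 0.5), lo, hi, width),  # beyond the overload shell
]
overhead = overload_overhead(1.0, width)
```

Because the test depends only on a particle's own position, a particle can switch between active and passive as it moves without any communication. Under these assumptions a shell width of 5% of the domain side gives an overhead of about 33%, the same order as the figure quoted above, and a width of 1% gives about 6%; the actual parameters behind the slide's numbers are not stated.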
