PNNL-13501 - Pacific Northwest National Laboratory
week on a 1 TFLOP/s computer with 1 gigabyte of memory per processor. This leads to about 1,500 basis functions without symmetry and 3,000 with full symmetry. Larger calculations are possible.

• Direct algorithm based upon the sequential semidirect algorithm of Koch et al. (1994, 1996). Only the amplitudes and iterative solution vectors are stored on disk. The AO/SO integrals are recomputed as necessary each CCSD iteration or for each pass of the triples code. If sufficient memory/disk space is available, the atomic orbital integrals can be cached.

• Many new capabilities were developed that can be reused by other applications within NWChem (e.g., integrals over symmetry-adapted orbitals).

• The framework of the code is extensible to support eventual integration of the Laplace triples and local correlation methods.

• Detailed performance models support scaling of the most expensive triples component to 10,000 processors with an expected efficiency of 99%. Much effort has gone into achieving this, since the code performs a large volume of communication. The techniques used include pre-sorting and blocking to improve memory locality and to bundle communications, full dynamic load balancing, and randomization to avoid communication hot spots. The CCSD calculation will scale to a similar number of processors, but not with such high efficiency (perhaps 80% to 90%); however, the CCSD portion typically costs a factor of 5 to 10 less.

• Efficient execution on both massively parallel computers and workstation clusters by avoiding excess network traffic and reducing sensitivity to latency.

• A new accelerated inexact Newton algorithm has been formulated for solving the CCSD equations.
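The cache-or-recompute strategy described in the direct-algorithm bullet above can be sketched as follows. This is an illustrative sketch only, not NWChem's actual code: the quartet keys, the stand-in `compute_ao_block` kernel, and the fixed block budget are all assumptions standing in for real integral evaluation under a memory limit.

```python
def compute_ao_block(quartet):
    """Stand-in for an AO integral kernel; returns a dummy block value."""
    i, j, k, l = quartet
    return (i + 1) * (j + 1) * (k + 1) * (l + 1)

class IntegralStore:
    """Recompute integral blocks on demand, caching them while memory lasts."""

    def __init__(self, budget):
        self.budget = budget        # max blocks held in memory (assumed fixed)
        self.cache = {}
        self.recomputations = 0     # counts actual integral evaluations

    def get(self, quartet):
        if quartet in self.cache:   # cached: no recomputation needed
            return self.cache[quartet]
        block = compute_ao_block(quartet)
        self.recomputations += 1
        if len(self.cache) < self.budget:   # cache only while space remains
            self.cache[quartet] = block
        return block
```

With a budget of two blocks, two passes over three blocks cost four evaluations instead of six: the third block never fits in the cache and is recomputed on each pass, mirroring the recompute-per-iteration behavior described above.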
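The performance bullet above mentions full dynamic load balancing and randomization to avoid communication hot spots. A minimal sketch under stated assumptions: a shared fetch-and-increment task counter (the role played in NWChem by Global Arrays-style primitives; `SharedCounter` here is a thread-based stand-in) hands out tasks dynamically, and a seeded shuffle gives every process the same randomized task order so that consecutive claims do not hammer the same data block.

```python
import random
import threading

class SharedCounter:
    """Stand-in for a global fetch-and-increment counter used for
    dynamic load balancing (each call claims the next task slot)."""

    def __init__(self):
        self._next = 0
        self._lock = threading.Lock()

    def fetch_and_add(self):
        with self._lock:
            value = self._next
            self._next += 1
            return value

def randomized_order(num_tasks, seed=1234):
    """Seeded shuffle: every process derives the same permutation, but
    nearby task slots touch unrelated blocks, spreading communication."""
    order = list(range(num_tasks))
    random.Random(seed).shuffle(order)
    return order

def worker(counter, order, claimed):
    """Claim tasks dynamically until the shared counter runs past the end."""
    while True:
        slot = counter.fetch_and_add()
        if slot >= len(order):
            break
        claimed.append(order[slot])
```

Running several workers against one counter claims each task exactly once, however unevenly the tasks are sized; that is the load-balancing property, while the shuffle supplies the hot-spot avoidance.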
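The report does not spell out the accelerated inexact Newton scheme for the CCSD equations. As a hedged illustration only, the sketch below applies DIIS-style subspace extrapolation, a common accelerator in coupled-cluster solvers, to a toy fixed-point problem; the model map `g`, its constants, and all parameters are invented for demonstration and are not the CCSD residual.

```python
import numpy as np

def g(x):
    """Toy fixed-point map x = g(x); invented for demonstration only."""
    c = np.array([0.2, -0.4, 0.7])
    return c + 0.3 * np.tanh(x[::-1])   # mild coupling between components

def diis_solve(g, x0, max_hist=5, tol=1e-10, max_iter=50):
    """Fixed-point iteration accelerated by Pulay DIIS extrapolation."""
    xs, rs = [], []                      # history of updates and residuals
    x = np.asarray(x0, dtype=float)
    for it in range(max_iter):
        fx = g(x)
        r = fx - x                       # residual of the fixed-point equation
        if np.linalg.norm(r) < tol:
            return x, it
        xs.append(fx)
        rs.append(r)
        if len(xs) > max_hist:           # keep a bounded subspace
            xs.pop(0)
            rs.pop(0)
        n = len(rs)
        # Minimize |sum_i c_i r_i| subject to sum_i c_i = 1 (Lagrange row).
        B = np.empty((n + 1, n + 1))
        B[:n, :n] = [[ri @ rj for rj in rs] for ri in rs]
        B[n, :] = -1.0
        B[:, n] = -1.0
        B[n, n] = 0.0
        rhs = np.zeros(n + 1)
        rhs[n] = -1.0
        try:
            coef = np.linalg.solve(B, rhs)[:n]
        except np.linalg.LinAlgError:    # fall back to a plain iteration step
            coef = np.zeros(n)
            coef[-1] = 1.0
        x = sum(c * xi for c, xi in zip(coef, xs))
    return x, max_iter
```

With a single history vector the extrapolation reduces to a plain iteration step, so the method never does worse than the unaccelerated solver; the bounded history keeps memory costs fixed, the same consideration the bullets above emphasize for the amplitude storage.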
References

Corchado JC, Y-Y Chuang, PL Fast, J Villà, EL Coitiño, W-P Hu, Y-P Liu, GC Lynch, KA Nguyen, CF Jackels, MZ Gu, I Rossi, S Clayton, VS Melissas, R Steckler, BC Garrett, AD Isaacson, and DG Truhlar. 1998. POLYRATE-version 7.9.1. University of Minnesota, Minneapolis.
FY 2000 Laboratory Directed Research and Development Annual Report
Kendall RA, E Aprà, DE Bernholdt, EJ Bylaska, M Dupuis, GI Fann, RJ Harrison, J Ju, JA Nichols, J Nieplocha, TP Straatsma, TL Windus, and AT Wong. 2000. “High performance computational chemistry: an overview of NWChem, a distributed parallel application.” Computer Physics Communications 128, 260.

Koch H, O Christiansen, R Kobayashi, P Jorgensen, and T Helgaker. 1994. “A direct atomic orbital driven implementation of the coupled cluster singles and doubles (CCSD) model.” Chemical Physics Letters 228, 233.

Koch H, A Sanchez de Meras, T Helgaker, and O Christiansen. 1996. “The integral-direct coupled cluster singles and doubles model.” Journal of Chemical Physics 104, 4157.

Steckler R, W-P Hu, Y-P Liu, GC Lynch, BC Garrett, AK Isaacson, D-h Lu, VS Melissas, TN Truong, SN Rai, GC Hancock, JG Lauderdale, T Joseph, and DG Truhlar. 1995. POLYRATE-version 6.5. Computer Physics Communications 88, 341-343.
Presentations

Nichols JA, DK Gracio, and J Nieplocha. 1999. “The challenge of developing, supporting and maintaining scientific simulation software for massively parallel computers.” Germantown.

Seminar, “The Challenge of Developing, Supporting and Maintaining Scientific Simulation Software for Massively Parallel Computers.” San Diego Supercomputer Center, San Diego, California, January 2000.

Plenary lecture, “NWChem: A MPP Computational Chemistry Software Developed for Specific Application Targets.” Spring 2000 Meeting of the Western States Section of the Combustion Institute, Colorado School of Mines, Golden, Colorado, March 2000.

FLC Awards Presentation, Charleston, South Carolina, May 2000.

Invited lecture, “New Molecular Orbital Methods within NWChem,” at the conference “Molecular Orbital Theory for the New Millennium - Exploring New Dimensions and Directions of the Molecular Orbital Theory.” Institute of Molecular Science, Okazaki, Japan, January 2000.

“High Performance Computing in the Environmental Molecular Science Laboratory.” Real World Computing Institute, Tsukuba, Japan, January 2000.