12.07.2015 Views

Quantum Chemistry with GAMESS - Materials Computation Center


Quantum Chemistry with GAMESS
Brett M. Bode
Scalable Computing Laboratory
Department of Electrical and Computer Engineering
Iowa State University


Outline
  Introduction to GAMESS
  GAMESS history
  GAMESS capabilities
  Novel capabilities
  Running GAMESS


GAMESS
General Atomic and Molecular Electronic Structure System
General purpose electronic structure code
Primary focus is on ab initio quantum chemistry calculations
Also can do
  Density functional calculations
  Other semi-empirical calculations (AM1, PM3)
  QM/MM calculations
It's free and in wide use on everything from laptops to supercomputers.


Obtaining GAMESS
It's free, but not "open source" in the normal sense.
Group license: you get the source and can do anything you want with it, except distribute it.
See http://wwwmsg.fi.ameslab.gov/GAMESS/ for more information and the registration page link.
Distribution is source code, with pre-built binaries also available for Macintosh, Linux and Windows.
Full manual also on web site. See section 2 for the complete keyword description for the input file, section 4 for references for all of the methods.


GAMESS People
GAMESS is a product of Dr. Mark Gordon's research group at Iowa State University.
Dr. Mike Schmidt coordinates the development efforts and is the gatekeeper for the code.


GAMESS History
The code base was begun in 1980 from parts of other codes. Some code still goes back to that version!
Currently stands at about 750,000 lines of mostly Fortran 77 compatible code.
Runs on pretty much any system with a working Fortran compiler.


GAMESS Parallelization
Began in 1991 with the parallelization of the SCF energy and gradient computations (almost trivially parallel).
Initial parallel work was done on the TouchStone Delta.
In 1996 the Distributed Data Interface (DDI) was developed to support the new parallel MP2 energy and gradient code.


GAMESS Parallelization (cont'd)
In 2004 DDI was rewritten and optimizations for SMP using System V shared memory were added. The focus remains on distributed memory systems!
Also added was subgroup support to enable the Fragment Molecular Orbital method.


DDI
Modeled on the Global Array Framework.
The Distributed Data Interface provides a pseudo global shared memory interface for a portion of a node's memory.
Normal MPI version uses 2 processes per processor: 1 compute, 1 data server.
Sockets are used for interrupts on data servers because MPI often polls in receive.
SHMEM and LAPI versions also available...
Also provides processor subgroup support.

Paper excerpt shown on the slide (fragmentary):
[...] with high performance and potentially intelligent interconnect networks like Gigabit Ethernet [9], Myrinet [10], SCI [11], or Infiniband [12]. A similar trend is also evident in dedicated supercomputers, where, for example, large scale IBM SP and HP SC systems now use SMP nodes. Indeed, very large shared memory computers, like the SGI Origin 3000 or HP GS, usually have Non-Uniform Memory Access (NUMA) architectures that can be viewed as a cluster of uniform memory SMPs linked via a network, albeit a very good network.
With this move away from single processor toward multi-processor based clusters we are confronted with a considerably more complicated memory model than that which was present when either DDI or GA were originally conceived. Now small groups of processes have equally fast access to chunks of memory, while accessing memory between groups of processes is slower. Recognizing this plus the success and popularity of these programming models, it is pertinent to consider how these models might be extended to better exploit SMP clusters. The aim of this paper is to begin to address this issue, presenting an enhanced version of DDI that includes new functionality specifically targeting SMP clusters. Using both the new and original versions of DDI, performance results are presented and discussed for a typical GAMESS computation run on a variety of MPP systems. First, however, we begin with a brief discussion of the existing DDI data server model used in GAMESS.
[...] memory is the memory reserved by all the remaining parallel processes for their portions of the distributed data. Every process in a parallel job is allowed to access/modify any element in the distributed memory segment (regardless of its physical location); however, access to local distributed-memory is assumed to be faster than access to remote distributed-memory. Thus the DDI programming strategy aims to maximize the use of local distributed data while minimizing remote data requests. Note that the performance penalty for accessing distributed-memory (local or remote) is completely dependent on the underlying machine [...]
Figure 1: The virtual shared-memory model. Each large box (grey) represents the memory available to a given CPU. The inner boxes represent the memory used by the parallel processes (rank in lower right). The gold region depicts the memory reserved for the storage of distributed data. The arrows indicate memory access (through any means) for the distributed operations: get, put and accumulate.
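The local-versus-remote distinction can be illustrated with a toy ownership map. This is only a sketch, not the DDI API: the function names and the contiguous column-block layout are assumptions made for illustration.

```python
def owner_of_column(col, ncols, nprocs):
    """Rank that stores column `col` when `ncols` columns are split into
    nearly equal contiguous blocks (an assumed layout, not taken from DDI)."""
    base, extra = divmod(ncols, nprocs)
    # The first `extra` ranks hold base + 1 columns, the rest hold `base`.
    boundary = extra * (base + 1)
    if col < boundary:
        return col // (base + 1)
    return extra + (col - boundary) // base


def is_local(col, ncols, nprocs, my_rank):
    """True when a get/put/accumulate on `col` would touch local memory."""
    return owner_of_column(col, ncols, nprocs) == my_rank


# 10 columns over 4 ranks: ranks 0 and 1 hold 3 columns, ranks 2 and 3 hold 2.
print([owner_of_column(c, 10, 4) for c in range(10)])
```

A program following the strategy described above would order its work so that most accesses satisfy `is_local`.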


Program Capabilities
Types of wavefunctions
  Hartree-Fock (RHF, ROHF, UHF, GVB)
  CASSCF
  CI, MRCI
  Coupled cluster methods (closed shells)
  Second order perturbation theory
    MP2 (closed shells)
    ROMP2 (spin-correct open shells)
    UMP2 (unrestricted open shells)
    MCQDPT (CASSCF - MRMP2)
  Localized orbitals (SCF, MCSCF)


Program Capabilities
Energy-related properties
  Total energy as function of nuclear coordinates (PES): all wavefunction types
  Analytic energy gradient
    RHF, ROHF, UHF, MCSCF, CI, MP2, UMP2, DFT
    ROMP2 in progress
  Analytic hessian
    RHF, ROHF, TCSCF/GVB
    MCSCF just completed


Program Capabilities
Energy-related properties (cont'd)
  Numerical hessians from finite differences of analytic gradients
  Fully numerical derivatives for all methods
  Saddle point (TS) search (requires hessian)
  Minimum energy path = Intrinsic Reaction Coordinate (IRC)
    Several IRC options - GS2 is most effective
    Requires frequency input, gradients along path
    Follow reaction path from reactants through TS to products
    Build reaction path Hamiltonian (RPH): dynamics


Program Capabilities
Energy-related properties (cont'd)
  Dynamic reaction coordinate (DRC)
    Add kinetic energy to system at any geometry
    Add photon(s) to any vibrational mode
    Classical trajectory using QM-derived energies
    Requires gradients
  Monte Carlo sampling: find global minimum
  Molecular dynamics (in progress)


Program Capabilities
Other functionalities
  Spin-orbit coupling
    Any spin states, any number of states
    Full two-electron Breit-Pauli
    Partial two-electron (P2e) - very efficient, accurate
    Semi-empirical one-electron Zeff
    RESC
  Averaging over vibrational states
  Derivative (vibronic) coupling: planned


Program Capabilities
Interpretive tools
  Localized molecular orbitals (LMO)
  Localized charge distributions (LCD)
Nuclear and spectroscopic properties
  Spin densities at nucleus (ESR)
  NMR spin-spin couplings (in progress)
  NMR chemical shifts
  Polarizabilities, hyperpolarizabilities
  IR and Raman intensities
  Transition probabilities, Franck-Condon overlaps


Program Capabilities
QM/MM Methods
  Effective fragment potential (EFP) method for
    Cluster studies of liquids
    Cluster studies of solvent effects
    Interfaced with continuum methods for study of liquids and solvation in bulk
    Covalent link for study of enzymes, proteins, materials
  SIMOMM: QM/MM method for surface chemistry
    QM part can be any method in GAMESS
    MM part from Tinker (Jay Ponder)


Current Capabilities

                         SCF Type
Run Type            RHF    ROHF   UHF    GVB    MCSCF
Energy              CDFP   CDP    CDP    CDP    CDFP
Analytic Gradient   CDFP   CDP    CDP    CDP    CDFP
Numerical Hessian   CDP    CDP    CDP    CDP    CDP
Analytic Hessian    CDP    CDP    -      CDP    CDP
MP2 energy          CDFP   CDP    CDP    -      CP
MP2 gradient        CDFP   DP     CDP    -      -
CC Energy           CDF    -      -      -      -
EOMCC               CD
CI energy           CDP    CDP    -      CDP    CDP
CI gradient         CD     -      -      -      -
DFT energy          CDFP   CDP    CDP    -      -
DFT gradient        CDFP   CDP    CDP    -      -
MOPAC Energy        yes    yes    yes    yes    -
MOPAC gradient      yes    yes    yes    -      -

C = conventional storage of AO integrals on disk
D = direct evaluation of AO integrals
F = Fragment Molecular Orbital enabled
P = parallel execution


Solvation
Solvation Methods
  Explicit vs. implicit methods
  Explicit Methods
    TIP3P, TIP4P
    SPC, SPC/E
  EFP Method for Solvation
    Summary of EFP1 method for water
    Generalized EFP Method (EFP2)


General Effective Fragment Potential
Discrete solvation method
Fragment potential is a one-electron contribution to the ab initio Hamiltonian
Potentials
  are obtained by separate ab initio calculations
  depend on properties of isolated molecules
  can be systematically improved


General Effective Fragment Potential
The system is divided into
  an ab initio region for the "solute" and
  a fragment region for the solvent molecules.
E = E_{ab initio} + E_{interaction}


Hartree Fock based EFP
Interaction energy consists of electrostatic, polarization and exchange repulsion/charge transfer terms:

E_{interaction} = E_{coulomb} + E_{polarization} + E_{exchange repulsion/charge transfer}

E_{interaction} = \sum_{k=1}^{K} V_k^{Elec}(\mu, s) + \sum_{l=1}^{L} V_l^{Pol}(\mu, s) + \sum_{m=1}^{M} V_m^{Rep}(\mu, s)

  First sum (Elec): distributed multipolar expansion
  Second sum (Pol): LMO polarizability expansion
  Third sum (Rep): fit to functional form


EFP results
gOH(r): EFP1/HF, EFP1/DFT, SPC/E; 62 waters
[Figure: gOH(r) versus r (Angstroms), curves for EFP1/HF, EFP1/DFT, SPC/E and Exp (ND)]
Exp (ND): Neutron Diffraction; Soper et al.


Generalized EFP2 Method
Interaction energy consists of electrostatic, polarization and exchange repulsion terms:

E_{interaction} = E_{electrostatic} + E_{polarization} + E_{exchange repulsion}

  Electrostatic: distributed multipolar expansion
  Polarization: LMO polarizability expansion
  Exchange repulsion: from first principles using LMO overlaps


EFP Performance: Energy + Gradient Calculation

Method (1)      20 water molecules   62 water molecules   122 water molecules   512 water molecules
Ab initio (2)   3.19 hrs             ---                  ---                   ~157 yrs (3)
EFP2            3.3 sec              26.1 sec             95.3 sec              26.8 min
EFP1/HF         0.2 sec              2.6 sec              5.1 sec               97.8 sec
SPC/E (4)       0.02 sec             0.02 sec             0.1 sec               0.7 sec

(1) Run on 1200 MHz Athlon/Linux machine
(2) Ab initio: DZP basis set
(3) Assuming N^4 scaling
(4) SPC/E = Simple Extended Point Charge model
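The 512-water ab initio entry is an extrapolation rather than a measurement. A minimal sketch of that arithmetic, assuming the N^4 scaling named in the footnote and a 365-day year (the function name is illustrative):

```python
HOURS_PER_YEAR = 24 * 365


def extrapolate_hours(t_ref_hours, n_ref, n_target, power=4):
    """Scale a reference timing to a larger system assuming t ~ N**power."""
    return t_ref_hours * (n_target / n_ref) ** power


# Reference point from the table: 20 waters took 3.19 hours.
hours = extrapolate_hours(3.19, n_ref=20, n_target=512)
years = hours / HOURS_PER_YEAR
print(f"estimated ab initio time for 512 waters: about {years:.0f} years")
```

The estimate lands between 150 and 160 years, in line with the ~157 yrs quoted on the slide; the small difference depends on rounding and on the year length assumed.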


Fragment Molecular Orbital
Work by Kitaura, Ishida and Fedorov at AIST
Basic idea
  Divide the system into fragments.
  For each fragment, ignore exchange and self-consistency due to other fragments but retain the Coulomb field.
  Do ab initio calculations of fragments in the Coulomb field due to the whole system.
  Likewise, compute pairs and triples of fragments (n-mers).
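How fragment and pair results combine into a total energy is not spelled out on the slide. The sketch below assumes the usual two-body FMO expansion, E = sum_I E_I + sum_{I>J} (E_IJ - E_I - E_J); the energies are made-up numbers, not GAMESS output.

```python
from itertools import combinations


def fmo2_energy(monomer, dimer):
    """Two-body FMO total energy from monomer and dimer energies.

    monomer: {fragment_id: E_I}
    dimer:   {(I, J): E_IJ} with I < J
    """
    total = sum(monomer.values())
    for i, j in combinations(sorted(monomer), 2):
        # Pair correction: what the dimer adds beyond its two monomers.
        total += dimer[(i, j)] - monomer[i] - monomer[j]
    return total


# Hypothetical energies (hartree) for three fragments.
monomer = {1: -1.0, 2: -2.0, 3: -3.0}
dimer = {(1, 2): -3.1, (1, 3): -4.0, (2, 3): -5.2}
print(fmo2_energy(monomer, dimer))
```

Including triples would add an analogous three-body correction term.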


FMO Features
No hydrogen caps.
All n-mer calculations are ab initio.
Interfragment charge transfer, dispersion and exchange are included.
Systematic many-body effects.
Total properties closely reproduce ab initio values.
No fitted parameters.


FMO
Can also add in electron correlation.
  MP2
  Coupled Cluster
  DFT
  MCSCF
Can be multilayer, e.g. MCSCF for the active site, RHF everywhere else.


FMO-MCSCF
[slide text not recoverable from the extraction]


Applications of FMO
[slide text not recoverable from the extraction]


FMO results
[slide text not recoverable from the extraction]


Running GAMESS
GAMESS runs on
  Any Unix-based system available in the U.S.
  Any Linux based system
  Any Macintosh
  Windows based systems using WinGAMESS or PCGAMESS
Obtained from www.msg.ameslab.gov


GAMESS
GAMESS is a back-end program, i.e. no GUI.
Typically it is run via a script.
  Input is taken from a file (usually .inp)
  Output appears in the .log file (stdout)
    This is intended to be human readable
  MO vectors, coordinates, hessians, etc. appear in the .dat file. Can be used for restarts.
  IRC and DRC data and numerical hessian restart information appear in the .irc file.
These are all ASCII text files.


GAMESS Input file
Input files are modular, arranged in $groups
Most common input groups
  $SYSTEM: specifies memory, time limit
  $CONTRL: specifies basics of calculation
  $BASIS: specifies basis set if standard
  $DATA: specifies nuclear coordinates, basis set if nonstandard
Other important groups:
  $GUESS, $SCF, $FORCE, $HESS, $VEC, $IRC, $VIB


GAMESS Input file
The input file is mostly free-format (i.e. flexible spacing) except:
  The '$' sign specifying a group must be in column 2!
  All groups must terminate with a $END (this '$' can be anywhere except column 1).
  Anything in column 1 indicates a comment line.
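The column rules are easy to get wrong when an input file is generated by a script. Below is a minimal sketch of a checker for the two '$' rules; the helper name is illustrative, and the water geometry and keyword values in the sample deck are placeholders rather than a recommended calculation.

```python
import re


def check_group_columns(deck):
    """Return (line_number, message) pairs for '$' markers that break the rules."""
    problems = []
    for lineno, line in enumerate(deck.splitlines(), start=1):
        for match in re.finditer(r"\$(\w+)", line):
            column = match.start() + 1  # 1-based column of the '$'
            name = match.group(1).upper()
            if name == "END":
                if column == 1:
                    problems.append((lineno, "$END may not start in column 1"))
            elif column != 2:
                problems.append((lineno, f"${name} must start in column 2"))
    return problems


# A small deck using the four most common groups; note the leading space.
deck = "\n".join([
    " $CONTRL SCFTYP=RHF RUNTYP=ENERGY $END",
    " $SYSTEM MWORDS=1 $END",
    " $BASIS GBASIS=STO NGAUSS=3 $END",
    " $DATA",
    "Water RHF/STO-3G single point, illustrative geometry",
    "CNV 2",
    "",
    "O 8.0 0.0 0.0 0.0",
    "H 1.0 0.0 0.7572 0.5865",
    " $END",
])
print(check_group_columns(deck))
```

The checker only looks at '$' placement; it does not validate keywords, and running the deck with EXETYP=Check remains the real test.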


Some key groups
$SYSTEM group:
  TIMLIM= (default=525600 min = 1 yr)
  MWORDS= (default=1, i.e. 1 million 8-byte words = 8 MB)
  MEMDDI= relevant for parallel runs
    Total required memory (divide by number of processors to get memory requested/node)
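The memory keywords count in millions of 8-byte words, which is where the 1 = 8 MB default comes from. A small sketch of the conversions, assuming decimal megabytes; the function names are illustrative:

```python
BYTES_PER_WORD = 8  # GAMESS memory is counted in 64-bit words


def mwords_to_mb(mwords):
    """Convert a MWORDS-style value (millions of words) to megabytes."""
    return mwords * BYTES_PER_WORD


def memddi_per_process_mb(memddi_mwords, nprocs):
    """MEMDDI is the total over the job; each process gets an equal share."""
    return mwords_to_mb(memddi_mwords) / nprocs


print(mwords_to_mb(1))                # default MWORDS=1
print(memddi_per_process_mb(64, 8))   # hypothetical MEMDDI=64 on 8 processes
```

Replicated memory (MWORDS) is requested by every process, so the per-process footprint is roughly the MWORDS amount plus the MEMDDI share.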


Some key groups
$CONTRL
  ICHARG= (specifies charge on system)
  MULT= (specifies spin multiplicity)
    1 for singlet, 2 for doublet, ...
  EXETYP=
    Check: checks input for errors
    Run: actual run
  UNITS=
    angs (default)
    bohr
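ICHARG and MULT have to agree with each other: the multiplicity is the number of unpaired electrons plus one, so an even electron count can only give odd multiplicities. A minimal sketch of that check, with illustrative function names:

```python
def multiplicity(n_unpaired):
    """MULT = 2S + 1, which equals the number of unpaired electrons plus one."""
    return n_unpaired + 1


def charge_mult_consistent(total_protons, icharg, mult):
    """True when the electron count can support the requested multiplicity."""
    n_electrons = total_protons - icharg
    n_unpaired = mult - 1
    # The remaining electrons must pair up exactly.
    return n_electrons >= n_unpaired and (n_electrons - n_unpaired) % 2 == 0


# Water has 10 protons: neutral singlet is fine, neutral doublet is not.
print(charge_mult_consistent(10, icharg=0, mult=1))
print(charge_mult_consistent(10, icharg=0, mult=2))
print(charge_mult_consistent(10, icharg=1, mult=2))  # water cation doublet
```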


Some key groups
$CONTRL
  RUNTYP= (type of run)
    Energy (single point energy run)
    Gradient (energy 1st derivative wrt coordinates)
    Optimize (optimize geometry)
    Hessian (energy second derivative, vibrational frequencies, thermodynamic properties; generates $HESS group in .dat file)
    Sadpoint (saddle point search; requires hessian in $HESS group)
    IRC (performs IRC calculation; usually requires $IRC group, $HESS group)
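Because some run types depend on groups produced by earlier runs, a script can check for them before a job is submitted. The table below is a sketch that only encodes the requirements listed on this slide; the IRC entry treats "usually requires" as required, which is an assumption.

```python
# Groups each run type needs in the input, as listed on the slide.
REQUIRED_GROUPS = {
    "ENERGY": set(),
    "GRADIENT": set(),
    "OPTIMIZE": set(),
    "HESSIAN": set(),
    "SADPOINT": {"$HESS"},
    "IRC": {"$IRC", "$HESS"},
}


def missing_groups(runtyp, groups_present):
    """Return the sorted list of groups still needed for this run type."""
    present = {group.upper() for group in groups_present}
    return sorted(REQUIRED_GROUPS[runtyp.upper()] - present)


print(missing_groups("Sadpoint", ["$CONTRL", "$BASIS", "$DATA"]))
print(missing_groups("IRC", ["$CONTRL", "$DATA", "$HESS"]))
```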


Some key groups
$CONTRL
  SCFTYP= (type of wavefunction)
    RHF
    ROHF
    UHF
    MCSCF
    GVB
  MPLEVL=
    0 (default, no perturbation theory)
    2 (MP2: valid for RHF, ROHF, MCSCF)


Some key groups
$BASIS - Used to select among the built-in basis sets
  GBASIS=
    STO
    N21
    N31
    TZV
    ...
  NGAUSS= (# gaussians for STO, N21, N31)
  NDFUNC= (# sets of d's on heavy atoms)
  NPFUNC= (# sets of p's on hydrogens)
  NFFUNC= (# sets of f's on TM's)
  DIFFSP=.T. (diffuse sp functions on heavy atoms)
  DIFFS=.T. (diffuse s functions on hydrogens)
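The familiar Pople names map onto combinations of these keywords. The lookup table below is a sketch covering a handful of common basis sets; the table and helper are illustrative, and the manual remains the reference for the full list.

```python
# Common Pople basis set names expressed with the $BASIS keywords above.
POPLE_TO_BASIS = {
    "STO-3G":     {"GBASIS": "STO", "NGAUSS": 3},
    "3-21G":      {"GBASIS": "N21", "NGAUSS": 3},
    "6-31G":      {"GBASIS": "N31", "NGAUSS": 6},
    "6-31G(d)":   {"GBASIS": "N31", "NGAUSS": 6, "NDFUNC": 1},
    "6-31G(d,p)": {"GBASIS": "N31", "NGAUSS": 6, "NDFUNC": 1, "NPFUNC": 1},
    "6-31+G(d)":  {"GBASIS": "N31", "NGAUSS": 6, "NDFUNC": 1, "DIFFSP": ".T."},
}


def basis_group(name):
    """Render a one-line $BASIS group; the '$' lands in column 2."""
    keywords = POPLE_TO_BASIS[name]
    body = " ".join(f"{key}={value}" for key, value in keywords.items())
    return f" $BASIS {body} $END"


print(basis_group("6-31G(d)"))
```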


Some key groups
$DATA - Gives the molecular geometry
  Title line (will be printed in output)
  Symmetry group
    C1
    CS
    CNV 2 (C2V), ...
  Blank line, except for C1
  Symbol Z xcoord ycoord zcoord
    Symbol = atomic symbol
    Z = atomic number
    xcoord, ycoord, zcoord = Cartesian coordinates
  Internal coordinates are another option


Some key groups
$DATA - continued
  Repeat this line for each symmetry unique atom (see below)
  Need to specify basis set after each coordinate line if $BASIS is not present
  Symmetry unique atoms
    H2O: O and 1 H
    NH3: N and 1 H
    Saves CPU time (e.g., numerical hessians only displace symmetry unique atoms)
  Need to follow conventions in GAMESS manual
    Cs, Cnh: plane is XY
    Cnv: axis is Z
    For Cinfv, use C4v
    For Dinfh, use D4h
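The $DATA layout can be assembled programmatically from the symmetry unique atoms. This is a minimal sketch: the helper is hypothetical, the water coordinates are illustrative rather than optimized, and a $BASIS group is assumed to be present so no basis lines follow the atoms.

```python
def data_group(title, symmetry, atoms):
    """Build a $DATA group from symmetry unique atoms.

    atoms: iterable of (symbol, atomic_number, x, y, z)
    """
    lines = [" $DATA", title, symmetry]
    if symmetry.strip().upper() != "C1":
        lines.append("")  # blank line is required for everything except C1
    for symbol, number, x, y, z in atoms:
        lines.append(f"{symbol} {number:.1f} {x:.4f} {y:.4f} {z:.4f}")
    lines.append(" $END")
    return "\n".join(lines)


# H2O in C2v: only O and one H are given; the C2 axis is Z.
water = data_group(
    "Water, illustrative geometry",
    "CNV 2",
    [("O", 8, 0.0, 0.0, 0.0), ("H", 1, 0.0, 0.7572, 0.5865)],
)
print(water)
```

The second hydrogen is generated by the symmetry operations, which is why it does not appear in the input.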


Some key groups
$GUESS - Initial MO guess
  Built-in guess (default) works much of the time
  $GUESS GUESS=MOREAD NORB=xx $END
    Requires $VEC group (usually from .dat file)
    NORB = # MO's to be read in
    Useful when SCF convergence is difficult
    Necessary for MCSCF, CI


GAMESS output
The log file output is intended to be human readable:

          --------------------------
             RHF SCF CALCULATION
          --------------------------

 NUCLEAR ENERGY =         8.9064898741
 MAXIT =   30     NPUNCH=    2
 EXTRAP=T  DAMP=F  SHIFT=F  RSTRCT=F  DIIS=F  DEM=F  SOSCF=F
 DENSITY MATRIX CONV=  1.00E-05
 MEMORY REQUIRED FOR RHF STEP=     30441 WORDS.

 ITER EX DEM    TOTAL ENERGY      E CHANGE   DENSITY CHANGE   DIIS ERROR
   1  0  0   -74.7936151096  -74.7936151096    .595010038    .000000000
   2  1  0   -74.9519661838    -.1583510742    .180249713    .000000000
 ...
  11  6  0   -74.9659012167    -.0000000014    .000018538    .000000000
  12  7  0   -74.9659012170    -.0000000003    .000008228    .000000000
  13  8  0   -74.9659012171    -.0000000001    .000003650    .000000000

          -----------------
          DENSITY CONVERGED
          -----------------
 TIME TO FORM FOCK OPERATORS=       .0 SECONDS (       .0 SEC/ITER)
 TIME TO SOLVE SCF EQUATIONS=       .0 SECONDS (       .0 SEC/ITER)

 FINAL RHF ENERGY IS      -74.9659012171 AFTER  13 ITERATIONS
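Since the log is plain text, the final energy can be pulled out with a regular expression. The pattern below is a sketch based only on the FINAL ... ENERGY line shown in this excerpt; other run types may word the line differently.

```python
import re

# Matches e.g. "FINAL RHF ENERGY IS  -74.9659012171 AFTER  13 ITERATIONS".
FINAL_ENERGY = re.compile(
    r"FINAL\s+(\w+)\s+ENERGY\s+IS\s*(-?\d*\.\d+)\s+AFTER\s+(\d+)\s+ITERATIONS"
)


def final_energy(log_text):
    """Return (scf_type, energy_in_hartree, iterations), or None if absent."""
    match = FINAL_ENERGY.search(log_text)
    if match is None:
        return None
    return match.group(1), float(match.group(2)), int(match.group(3))


sample = " FINAL RHF ENERGY IS      -74.9659012171 AFTER  13 ITERATIONS"
print(final_energy(sample))
```

A `None` result is itself useful information: it usually means the SCF never reached the final-energy line.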


The Dat file
The dat file contains formatted numerical data.
Useful, sometimes required, for restarts.
Contains items such as:
  MO Vectors ($VEC)
  Gradient ($GRAD) and Hessian ($HESS)
When copying a group, make sure you copy everything from the beginning $ sign through the corresponding $END.
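Copying a group by hand is error-prone for large $VEC or $HESS blocks. A minimal sketch of a helper that returns the first occurrence of a group, from its '$' line through the matching $END; the function name and the sample text are illustrative, not real GAMESS output.

```python
def extract_group(text, name):
    """Return the first group called `name`, from its '$' line through $END."""
    target = "$" + name.upper()
    collected = []
    inside = False
    for line in text.splitlines():
        tokens = line.upper().split()
        if not inside and tokens and tokens[0] == target:
            inside = True
        if inside:
            collected.append(line)
            if "$END" in tokens:
                return "\n".join(collected)
    return None  # group missing or never terminated


# Made-up .dat fragment for demonstration only.
dat_text = "\n".join([
    "--- some header text ---",
    " $VEC",
    " 1  1 9.94E-01 2.6E-02 0.0E+00",
    " $END",
    " $HESS",
    " 1  1 5.0E-01",
    " $END",
])
print(extract_group(dat_text, "VEC"))
```

A .dat file can hold several copies of the same group from successive steps, so check which one is needed before reusing it.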


GAMESS output
You will need to look at the log file to verify the results.
  Did the run finish correctly?
  Was the input specified correctly?
  Were there errors in the computation?
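A first pass over the log can be automated before reading it in detail. The sketch below looks for two markers: the DENSITY CONVERGED banner, which appears in the RHF example, and a TERMINATED NORMALLY line, which is an assumption about how a completed run ends and should be checked against real logs.

```python
def summarize_log(log_text):
    """Quick sanity flags for a log; not a substitute for reading it."""
    return {
        "scf_converged": "DENSITY CONVERGED" in log_text,
        # Assumed end-of-run marker; confirm against your own output.
        "terminated_normally": "TERMINATED NORMALLY" in log_text,
    }


# Made-up log fragment for demonstration only.
log_text = "\n".join([
    " DENSITY CONVERGED",
    " FINAL RHF ENERGY IS      -74.9659012171 AFTER  13 ITERATIONS",
    " EXECUTION OF GAMESS TERMINATED NORMALLY",
])
print(summarize_log(log_text))
```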


Running GAMESS
You frequently need the results from one run as input to another run.
  Restarting incomplete runs
  Multi-step problems
    A saddle point search might take several optimization and hessian computations followed by IRC computations.
    Multi-reference computations often need multiple runs to get the orbital guess correct.


Visualization
A number of programs can visualize GAMESS results to varying degrees.
MacMolPlt is one such program that has been specifically designed for visualizing GAMESS output.


Demo
This afternoon I will present a demo of running GAMESS and using MacMolPlt.


Acknowledgments
Mark Gordon
Dmitri Fedorov
The rest of the Gordon group in Ames


Financial Support
Air Force Office of Scientific Research
National Science Foundation
DoD CHSSI Software Development
DOE SciDAC Program
Ames Laboratory
DoD HPC Grand Challenge Program
