PETSc Tutorial - The ACTS Toolkit
PETSc Tutorial - The ACTS Toolkit
PETSc Tutorial - The ACTS Toolkit
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
<strong>PETSc</strong> <strong>Tutorial</strong><strong>PETSc</strong> TeamPresented by Matthew KnepleyMathematics and Computer Science DivisionArgonne National Laboratory<strong>ACTS</strong> WorkshopLawrence Berkeley National LaboratoryAugust 23, 2006M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 1 / 166
Units1Getting Started with <strong>PETSc</strong>2Common <strong>PETSc</strong> Usage3<strong>PETSc</strong> Essentials4<strong>PETSc</strong> Integration5Advanced <strong>PETSc</strong>6<strong>PETSc</strong> Extensibility7<strong>PETSc</strong> Optimization8Future PlansM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 2 / 166
Part IGetting Started with <strong>PETSc</strong>M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 3 / 166
Unit ObjectivesWhat is <strong>PETSc</strong>?Introduce thePortable Extensible <strong>Toolkit</strong> for Scientific ComputationRetrieve, Configure, Build, and Run a <strong>PETSc</strong> ExampleEmpower students to learn more about <strong>PETSc</strong>M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 4 / 166
What is <strong>PETSc</strong>?What I Need From YouTell me if you do not understandTell me if an example does not workSuggest better wording or figuresFollowup problems at petsc-maint@mcs.anl.govM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 5 / 166
What is <strong>PETSc</strong>?How Can We Help?Provide documentationAnswer email at petsc-maint@mcs.anl.govM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 6 / 166
What is <strong>PETSc</strong>?How Can We Help?Provide documentationQuickly answer questionsAnswer email at petsc-maint@mcs.anl.govM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 6 / 166
What is <strong>PETSc</strong>?How Can We Help?Provide documentationQuickly answer questionsHelp installAnswer email at petsc-maint@mcs.anl.govM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 6 / 166
What is <strong>PETSc</strong>?How Can We Help?Provide documentationQuickly answer questionsHelp installGuide large scale code developmentAnswer email at petsc-maint@mcs.anl.govM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 6 / 166
What is <strong>PETSc</strong>?<strong>The</strong> Role of <strong>PETSc</strong>Developing parallel, nontrivial PDE solvers that deliverhigh performance is still difficult and requiresmonths (or even years) of concentrated effort.<strong>PETSc</strong> is a toolkit that can ease these difficultiesand reduce the development time, but it is not ablack-box PDE solver, nor a silver bullet.M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 7 / 166
What is <strong>PETSc</strong>?What is <strong>PETSc</strong>?A freely available and supported research codeDownload from http://www.mcs.anl.gov/petscFree for everyone, including industrial usersHyperlinked manual, examples, and manual pages for all routinesHundreds of tutorial-style examplesSupport via email: petsc-maint@mcs.anl.govUsable from C, C++, Fortran 77/90, and PythonM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 8 / 166
What is <strong>PETSc</strong>?What is <strong>PETSc</strong>?Portable to any parallel system supporting MPI, including:Tightly coupled systemsCray T3E, SGI Origin, IBM SP, HP 9000, Sub EnterpriseLoosely coupled systems, such as networks of workstationsCompaq,HP, IBM, SGI, Sun, PCs running Linux or Windows<strong>PETSc</strong> HistoryBegun September 1991Over 8,500 downloads since 1995 (version 2), currently 250 per month<strong>PETSc</strong> Funding and SupportDepartment of EnergySciDAC, MICS ProgramNational Science FoundationCIG, CISE, Multidisciplinary Challenge ProgramM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 9 / 166
What is <strong>PETSc</strong>?TimelineActive Developers654321<strong>PETSc</strong> 1 release+Barry+Bill<strong>PETSc</strong> 2 release+LoisBill+Satish+Hong+KrisLois+Matt+VictorKris1991 1993 1995 1996 2000 2001 2003 2005M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 10 / 166
What is <strong>PETSc</strong>?What Can We Handle?<strong>PETSc</strong> has run problems with over 500 million unknownshttp://www.scconference.org/sc2004/schedule/pdfs/pap111.pdf
What is <strong>PETSc</strong>?What Can We Handle?<strong>PETSc</strong> has run problems with over 500 million unknownshttp://www.scconference.org/sc2004/schedule/pdfs/pap111.pdf<strong>PETSc</strong> has run on over 6,000 processors efficientlyftp://info.mcs.anl.gov/pub/tech reports/reports/P776.ps.Z
What is <strong>PETSc</strong>?What Can We Handle?<strong>PETSc</strong> has run problems with over 500 million unknownshttp://www.scconference.org/sc2004/schedule/pdfs/pap111.pdf<strong>PETSc</strong> has run on over 6,000 processors efficientlyftp://info.mcs.anl.gov/pub/tech reports/reports/P776.ps.Z<strong>PETSc</strong> applications have run at 2 TeraflopsLANL PFLOTRAN codeM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 11 / 166
Who uses and develops <strong>PETSc</strong>?Who Uses <strong>PETSc</strong>?Computational ScientistsPyLith (TECTON), Underworld, Columbia groupAlgorithm DevelopersIterative methods and Preconditioning researchersPackage DevelopersSLEPc, TAO, MagPar, StGermain, DealIIM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 12 / 166
Who uses and develops <strong>PETSc</strong>?<strong>The</strong> <strong>PETSc</strong> TeamHong Zhang Victor Eijkhout Lois McInnesM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 13 / 166Bill Gropp Barry Smith Satish BalayDinesh Kaushik Kris Buschelman Matt Knepley
How can I get <strong>PETSc</strong>?Downloading <strong>PETSc</strong><strong>The</strong> latest tarball is on the <strong>PETSc</strong> siteftp://ftp.mcs.anl.gov/pub/petsc/petsc.tar.gzWe no longer distribute patches (everything is in the distribution)<strong>The</strong>re is a Debian package<strong>The</strong>re is a FreeBSD Port<strong>The</strong>re is a Mercurial development repositoryM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 14 / 166
Cloning <strong>PETSc</strong>How can I get <strong>PETSc</strong>?<strong>The</strong> full development repository is open to the publichttp://mercurial.mcs.anl.gov/petsc/petsc-devhttp://mercurial.mcs.anl.gov/petsc/BuildSystemWhy is this better?You can clone to any release (or any specific ChangeSet)You can easily rollback changes (or releases)You can get fixes from us the same dayM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 15 / 166
How can I get <strong>PETSc</strong>?Unpacking <strong>PETSc</strong>Just clone development repositoryhg clone http://mercurial.mcs.anl.gov/petsc/petsc-devpetsc-devhg clone -rRelease-2.3.1 petsc-dev petsc-2.3.1orUnpack the tarballtar xzf petsc.tar.gzM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 16 / 166
Exercise 1How can I get <strong>PETSc</strong>?Download and Unpack <strong>PETSc</strong>!M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 17 / 166
How do I Configure <strong>PETSc</strong>?Configuring <strong>PETSc</strong>Set $PETSC DIR to the installtion root directoryRun the configuration utility$PETSC DIR/config/configure.py$PETSC DIR/config/configure.py --help$PETSC DIR/config/configure.py --download-mpich<strong>The</strong>re are many examples on the installation pageConfiguration files are placed in $PETSC DIR/bmake/$PETSC ARCH$PETSC ARCH has a default if not specifiedM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 18 / 166
How do I Configure <strong>PETSc</strong>?Configuring <strong>PETSc</strong>You can easily reconfigure with the same options./bmake/$PETSC ARCH/configure.pyCan maintain several different configurations./config/configure.py -PETSC ARCH=linux-fast--with-debugging=0All configuration information is in configure.logALWAYS send this file with bug reportsM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 19 / 166
How do I Configure <strong>PETSc</strong>?Automatic DownloadsStarting in 2.2.1, some packages are automaticallyDownloadedConfigured and Built (in $PETSC DIR/externalpackages)Installed in <strong>PETSc</strong>Currently works for<strong>PETSc</strong> documentation utilities (Sowing, lgrind, c2html)BLAS, LAPACK, BLACS, ScaLAPACK, PLAPACKMPICH, MPE, LAMParMetis, Chaco, Jostle, Party, ScotchMUMPS, Spooles, SuperLU, SuperLU Dist, UMFPackPrometheus, HYPRE, ML, SPAISundialsTriangle, TetGenM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 20 / 166
Exercise 2How do I Configure <strong>PETSc</strong>?Configure the <strong>PETSc</strong> that you downloaded and unpacked.M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 21 / 166
Building <strong>PETSc</strong>How do I Build <strong>PETSc</strong>?Uses recursive make starting in cd $PETSC DIRmakeCheck build when done with make testComplete log for each build in make log $PETSC ARCHALWAYS send this with bug reportsCan build multiple configurationsPETSC ARCH=linux-fast makeLibraries are in $PETSC DIR/lib/$PETSC ARCH/Can also build a subtreecd src/snes; makecd src/snes; make ACTION=libfast treeM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 22 / 166
Exercise 3How do I Build <strong>PETSc</strong>?Build the <strong>PETSc</strong> that you configured.M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 23 / 166
Exercise 4How do I Build <strong>PETSc</strong>?Reconfigure <strong>PETSc</strong> to use ParMetis.1 ./bmake/linux-gnu/configure.py-PETSC ARCH=linux-parmetis-download-parmetis2 PETSC ARCH=linux-parmetis make3 PETSC ARCH=linux-parmetis make testM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 24 / 166
How do I run an example?Running <strong>PETSc</strong>Try running <strong>PETSc</strong> examples firstcd $PETSC DIR/src/snes/examples/tutorialsBuild examples using make targetsmake ex5Run examples using the make targetmake runex5Can also run using MPI directlympirun ./ex5 -snes max it 5mpiexec ./ex5 -snes monitorM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 25 / 166
Using MPIHow do I run an example?<strong>The</strong> Message Passing Interface is:a library for parallel communicationa system for launching parallel jobs (mpirun/mpiexec)a community standardLaunching jobs is easympiexec -np 4 ./ex5You should never have to make MPI calls when using <strong>PETSc</strong>Almost neverM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 26 / 166
MPI ConceptsHow do I run an example?CommunicatorA context (or scope) for parallel communication (“Who can I talk to”)<strong>The</strong>re are two defaults:yourself (PETSC COMM SELF),and everyone launched (PETSC COMM WORLD)Can create new communicators by splitting existing onesEvery <strong>PETSc</strong> object has a communicatorPoint-to-point communicationHappens between two processes (like in MatMult())Reduction or scan operationsHappens among all processes (like in VecDot())M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 27 / 166
How do I run an example?Alternative Memory ModelsSingle process (address space) modelOpenMP and threads in generalFortran 90/95 and compiler-discovered parallelismSystem manages memory and (usually) thread schedulingNamed variables refer to the same storageSingle name space modelHPF, UPCGlobal ArraysTitaniumNamed variables refer to the coherent values (distribution is automatic)Distributed memory (shared nothing)Message passingNames variables in different processes are unrelatedM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 28 / 166
How do I run an example?Common Viewing OptionsGives a text representation-vec viewGenerally views subobjects too-snes viewCan visualize some objects-mat view drawAlternative formats-vec view binary, -vec view matlab, -vec view socketSometimes provides extra information-mat view info, -mat view info detailedM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 29 / 166
How do I run an example?Common Monitoring OptionsDisplay the residual-ksp monitor, graphically -ksp xmonitorCan disable dynamically-ksp cancelmonitorsDoes not display subsolvers-snes monitorCan use the true residual-ksp truemonitorCan display different subobjects-snes vecmonitor, -snes vecmonitor update,-snes vecmonitor residual-ksp gmres krylov monitorCan display the spectrum-ksp singmonitorM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 30 / 166
Exercise 5How do I run an example?Run SNES Example 5 using come custom options.1 cd $PETSC DIR/src/snes/examples/tutorials2 make ex53 mpiexec ./ex5 -snes monitor -snes view4 mpiexec ./ex5 -snes type tr -snes monitor -snes view5 mpiexec ./ex5 -ksp monitor -snes monitor -snes view6 mpiexec ./ex5 -pc type jacobi -ksp monitor -snes monitor-snes view7 mpiexec ./ex5 -ksp type bicg -ksp monitor -snes monitor-snes viewM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 31 / 166
Exercise 6How do I run an example?Create a new code based upon SNES Example 5.1 Create a new directorymkdir -p /home/knepley/proj/newsim/src2 Copy the sourcecp ex5.c /home/knepley/proj/newsim/src3 Create a <strong>PETSc</strong> makefileAdd a link target${CLINKER} -o $@ $^ ${PETSC SNES LIB}${FLINKER} -o $@ $^ ${PETSC FORTRAN LIB} ${PETSC SNES LIB}include ${PETSC DIR}/bmake/common/baseM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 32 / 166
How do I get more help?Getting More Helphttp://www.mcs.anl.gov/petscHyperlinked documentationManualManual pages for evey methodHTML of all example code (linked to manual pages)FAQFull support at petsc-maint@mcs.anl.govHigh profile usersDavid KeyesRich MartineauRichard KatzCharles WilliamsM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 33 / 166
Part IICommon <strong>PETSc</strong> UsageM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 34 / 166
Debugging <strong>PETSc</strong>Correctness DebuggingAutomatic generation of tracebacksDetecting memory corruption and leaksOptional user-defined error handlersM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 35 / 166
Debugging <strong>PETSc</strong>Interacting with the DebuggerLaunch the debugger-start in debugger [gdb,dbx,noxterm]-on error attach debugger [gdb,dbx,noxterm]Attach the debugger only to some parallel processes-debugger nodes 0,1Set the display (often necessary on a cluster)-display khan.mcs.anl.gov:0.0M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 36 / 166
Debugging TipsDebugging <strong>PETSc</strong>Putting a breakpoint in PetscError() can catch errors as they occur<strong>PETSc</strong> tracks memory overwrites at the beginning and end of arrays<strong>The</strong> CHKMEMQ macro causes a check of all allocated memoryTrack memory overwrites by bracketing them with CHKMEMQ<strong>PETSc</strong> checks for leaked memoryUse PetscMalloc() and PetscFree() for all allocationOption -trmalloc will print unfreed memory on PetscFinalize()M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 37 / 166
Exercise 1Debugging <strong>PETSc</strong>Use the debugger to find a SEGVLocate a memory overwrite using CHKMEMQ.Get the examplehg clone -r1.4bk://mercurial.mcs.anl.gov/petsc/tutorialExercise1Build the example makeRun it make run and watch the fireworksRun it under the debugger make debug and correct the errorBuild it and run again make ex1 run to catch the memory overwriteCorrect the error, build it and run again make ex1 runM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 38 / 166
Profiling <strong>PETSc</strong>Performance Debugging<strong>PETSc</strong> has integrated profilingOption -log summary prints a report on PetscFinalize()<strong>PETSc</strong> allows user-defined eventsEvents report time, calls, flops, communication, etc.Memory usage is tracked by objectProfiling is separated into stagesEvent statistics are aggregated by stageM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 39 / 166
Profiling <strong>PETSc</strong>Using Stages and EventsUse PetscLogStageRegister() to create a new stageStages are identifier by an integer handleUse PetscLogStagePush/Pop() to manage stagesStages may be nested and will aggregate in a nested fashionUse PetscLogEventRegister() to create a new stageEvents also have an associated classUse PetscLogEventBegin/End() to manage eventsEvents may also be nested and will aggregate in a nested fashionCan use PetscLogFlops() to log user flopsM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 40 / 166
Profiling <strong>PETSc</strong>Adding A Logging Stageint stageNum;ierr = PetscLogStageRegister(&stageNum, "name");CHKERRQ(ierr);ierr = PetscLogStagePush(stageNum);CHKERRQ(ierr);Code to Monitorierr = PetscLogStagePop();CHKERRQ(ierr);M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 41 / 166
Profiling <strong>PETSc</strong>Adding A Logging Eventstatic int USER EVENT;ierr = PetscLogEventRegister(&USER EVENT, "name", CLASS COOKIE);CHierr = PetscLogEventBegin(USER EVENT,0,0,0,0);CHKERRQ(ierr);Code to Monitorierr = PetscLogFlops(user event flops);CHKERRQ(ierr);ierr = PetscLogEventEnd(USER EVENT,0,0,0,0);CHKERRQ(ierr);M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 42 / 166
Profiling <strong>PETSc</strong>Adding A Logging Classstatic int CLASS COOKIE;ierr = PetscLogClassRegister(&CLASS COOKIE,"name");CHKERRQ(ierr);Cookie identifies a class uniquelyInitialization must happen before any objects of this type are createdM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 43 / 166
Profiling <strong>PETSc</strong>Matrix Memory Preallocation<strong>PETSc</strong> sparse matrices are dynamic data structurescan add additional nonzeros freelyDynamically adding many nonzerosrequires additional memory allocationsrequires copiescan kill performanceMemory preallocation providesthe freedom of dynamic data structuresgood performanceM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 44 / 166
Profiling <strong>PETSc</strong>Efficient Matrix CreationCreate matrix with MatCreate()Set type with MatSetType()Determine the number of nonzeros in each rowloop over the grid for finite differencesloop over the elements for finite elementsneed only local+ghost informationPreallocate matrixMatSeqAIJSetPreallocation()MatMPIAIJSetPreallocation()M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 45 / 166
Profiling <strong>PETSc</strong>Indicating Expected NonzerosSequential Sparse MatricesMatSeqAIJPreallocation(Mat A, int nz, int nnz[])nz: expected number of nonzeros in any rownz(i): expected number of nonzeros in row iM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 46 / 166
Profiling <strong>PETSc</strong>ParallelSparseMatrixEach process locally owns a submatrix of contiguous global rowsEach submatrix consists of diagonal and off-diagonal partsMatGetOwnershipRange(Mat A, int *start, int *end)start: first locally owned row of global matrixend-1: last locally owned row of global matrixM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 47 / 166
Profiling <strong>PETSc</strong>Indicating Expected NonzerosParallel Sparse MatricesMatMPIAIJPreallocation(Mat A, int dnz, int dnnz[], int onz,int onnz[])dnz: expected number of nonzeros in any row in the diagonal blocknz(i): expected number of nonzeros in row i in the diagonal blockonz: expected number of nonzeros in any row in the offdiagonal portionnz(i): expected number of nonzeros in row i in the offdiagonal portionM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 48 / 166
Profiling <strong>PETSc</strong>Verifying PreallocationUse runtime option -infoOutput:[proc #] Matrix size: %d X %d; storage space: %dunneeded, %d used[proc #] Number of mallocs during MatSetValues( ) is %dM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 49 / 166
Exercise 2Profiling <strong>PETSc</strong>Return to Part 1:Execise 6 and add more profiling.Run it make profile and look at the profiling reportAdd a new stage for setupAdd a new event for FormInitialGuess() and log the flopsRun it again make ex5 profile and look at the profiling reportM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 50 / 166
Part III<strong>PETSc</strong> EssentialsM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 51 / 166
<strong>PETSc</strong> principles and design<strong>PETSc</strong> StructureM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 52 / 166
<strong>PETSc</strong> principles and designFlow Control for a <strong>PETSc</strong> ApplicationMain RoutineTimestepping Solvers (TS)Nonlinear Solvers (SNES)Linear Solvers (KSP)Preconditioners (PC)<strong>PETSc</strong>ApplicationInitializationFunctionEvaluationJacobianEvaluationPostprocessingM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 53 / 166
<strong>PETSc</strong> principles and designLevels of AbstractionIn Mathematical SoftwareApplication-specific interfaceProgrammer manipulates objects associated with the applicationHigh-level mathematics interfaceProgrammer manipulates mathematical objectsWeak forms, boundary conditions, meshesAlgorithmic and discrete mathematics interfaceProgrammer manipulates mathematical objectsSparse matrices, nonlinear equationsProgrammer manipulates algorithmic objectsSolversLow-level computational kernelsBLAS-type operations, FFTM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 54 / 166
<strong>PETSc</strong> principles and designObject-Oriented DesignDesign based on operations you perform,not on the data in the objectExample: A vector isnot a 1d array of numbersan object allowing addition and scalar multiplication<strong>The</strong> efficient use of the computer is an added difficultyM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 55 / 166
<strong>PETSc</strong> principles and designSymmetry PrincipleInterfaces to mutable data must be symmetric.Creation and query interfaces are paired“No get without a set”Fairness“If you can do it, your users will want to do it”Openness“If you can do it, your users will want to undo it”M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 56 / 166
<strong>PETSc</strong> principles and designEmpiricism PrincipleInterfaces must allow easy testing and comparison.Swapping different implementations“You will not be smart enough to pick the solver”Commonly violated in FE codeElements are hard codedAlso avoid assuming structure outside of the interfaceMaking continuous fields have discrete structureTemptation to put metadata in a different placesM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 57 / 166
<strong>PETSc</strong> principles and designExperimentation is Essential!Proof is not currrently enough to examine solversN. M. Nachtigal, S. C. Reddy, and L. N. Trefethen,How fast are nonsymmetric matrix iterations?, SIAMJ. Matrix Anal. Appl., 13, pp.778–795, 1992.Anne Greenbaum, Vlastimil Ptak, and ZdenekStrakos, Any Nonincreasing Convergence Curve isPossible for GMRES, SIAM J. Matrix Anal. Appl., 17(3), pp.465–469, 1996.M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 58 / 166
<strong>PETSc</strong> principles and design<strong>The</strong> <strong>PETSc</strong> Programming ModelGoalsPortable, runs everywhereHigh performanceScalable parallelismApproachDistributed memory (“shared-nothing”)No special compilerAccess to data on remote machines through MPIHide within objects the details of the communicationUser orchestrates communication at a higher abstract levelM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 59 / 166
Collectivity<strong>PETSc</strong> principles and designMPI communicators (MPI Comm) specify collectivityProcesses involved in a computationConstructors are collective over a communicatorVecCreate(MPI Comm comm, Vec *x)Use PETSC COMM WORLD for all processes and PETSC COMM SELF for oneSome operations are collective, while others are notcollective: VecNorm()not collective: VecGetLocalSize()Sequences of collective calls must be in the same order on eachprocessM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 60 / 166
<strong>PETSc</strong> principles and designWhat is not in <strong>PETSc</strong>?Higher level representations of PDEsUnstructured mesh generation and manipulationDiscretizations, DealII<strong>PETSc</strong>-CS and SundanceLoad balancingSophisticated visualization capabilitiesMayaViEigenvaluesSLEPc and SIPOptimization and sensitivityTAO and VeltistoM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 61 / 166
<strong>PETSc</strong> principles and designBasic PetscObject UsageEvery object in <strong>PETSc</strong> supports a basic interfaceFunction OperationCreate() create the objectGet/SetName() name the objectGet/SetType() set the implementation typeGet/SetOptionsPrefix() set the prefix for all optionsSetFromOptions() customize object from the command lineSetUp() preform other initializationView() view the objectDestroy() cleanup object allocationAlso, all objects support the -help option.M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 62 / 166
Part IV<strong>PETSc</strong> IntegrationM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 63 / 166
Initial OperationsApplication IntegrationBe willing to experiment with algorithmsNo optimality without interplay between physics and algorithmicsAdopt flexible, extensible programmingAlgorithms and data structures not hardwiredBe willing to play with the real codeToy models are rarely helpfulIf possible, profile before upgrading or seeking helpAutomatic in <strong>PETSc</strong>M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 64 / 166
<strong>PETSc</strong> IntegrationInitial Operations<strong>PETSc</strong> is a set a library interfacesWe do not seize main()We do not control outputWe propagate errors from underlying packagesWe present the same interfaces in:CC++F77F90PythonSee Gropp in SIAM, OO Methods for Interop SciEng, ’99M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 65 / 166
Integration StagesInitial OperationsVersion ControlIt is impossible to overemphasizeInitializationLinking to <strong>PETSc</strong>ProfilingProfile before changingAlso incorporate command line processingLinear AlgebraFirst <strong>PETSc</strong> data structuresSolversVery easy after linear algebra is integratedM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 66 / 166
InitializationInitial OperationsCall PetscInitialize()Setup static data and servicesSetup MPI if it is not alreadyCall PetscFinalize()Calculates logging summaryShutdown and release resourcesChecks compile and linkM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 67 / 166
ProfilingInitial OperationsUse -log summary for a performance profileEvent timingEvent flopsMemory usageMPI messagesCall PetscLogStagePush() and PetscLogStagePop()User can add new stagesCall PetscLogEventBegin() and PetscLogEventEnd()User can add new eventsM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 68 / 166
Initial OperationsCommand Line ProcessingCheck for an optionPetscOptionsHasName()Retrieve a valuePetscOptionsGetInt(), PetscOptionsGetIntArray()Set a valuePetscOptionsSetValue()Clear, alias, reject, etc.M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 69 / 166
Vector AlgebraVector AlgebraWhat are <strong>PETSc</strong> vectors?Fundamental objects for storing field solutions, right-hand sides, etc.Each process locally owns a subvector of contiguous global dataHow do I create vectors?VecCreate(MPI Comm, Vec *)VecSetSizes(Vec, int n, int N)VecSetType(Vec, VecType typeName)VecSetFromOptions(Vec)Can set the type at runtimeM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 70 / 166
Vector AlgebraVector AlgebraA <strong>PETSc</strong> VecHas a direct interface to the valuesSupports all vector space operationsVecDot(), VecNorm(), VecScale()Has unusual operations, e.g. VecSqrt(), VecWhichBetween()Communicates automatically during assemblyHas customizable communication (scatters)M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 71 / 166
Creating a VectorVector AlgebraVec x;PetscInt N;PetscErrorCode ierr;ierr = PetscInitialize(&argc, &argv, PETSC NULL, PETSC NULL);CHKERRQierr = PetscOptionsGetInt(PETSC NULL, "-N", &N, PETSC NULL);CHKERierr = VecCreate(PETSC COMM WORLD, &x);CHKERRQ(ierr);ierr = VecSetSizes(x, PETSC DECIDE, N);CHKERRQ(ierr);ierr = VecSetType(x, "mpi");CHKERRQ(ierr);ierr = VecSetFromOptions(x);CHKERRQ(ierr);ierr = PetscFinalize();CHKERRQ(ierr);M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 72 / 166
Parallel AssemblyVectors and MatricesVector AlgebraProcesses may set an arbitrary entryMust use proper interfaceEntries need not be generated locallyLocal meaning the process on which they are stored<strong>PETSc</strong> automatically moves data if necessaryHappens during the assembly phaseM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 73 / 166
Vector AssemblyVector AlgebraA three step processEach process sets or adds valuesBegin communication to send values to the correct processComplete the communicationVecSetValues(Vec v, int n, int rows[], PetscScalarvalues[], mode)mode is either INSERT VALUES or ADD VALUESTwo phase assembly allows overlap of communication andcomputationVecAssemblyBegin(Vec v)VecAssemblyEnd(Vec v)M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 74 / 166
Vector AlgebraOne Way to Set the Elements of a Vectorierr = VecGetSize(x, &N);CHKERRQ(ierr);ierr = MPI Comm rank(PETSC COMM WORLD, &rank);CHKERRQ(ierr);if (rank == 0) {for(i = 0, val = 0.0; i < N; i++, val += 10.0) {ierr = VecSetValues(x, 1, &i, &val, INSERT VALUES);CHKERRQ(ierr);}}/* <strong>The</strong>se routines ensure that the data is distributed to the other processes */ierr = VecAssemblyBegin(x);CHKERRQ(ierr);ierr = VecAssemblyEnd(x);CHKERRQ(ierr);M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 75 / 166
Vector AlgebraA Better Way to Set the Elements of a Vectorierr = VecGetOwnershipRange(x, &low, &high);CHKERRQ(ierr);for(i = low, val = low*10.0; i < high; i++, val += 10.0) {ierr = VecSetValues(x, 1, &i, &val, INSERT VALUES);CHKERRQ(ierr);}/* <strong>The</strong>se routines ensure that the data is distributed to the other processes */ierr = VecAssemblyBegin(x);CHKERRQ(ierr);ierr = VecAssemblyEnd(x);CHKERRQ(ierr);M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 76 / 166
Vector AlgebraSelected Vector OperationsFunction NameOperationVecAXPY(Vec y, PetscScalar a, Vec x)y = y + a ∗ xVecAYPX(Vec y, PetscScalar a, Vec x)y = x + a ∗ yVecWAYPX(Vec w, PetscScalar a, Vec x, Vec y) w = y + a ∗ xVecScale(Vec x, PetscScalar a)x = a ∗ xVecCopy(Vec y, Vec x)y = xVecPointwiseMult(Vec w, Vec x, Vec y)w i = x i ∗ y iVecMax(Vec x, PetscInt *idx, PetscScalar *r) r = maxr iVecShift(Vec x, PetscScalar r)x i = x i + rVecAbs(Vec x) x i = |x i |VecNorm(Vec x, NormType type, PetscReal *r) r = ||x||M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 77 / 166
Vector AlgebraWorking With Local VectorsIt is sometimes more efficient to directly access local storage of a Vec.<strong>PETSc</strong> allows you to access the local storage withVecGetArray(Vec, double *[])You must return the array to <strong>PETSc</strong> when you finishVecRestoreArray(Vec, double *[])Allows <strong>PETSc</strong> to handle data structure conversionsCommonly, these routines are inexpensive and do not involve a copyM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 78 / 166
VecGetArray in CVector AlgebraVec v;PetscScalar *array;PetscInt n, i;PetscErrorCode ierr;ierr = VecGetArray(v, &array);CHKERRQ(ierr);ierr = VecGetLocalSize(v, &n);CHKERRQ(ierr);ierr = PetscSynchronizedPrintf(PETSC COMM WORLD,"First element of local array is %f\n", array[0]);CHKERRQ(ierr);ierr = PetscSynchronizedFlush(PETSC COMM WORLD);CHKERRQ(ierr);for(i = 0; i < n; i++) {array[i] += (PetscScalar) rank;}ierr = VecRestoreArray(v, &array);CHKERRQ(ierr);M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 79 / 166
VecGetArray in F77Vector Algebra#include "petsc.h"#include "petscvec.h"Vec v;PetscScalar array(1)PetscOffset offsetPetscInt n, iPetscErrorCode ierrcall VecGetArray(v, array, offset, ierr)call VecGetLocalSize(v, n, ierr)do i=1,narray(i+offset) = array(i+offset) + rankend docall VecRestoreArray(v, array, offset, ierr)M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 80 / 166
VecGetArray in F90Vector Algebra#include "petsc.h"#include "petscvec.h"Vec v;PetscScalar pointer :: array(:)PetscInt n, iPetscErrorCode ierrcall VecGetArrayF90(v, array, ierr)call VecGetLocalSize(v, n, ierr)do i=1,narray(i) = array(i) + rankend docall VecRestoreArrayF90(v, array, ierr)M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 81 / 166
Matrix AlgebraMatrix AlgebraWhat are <strong>PETSc</strong> matrices?Fundamental objects for storing stiffness matrices and JacobiansEach process locally owns a contiguous set of rowsSupports many data typesAIJ, Block AIJ, Symmetric AIJ, Block Diagonal, etc.Supports structures for many packagesMUMPS, Spooles, SuperLU, UMFPack, DSCPackM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 82 / 166
Matrix AlgebraHow do I create matrices?MatCreate(MPI Comm, Mat *)MatSetSizes(Mat, int m, int n, int M, int N)MatSetType(Mat, MatType typeName)MatSetFromOptions(Mat)Can set the type at runtimeMatSetValues(Mat,...)MUST be used, but does automatic communicationM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 83 / 166
Matrix AlgebraMatrix Polymorphism<strong>The</strong> <strong>PETSc</strong> Mat has a single user interface,Matrix assemblyMatSetValues()Matrix-vector multiplicationMatMult()Matrix viewingMatView()but multiple underlying implementations.AIJ, Block AIJ, Symmetric Block AIJ,DenseMatrix-Freeetc.A matrix is defined by its interface, not by its data structure.M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 84 / 166
Matrix AssemblyMatrix AlgebraA three step processEach process sets or adds valuesBegin communication to send values to the correct processComplete the communicationMatSetValues(Mat m, m, rows[], n, cols[], values[],mode)mode is either INSERT VALUES or ADD VALUESLogically dense block of valuesTwo phase assembly allows overlap of communication andcomputationMatAssemblyBegin(Mat m, type)MatAssemblyEnd(Mat m, type)type is either MAT FLUSH ASSEMBLY or MAT FINAL ASSEMBLYM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 85 / 166
Matrix AlgebraOne Way to Set the Elements of a MatrixSimple 3-point stencil for 1D Laplacianvalues[0] = −1.0; values[1] = 2.0; values[2] = −1.0;if (rank == 0) { /* Only one process creates matrix */for(row = 0; row < N; row++) {cols[0] = row−1; cols[1] = row; cols[2] = row+1;if (row == 0) {ierr = MatSetValues(A, 1, &row, 2, &cols[1], &values[1], INSERT VALUE} else if (row == N−1) {ierr = MatSetValues(A, 1, &row, 2, cols, values, INSERT VALUES);CHK} else {ierr = MatSetValues(A, 1, &row, 3, cols, values, INSERT VALUES);CHK}}}ierr = MatAssemblyBegin(A, MAT FINAL ASSEMBLY);CHKERRQ(ierr);ierr = MatAssemblyEnd(A, MAT FINAL ASSEMBLY);CHKERRQ(ierr);M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 86 / 166
Matrix AlgebraParallelSparseMatrixEach process locally owns a submatrix of contiguous global rowsEach submatrix consists of diagonal and off-diagonal partsMatGetOwnershipRange(Mat A, int *start, int *end)start: first locally owned row of global matrixend-1: last locally owned row of global matrixM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 87 / 166
Matrix AlgebraA Better Way to Set the Elements of a MatrixSimple 3-point stencil for 1D Laplacianvalues[0] = −1.0; values[1] = 2.0; values[2] = −1.0;for(row = start; row < end; row++) {cols[0] = row−1; cols[1] = row; cols[2] = row+1;if (row == 0) {ierr = MatSetValues(A, 1, &row, 2, &cols[1], &values[1], INSERT VALUES} else if (row == N−1) {ierr = MatSetValues(A, 1, &row, 2, cols, values, INSERT VALUES);CHKE} else {ierr = MatSetValues(A, 1, &row, 3, cols, values, INSERT VALUES);CHKE}}ierr = MatAssemblyBegin(A, MAT FINAL ASSEMBLY);CHKERRQ(ierr);ierr = MatAssemblyEnd(A, MAT FINAL ASSEMBLY);CHKERRQ(ierr);M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 88 / 166
Matrix AlgebraWhy Are <strong>PETSc</strong> Matrices That Way?No one data structure is appropriate for all problemsBlocked and diagonal formats provide significant performance benefits<strong>PETSc</strong> has many formats and makes it easy to add new data structuresAssembly is difficult enough without worrying about partitioning<strong>PETSc</strong> provides parallel assembly routinesAchieving high performance still requires making most operations localHowever, programs can be incrementally developed.Matrix decomposition in contiguous chunks is simpleMakes interoperation with other codes easierFor other ordering, <strong>PETSc</strong> provides “Application Orderings” (AO)M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 89 / 166
SolversAlgebraic SolvesExplicit:Field variables are updated using local neighbor informationSemi-implicit:Some subsets of variables are updated with global solvesOthers with direct local updatesImplicit:Most or all variables are updated in a single global solveM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 90 / 166
Linear SolversKrylov MethodsAlgebraic SolvesUsing <strong>PETSc</strong> linear algebra, just add:KSPSetOperators(KSP ksp, Mat A, Mat M, MatStructureflag)KSPSolve(KSP ksp, Vec b, Vec x)Can access subobjectsKSPGetPC(KSP ksp, PC *pc)Preconditioners must obey <strong>PETSc</strong> interfaceBasically just the KSP interfaceCan change solver dynamically from the command line, -ksp typeM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 91 / 166
Nonlinear SolversNewton MethodsAlgebraic SolvesUsing <strong>PETSc</strong> linear algebra, just add:SNESSetFunction(SNES snes, Vec r, residualFunc, void*ctx)SNESSetJacobian(SNES snes, Mat A, Mat M, jacFunc, void*ctx)SNESSolve(SNES snes, Vec b, Vec x)Can access subobjectsSNESGetKSP(SNES snes, KSP *ksp)Can customize subobjects from the cmd lineSet the subdomain preconditioner to ILU with -sub pc type iluM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 92 / 166
Basic Solver UsageAlgebraic SolvesWe will illustrate basic solver usage with SNES.Use SNESSetFromOptions() so that everything is set dynamicallyUse -snes type to set the type or take the defaultOverride the tolerancesUse -snes rtol and -snes atolView the solver to make sure you have the one you expectUse -snes viewFor debugging, monitor the residual decreaseUse -snes monitorUse -ksp monitor to see the underlying linear solverM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 93 / 166
Algebraic Solves3rd Party Solvers in <strong>PETSc</strong>1 Sequential LUILUDT (SPARSEKIT2, Yousef Saad, U of MN)EUCLID & PILUT (Hypre, David Hysom, LLNL)ESSL (IBM)SuperLU (Jim Demmel and Sherry Li, LBNL)MatlabUMFPACK (Tim Davis, U. of Florida)LUSOL (MINOS, Michael Saunders, Stanford)2 Parallel LUMUMPS (Patrick Amestoy, IRIT)SPOOLES (Cleve Ashcroft, Boeing)SuperLU Dist (Jim Demmel and Sherry Li, LBNL)3 Parallel CholeskyDSCPACK (Padma Raghavan, Penn. State)4 XYTlib - parallel direct solver (Paul Fischer and Henry Tufo, ANL)M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 94 / 166
Algebraic Solves3rd Party Preconditioners in <strong>PETSc</strong>1 Parallel ICCBlockSolve95 (Mark Jones and Paul Plassman, ANL)2 Parallel ILUBlockSolve95 (Mark Jones and Paul Plassman, ANL)3 Parallel Sparse Approximate InverseParasails (Hypre, Edmund Chow, LLNL)SPAI 3.0 (Marcus Grote and Barnard, NYU)4 Sequential Algebraic MultigridRAMG (John Ruge and Klaus Steuben, GMD)SAMG (Klaus Steuben, GMD)5 Parallel Algebraic MultigridPrometheus (Mark Adams, PPPL)BoomerAMG (Hypre, LLNL)ML (Trilinos, Ray Tuminaro and Jonathan Hu, SNL)M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 95 / 166
More AbstractionsHigher Level Abstractions<strong>The</strong> <strong>PETSc</strong> DA class is a topology interface.Structured grid interfaceFixed simple topologySupports stencils, communication, reorderingNo idea of operatorsNice for simple finite differences<strong>The</strong> <strong>PETSc</strong> DM class is a hierarchy interface.Supports multigridDMMG combines it with the MG preconditionerAbstracts the logic of multilevel methodsM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 96 / 166
More Abstractions3 Ways To Use <strong>PETSc</strong>User manages all topology (just use Vec and Mat)<strong>PETSc</strong> manages single topology (use DA)<strong>PETSc</strong> manages a hierarchy (use DM)M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 97 / 166
Part VAdvanced <strong>PETSc</strong>M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 98 / 166
SNESCallbacksNonlinear Equations<strong>The</strong> SNES interface is based upon callback functionsSNESSetFunction()SNESSetJacobian()When <strong>PETSc</strong> needs to evaluate the nonlinear residual F (x), the solvercalls the user’s function inside the application.<strong>The</strong> user function get application state through the ctx variable. <strong>PETSc</strong>never sees application data.M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 99 / 166
SNES FunctionNonlinear Equations<strong>The</strong> user provided function which calculates the nonlinear residual hassignaturePetscErrorCode (*func)(SNES snes, Vec x, Vec r, void *ctx)x: <strong>The</strong> current solutionr: <strong>The</strong> residualctx: <strong>The</strong> user context passed to SNESSetFunction()Use this to pass application information, e.g. physical constantsM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 100 / 166
SNES JacobianNonlinear Equations<strong>The</strong> user provided function which calculates the Jacobian has signaturePetscErrorCode (*func)(SNES snes, Vec x, Mat *J, Mat *M,MatStructure *flag, void *ctx)x: <strong>The</strong> current solutionJ: <strong>The</strong> JacobianM: <strong>The</strong> Jacobian preconditioning matrix (possibly J itself)ctx: <strong>The</strong> user context passed to SNESSetFunction()Use this to pass application information, e.g. physical constantsPossible MatrStructure values are:SAME NONZERO PATTERN, DIFFERENT NONZERO PATTERN,. . .Alternatively, you can usea builtin sparse finite difference approximationautomatic differentiationAD support via ADIC/ADIFOR (P. Hovland and B. Norris from ANL)M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 101 / 166
SNES VariantsNonlinear EquationsLine search strategiesTrust region approachesPseudo-transient continuationMatrix-free variantsM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 102 / 166
Nonlinear EquationsFinite Difference Jacobians<strong>PETSc</strong> can compute and explicitly store a Jacobian via 1st-order FDDenseActivated by -snes fdComputed by SNESDefaultComputeJacobian()Sparse via coloringsColoring is created by MatFDColoringCreate()Computed by SNESDefaultComputeJacobianColor()Can also use Matrix-free Newton-Krylov via 1st-order FDActivated by -snes mf without preconditioningActivated by -snes mf operator with user-defined preconditioningUses preconditioning matrix from SNESSetJacobian()M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 103 / 166
Nonlinear EquationsSNES Example: Driven CavityVelocity-vorticity formulationFlow driven by lid and/or bouyancyLogically regular gridParallelized with DAFinite difference discretizationAuthored by David Keyes$PETCS DIR/src/snes/examples/tutorials/ex19.cM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 104 / 166
Nonlinear EquationsDriven Cavity Application Contexttypedef struct {/*—– basic application data —–*/double lid velocity; /* Velocity of the lid */double prandtl, grashof; /* Prandtl and Grashof numbers */int mx, my; /* Grid points in x and y */int mc; /* Degrees of freedom per node */PetscTruth draw contours; /* Flag for drawing contours *//*—– parallel data —–*/MPI Comm comm; /* Communicator */DA da; /* Distributed array */Vec localX, localF; /* Local ghosted solution and residual */} AppCtx;$PETCS DIR/src/snes/examples/tutorials/ex19.cM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 105 / 166
Nonlinear EquationsDriven Cavity Residual EvaluationDrivenCavityFunction(SNES snes, Vec X, Vec F, void *ptr){AppCtx *user = (AppCtx *) ptr;int istart, iend, jstart, jend; /* local starting and ending grid pointPetscScalar *f; /* local vector data */PetscReal grashof = user−>grashof;PetscReal prandtl = user−>prandtl;PetscErrorCode ierr;/* Not Shown: Code to communicate nonlocal ghost point data (scatters) */ierr = VecGetArray(F, &f); CHKERRQ(ierr);/* Not Shown: Code to compute local function components */ierr = VecRestoreArray(F, &f); CHKERRQ(ierr);return 0;}$PETCS DIR/src/snes/examples/tutorials/ex19.cM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 106 / 166
Nonlinear EquationsBetter Driven Cavity Residual EvaluationPetscErrorCode DrivenCavityFuncLocal(DALocalInfo *info,Field **x,Field **f,v{/* Not Shown: Handle boundaries *//* Compute over the interior points */for(j = info−>ys; j < info−>xs+info−>xm; j++) {for(i = info−>xs; i < info−>ys+info−>ym; i++) {/* Not Shown: convective coefficients for upwinding *//* U velocity */u = x[j][i].u;uxx = (2.0*u − x[j][i−1].u − x[j][i+1].u)*hydhx;uyy = (2.0*u − x[j−1][i].u − x[j+1][i].u)*hxdhy;f[j][i].u = uxx + uyy − .5*(x[j+1][i].omega−x[j−1][i].omega)*hx;/* Not Shown: V velocity, Omega, Temperature */}}}$PETCS DIR/src/snes/examples/tutorials/ex19.cM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 107 / 166
What is a DA?Structured GridsDA is a topology interface handling parallel data layout on structured gridsHandles local and global indicesDAGetGlobalIndices() and DAGetAO()Provides local and global vectorsDAGetGlobalVector() and DAGetLocalVector()Handles ghost values coherenceDAGetGlobalToLocal() and DAGetLocalToGlobal()M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 108 / 166
Creating a DAStructured GridsDACreate1d(comm, DAPeriodicType wrap, M, dof, s, lm[], DA*da)wrap: Specifies periodicityDA NONPERIODIC or DA XPERIODICM: Number of grid points in x-directiondof: Degrees of freedom per nodes: <strong>The</strong> stencil widthlm: Alternative array of local sizesUse PETSC NULL for the defaultM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 109 / 166
Creating a DAStructured GridsDACreate2d(comm, wrap, type, M, N, m, n, dof, s, lm[],ln[], DA *da)wrap: Specifies periodicityDA NONPERIODIC, DA XPERIODIC, DA YPERIODIC, or DA XYPERIODICtype: Specifies stencilDA STENCIL BOX or DA STENCIL STARM/N: Number of grid points in x/y-directionm/n: Number of processes in x/y-directiondof: Degrees of freedom per nodes: <strong>The</strong> stencil widthlm/n: Alternative array of local sizesUse PETSC NULL for the defaultM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 110 / 166
Ghost ValuesStructured GridsTo evaluate a local function f (x), each process requiresits local portion of the vector xits ghost values, bordering portions of x owned by neighboringprocessesLocal NodeGhost NodeM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 111 / 166
Structured GridsDA Global NumberingsProc 2 Proc 325 26 27 28 2920 21 22 23 2415 16 17 18 1910 11 12 13 145 6 7 8 90 1 2 3 4Proc 0 Proc 1Natural numberingProc 2 Proc 321 22 23 28 2918 19 20 26 2715 16 17 24 256 7 8 13 143 4 5 11 120 1 2 9 10Proc 0 Proc 1<strong>PETSc</strong> numberingM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 112 / 166
Structured GridsDA Global vs. Local NumberingGlobal: Each vertex belongs to a unique process and has a unique idLocal: Numbering includes ghost vertices from neighboring processesProc 2 Proc 3X X X X XX X X X X12 13 14 15 X8 9 10 11 X4 5 6 7 X0 1 2 3 XProc 0 Proc 1Local numberingProc 2 Proc 321 22 23 28 2918 19 20 26 2715 16 17 24 256 7 8 13 143 4 5 11 120 1 2 9 10Proc 0 Proc 1Global numberingM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 113 / 166
DA VectorsStructured Grids<strong>The</strong> DA object contains only layout (topology) informationAll field data is contained in <strong>PETSc</strong> VecsGlobal vectors are parallelEach process stores a unique local portionDACreateGlobalVector(DA da, Vec *gvec)Local vectors are sequential (and usually temporary)Each process stores its local portion plus ghost valuesDACreateLocalVector(DA da, Vec *lvec)includes ghost values!M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 114 / 166
Updating GhostsStructured GridsTwo-step process enables overlapping computation and communicationDAGlobalToLocalBegin(da, gvec, mode, lvec)gvec provides the datamode is either INSERT VALUES or ADD VALUESlvec holds the local and ghost valuesDAGlobalToLocal End(da, gvec, mode, lvec)Finishes the communication<strong>The</strong> process can be reversed with DALocalToGlobal().M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 115 / 166
DA Local FunctionStructured Grids<strong>The</strong> user provided function which calculates the nonlinear residual in 2Dhas signaturePetscErrorCode (*lfunc)(DALocalInfo *info, PetscScalar **x,PetscScalar **r, void *ctx)info: All layout and numbering informationx: <strong>The</strong> current solutionNotice that it is a multidimensional arrayr: <strong>The</strong> residualctx: <strong>The</strong> user context passed to DASetLocalFunction()<strong>The</strong> local DA function is activated by callingSNESSetFunction(snes, r, SNESDAFormFunction, ctx)M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 116 / 166
DA Local JacobianStructured Grids<strong>The</strong> user provided function which calculates the nonlinear residual in 2Dhas signaturePetscErrorCode (*lfunc)(DALocalInfo *info, PetscScalar **x,Mat J, void *ctx)info: All layout and numbering informationx: <strong>The</strong> current solutionJ: <strong>The</strong> Jacobianctx: <strong>The</strong> user context passed to DASetLocalFunction()<strong>The</strong> local DA function is activated by callingSNESSetJacobian(snes, J, J, SNESDAComputeJacobian, ctx)M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 117 / 166
DA StencilsStructured GridsBoth the box stencil and star stencil are available.proc 10proc 10proc 0 proc 1proc 0 proc 1Box StencilStar StencilM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 118 / 166
Structured GridsSetting Values for Finite Differences<strong>PETSc</strong> providesMatSetValuesStencil(Mat A, m, MatStencil idxm[], n,MatStencil idxn[], values[], mode)Each row or column is actually a MatStencilThis specifies grid coordinates and a component if necessaryCan imagine for unstructured grids, they are vertices<strong>The</strong> values are the same logically dense block in rows and columnsM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 119 / 166
Structured GridsMapping Between Global NumberingsNatural global numberingConvenient for visualization, boundary conditions, etc.Convert between global numbering schemes using AODAGetAO(DA da, AO *ao)Handled automatically by some utilities (e.g., VecView()) for DAvectorsM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 120 / 166
Structured GridsDA Example: BratuCreate SNES and DAUse DASetLocalFunction() and DASetLocalJacobian() to setuser callbacksUse DAGetMatrix() to get DA matrix for SNESUse SNESDAFormFunction() and SNESDAComputeJacobian() forSNES callbackCould also use FormFunctionMatlab()Could also use SNESDefaultComputeJacobian()$PETCS DIR/src/snes/examples/tutorials/ex5.cM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 121 / 166
Structured GridsDA Example: BratuPetscErrorCode LocalFunc(DALocalInfo *info, PetscScalar **x, PetscScalar **f{for(j = info−>ys; j < info−>ys + info−>ym; j++) {for(i = info−>xs; i < info−>xs + info−>xm; i++) {if (i == 0 | | j == 0 | | i == info−>mx−1 | | j == info−>my−1) {f[j][i] = x[j][i];} else {u = x[j][i];u xx = −(x[j][i+1] − 2.0*u + x[j][i−1])*(hy/hx);u yy = −(x[j+1][i] − 2.0*u + x[j−1][i])*(hx/hy);f[j][i] = u xx + u yy − hx*hy*lambda*PetscExpScalar(u);}}}}$PETCS DIR/src/snes/examples/tutorials/ex5.cM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 122 / 166
Structured GridsDA Example: Bratuint LocalJac(DALocalInfo *info, PetscScalar **x, Mat jac, void *ctx){for(j = info−>ys; j < info−>ys + info−>ym; j++) {for(i = info−>xs; i < info−>xs + info−>xm; i++) {row.j = j; row.i = i;if (i == 0 | | j == 0 | | i == info−>mx−1 | | j == info−>my−1) {v[0] = 1.0;ierr = MatSetValuesStencil(jac, 1, &row, 1, &row, v, INSERT VALUES} else {v[0] = −(hx/hy); col[0].j = j−1; col[0].i = i;v[1] = −(hy/hx); col[1].j = j; col[1].i = i−1;v[2] = 2.0*(hy/hx+hx/hy) − hx*hy*lambda*PetscExpScalar(x[j][i]);v[3] = −(hy/hx); col[3].j = j; col[3].i = i+1;v[4] = −(hx/hy); col[4].j = j+1; col[4].i = i;ierr = MatSetValuesStencil(jac, 1, &row, 5, col, v, INSERT VALUES);C} } } }$PETCS DIR/src/snes/examples/tutorials/ex5.cM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 123 / 166
Part VI<strong>PETSc</strong> ExtensibilityM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 124 / 166
Extending the BuildExtending ConfigureSee python/<strong>PETSc</strong>/packages/*.py for examplesAdd module with class Configure derived fromconfig.base.ConfigureCan also derive from <strong>PETSc</strong>.package.PackageImplement configure() and configureHelp()Customize <strong>PETSc</strong> through the make systemaddDefine()addTypedef(), addPrototype()addMakeMacro(), addMakeRule()Deprecated addSubstitution()M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 125 / 166
Linking to <strong>PETSc</strong>Extending the BuildNothing but the librariesUser can custom linkUsing only <strong>PETSc</strong> build variablesInclude bmake/common/variablesAlso use <strong>PETSc</strong> build rulesInclude bmake/common/baseAlso makes available 3rd party packagesM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 126 / 166
Extending ClassesLayering over <strong>PETSc</strong>SLEPc, TAO, and MagParInfrastructure and linear algebraUse <strong>PETSc</strong> object structureDynamic dispatchUse dynamic linking facilitiesRuntime type selectionUse debugging and profiling toolsMemory management, runtime type checkingM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 127 / 166
Extending ClassesAdding an ImplementationSee src/ksp/pc/impls/jacobi/jacobi.cImplement the interface methodsFor Jacobi, PCSetUp(), PCApply(),...Define a constructorAllocate and initialize the class structureFill in the function tableMust have C linkageRegister the constructorSee src/ksp/ksp/interface/dlregis.cMaps a string (class name) to the constructorUsually uses PetscFListAdd()M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 128 / 166
Extending ClassesAdding a New WrapperSee src/ts/impls/implicit/pvode/petscpvode.cJust like an ImplementationMethods dispatch to 3rd party softwareNeed to alter local makefileAdd a requirespackage lineAdd include variable to CPPFLAGSUsually requires configure additionsM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 129 / 166
Extending ClassesAdding a New SubtypeSee src/mat/impls/aij/seq/umfpack/umfpack.cHave to virtualize methods by handDefine a constructorChange type name first to correct nameCall MatSetType() for base typeReplace (and save) overidden methodsConstruct any specific dataMust also define a conversion to the base typeOnly called in destructor right nowM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 130 / 166
Extending ClassesAdding a New TypeSee src/ksp/ksp/kspimpl.hDefine a methods structure (interface)A list of function pointersDefine a type structureFirst member is PETSCHEADER(struct Ops)Possibly other data members common to the typeA void *data for implementation structuresM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 131 / 166
Extending ClassesAdding a New TypeSee src/ksp/ksp/interface/dlregis.cDefine a package initializer (PetscDLLibraryRegister)Called when the DLL is loadedAlso called from generic create if linking staticallyRegisters classes and events (see below)Registers any default implementation constructorsSetup any info or summary exclusionsM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 132 / 166
Extending ClassesAdding a New TypeSee src/ksp/ksp/interface/itcreate.cDefine a generic createCall package initializer if linking staticallyCall PetscHeaderCreate()Initialize any default data membersDefine a setType() methodCall the destructor of any current implementationCall the constructor of the given implementationSet the type nameM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 133 / 166
Extending ClassesAdding a New TypeThings Swept Under <strong>The</strong> RugNeed setFromOptions() which allows implementation selection atruntimeHave to manage the database of registered constructorsView and destroy functions handled a little funny due to historicalreasonsM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 134 / 166
Part VII<strong>PETSc</strong> OptimizationM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 135 / 166
Marc Garbey’s ProblemProblem Domainρ=1ρ=100M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 136 / 166
Marc Garbey’s ProblemProblem StatementInhomogeneous Laplacian in 2D. Modeled by the partial differentialequation∇ · (ρ∇u) = f on Ω = [0, 1] × [0, 1],with forcing functionwith Dirichlet boundary conditionsf(x, y) = e −(1−x)2 /ν e −(1−y)2 /ν ,u = f(x, y) on ∂Ω,or homogeneous Neumman boundary conditionsˆn · ∇u = 0 on ∂Ω.M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 137 / 166
Revision 1.1Marc Garbey’s ProblemFor the initial try, we modify a common <strong>PETSc</strong> DMMG skeleton:Use a simple FD 5-point stencil discretizationUse a structured grid (DA)Use a hierarchical method (DMMG)Only implement Dirichlet BC (simple masking)M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 138 / 166
Revision 1.2Marc Garbey’s ProblemNow utilize some more <strong>PETSc</strong> features:Add UserContext structure to hold ν and the BC typeNeed to set the context at each DM levelAdd Neumann BC using a MatNullSpaceUsed to project onto the orthogonal complementKSPSetNullSpace()Set parameters from the command linePetscOptionsBegin(), PetscOptionsEnd()PetscOptionsScalar(), PetscOptionsString()By hand, PetscOptionsGetScalar(), PetscOptionsGetString()Fixed scaling for anisotropic gridsM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 139 / 166
Revision 1.6Marc Garbey’s ProblemBarry fixed the example to converge nicely:Set nullspace on all DM levelsActually set in the smoother (KSP)Same idea as the user contextNow completely handled by DMMGSetNullSpace()Remove the null space component of the rhsMatNullSpaceRemove()Usually handled by the modelAdd a shift to the coarse grid LU for Neumann BCSystem is singular so augment with the identityOne extra step if coarse solve is redundantFix weighting for center point of Neumann conditionDepends on the number of missing pointsAlso use PetscOptionsElist() to set BCCan provide a nice listbox using the GUIM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 140 / 166
DMMG GridsMarc Garbey’s Problem<strong>The</strong> use specifies the coarse grid, and then DMMG successively refines it.In our problem, we being with a 3 × 3 gridWe LU factor a 9 × 9 matrixBy level 10, we have a 1025 × 1025 gridOur final solution has 1,050,625 unknownsSet the initial grid using -da grid x and -da grid ySet the number of levels using -dmmg nlevelsM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 141 / 166
Marc Garbey’s ProblemMesh Independence<strong>The</strong> iteration number should be independent of the mesh size, or thenumber of levels.Levels Unknowns KSP Iterations2 25 24 289 26 4225 38 66049 210 1050625 2We have used Dirichlet BC aboveM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 142 / 166
Viewing the DAMarc Garbey’s Problem-da view outputs an ascii description-da view draw display the gridFirst the grid itself is drawnGlobal variable numbers are then providedFinally ghost variable numbers are shown for error checking-vec view draw draws a contour plot for DA vectors<strong>The</strong> contour grid can be shown with -draw contour gridM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 143 / 166
Marc Garbey’s ProblemForcing Function<strong>The</strong> rhs of our linear system drives the solution:M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 144 / 166
Marc Garbey’s ProblemDirichlet SolutionM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 145 / 166
Marc Garbey’s ProblemNeumann SolutionM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 146 / 166
Marc Garbey’s ProblemMultigrid OptionsChoose V-cycle or W-cycle using -pc mg cyclesCan set the iteration method using -pc mg typeMULTIPLICATIVE, ADDITIVE, FULL, KASKADEChoose the number of steps in both the up and down smoothers-pc mg smoothup, -pc mg smoothdown<strong>The</strong> coarse problem has prefix mg coarse-mg coarse pc type, -mg coarse ksp maxitEach level k has prefix mg levels k-mg levels 1 ksp type, -mg levels 2 pc ilu fillCan automatically form coarse operators with the Galerkin process-pc mg galerkinDMMG provides these automatically by interpolationM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 147 / 166
Part VIIIFuture PlansM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 148 / 166
What is New?New classes and tools will be added to the existing <strong>PETSc</strong>framework:Unstructured mesh generation, refinement, and coarseningUnstructured multilevel algorithmsA high-level language for specification of weak forms (with FFC)Automatic generation of arbitrary finite elements (with FIAT)Platform independent, system for configure, build, anddistributionM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 149 / 166
<strong>PETSc</strong> PartsTo make the new functionality easier for users to understand, <strong>PETSc</strong> willbe divided conceptually into two parts:<strong>PETSc</strong>-AS will contain the components related to algebraic solvers,which is the current <strong>PETSc</strong> distribution.<strong>PETSc</strong>-CS will contain the support for continuum problems phrasedin weak form. <strong>The</strong>se modules will make use of the <strong>PETSc</strong>-ASmodules in our implementation, but this is not strictly necessary.However, the framework will remain integrated:Single release version, version control, and build systemEach module is self-contained (install only what is necessary)M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 150 / 166
BuildSystemBuildPure Python and freely availableCurrently handles configure and buildneeds install and distributionHandles generated code through ASEcan uses md5, timestamp, diff, etc.Handles shared libraries for many architecturesLinux, Mac OSX, WindowsM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 151 / 166
Mesh CapabilitiesMesh2D Delaunay generation and refinementTriangle3D Delaunay generation and refinementTetGen3D Delaunay generation, coarsening, and refinementAllows quadratic Bezier elementsAllows efficient, dynamic mesh updateTUMBLEM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 152 / 166
How does Sieve Work?MeshApplication codes should only rarely use low-level operations.create, distribute
How does Sieve Work?MeshApplication codes should only rarely use low-level operations.create, distributeCan create from a file or boundary
How does Sieve Work?MeshApplication codes should only rarely use low-level operations.create, distributeCan create from a file or boundaryrefine, coarsen
How does Sieve Work?MeshApplication codes should only rarely use low-level operations.create, distributeCan create from a file or boundaryrefine, coarsenCan refine or coarsen no matter the mesh source
How does Sieve Work?MeshApplication codes should only rarely use low-level operations.create, distributeCan create from a file or boundaryrefine, coarsenCan refine or coarsen no matter the mesh sourceoverlap, fusion
How does Sieve Work?MeshApplication codes should only rarely use low-level operations.create, distributeCan create from a file or boundaryrefine, coarsenCan refine or coarsen no matter the mesh sourceoverlap, fusionAllow fine grain parallelism, and enable Field operations
How does Sieve Work?MeshApplication codes should only rarely use low-level operations.create, distributeCan create from a file or boundaryrefine, coarsenCan refine or coarsen no matter the mesh sourceoverlap, fusionAllow fine grain parallelism, and enable Field operationsImplemented by low-level Sieve operationM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 153 / 166
How does Field Work?MeshApplication codes should use Fields for any data organized over a meshsetChart, setFiberDimension
How does Field Work?MeshApplication codes should use Fields for any data organized over a meshsetChart, setFiberDimensionSetup aribtrary data layout over a mesh
How does Field Work?MeshApplication codes should use Fields for any data organized over a meshsetChart, setFiberDimensionSetup aribtrary data layout over a meshrestrict, update
How does Field Work?MeshApplication codes should use Fields for any data organized over a meshsetChart, setFiberDimensionSetup aribtrary data layout over a meshrestrict, updateControl data flow between levels of a hierarchy
How does Field Work?MeshApplication codes should use Fields for any data organized over a meshsetChart, setFiberDimensionSetup aribtrary data layout over a meshrestrict, updateControl data flow between levels of a hierarchyHeart of FEM, Multigrid, Domain Decomposition
How does Field Work?MeshApplication codes should use Fields for any data organized over a meshsetChart, setFiberDimensionSetup aribtrary data layout over a meshrestrict, updateControl data flow between levels of a hierarchyHeart of FEM, Multigrid, Domain DecompositiondistributeSection, distributeSieve
How does Field Work?MeshApplication codes should use Fields for any data organized over a meshsetChart, setFiberDimensionSetup aribtrary data layout over a meshrestrict, updateControl data flow between levels of a hierarchyHeart of FEM, Multigrid, Domain DecompositiondistributeSection, distributeSievePersistent communication structures for irregular dataM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 154 / 166
AdvantagesMeshSieve/Field operations enable algorithms which are:Dimension independent
AdvantagesMeshSieve/Field operations enable algorithms which are:Dimension independentSame partitioning code with in 1D, 2D, and 3D
AdvantagesMeshSieve/Field operations enable algorithms which are:Dimension independentSame partitioning code with in 1D, 2D, and 3DShape independent
AdvantagesMeshSieve/Field operations enable algorithms which are:Dimension independentSame partitioning code with in 1D, 2D, and 3DShape independentSame assembly code works for tets and quads
AdvantagesMeshSieve/Field operations enable algorithms which are:Dimension independentSame partitioning code with in 1D, 2D, and 3DShape independentSame assembly code works for tets and quadsDiscretization independent
AdvantagesMeshSieve/Field operations enable algorithms which are:Dimension independentSame partitioning code with in 1D, 2D, and 3DShape independentSame assembly code works for tets and quadsDiscretization independentSame assembly code works for any finite element
AdvantagesMeshSieve/Field operations enable algorithms which are:Dimension independentSame partitioning code with in 1D, 2D, and 3DShape independentSame assembly code works for tets and quadsDiscretization independentSame assembly code works for any finite elementOnly variation is in local integration routine (tied to weak form)
AdvantagesMeshSieve/Field operations enable algorithms which are:Dimension independentSame partitioning code with in 1D, 2D, and 3DShape independentSame assembly code works for tets and quadsDiscretization independentSame assembly code works for any finite elementOnly variation is in local integration routine (tied to weak form)Transparent to parallelismM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 155 / 166
MeshWhat is Sieve Good For?Parallelization
MeshWhat is Sieve Good For?ParallelizationHierarchical, unstructured data layout
MeshWhat is Sieve Good For?ParallelizationHierarchical, unstructured data layoutSolution fields, esp. higher order, adaptive, hybrid
MeshWhat is Sieve Good For?ParallelizationHierarchical, unstructured data layoutSolution fields, esp. higher order, adaptive, hybridAuxiliary fields, e.g. viscosity, energy density
MeshWhat is Sieve Good For?ParallelizationHierarchical, unstructured data layoutSolution fields, esp. higher order, adaptive, hybridAuxiliary fields, e.g. viscosity, energy densityComplex topologies
MeshWhat is Sieve Good For?ParallelizationHierarchical, unstructured data layoutSolution fields, esp. higher order, adaptive, hybridAuxiliary fields, e.g. viscosity, energy densityComplex topologiesSphere, torus
MeshWhat is Sieve Good For?ParallelizationHierarchical, unstructured data layoutSolution fields, esp. higher order, adaptive, hybridAuxiliary fields, e.g. viscosity, energy densityComplex topologiesSphere, torusFault systems
MeshWhat is Sieve Good For?ParallelizationHierarchical, unstructured data layoutSolution fields, esp. higher order, adaptive, hybridAuxiliary fields, e.g. viscosity, energy densityComplex topologiesSphere, torusFault systemsBlock material models
MeshWhat is Sieve Good For?ParallelizationHierarchical, unstructured data layoutSolution fields, esp. higher order, adaptive, hybridAuxiliary fields, e.g. viscosity, energy densityComplex topologiesSphere, torusFault systemsBlock material modelsComplex communication patterns
MeshWhat is Sieve Good For?ParallelizationHierarchical, unstructured data layoutSolution fields, esp. higher order, adaptive, hybridAuxiliary fields, e.g. viscosity, energy densityComplex topologiesSphere, torusFault systemsBlock material modelsComplex communication patternsAutomatic discovery (∆)
MeshWhat is Sieve Good For?ParallelizationHierarchical, unstructured data layoutSolution fields, esp. higher order, adaptive, hybridAuxiliary fields, e.g. viscosity, energy densityComplex topologiesSphere, torusFault systemsBlock material modelsComplex communication patternsAutomatic discovery (∆)Arbitrary domains and data layout
MeshWhat is Sieve Good For?ParallelizationHierarchical, unstructured data layoutSolution fields, esp. higher order, adaptive, hybridAuxiliary fields, e.g. viscosity, energy densityComplex topologiesSphere, torusFault systemsBlock material modelsComplex communication patternsAutomatic discovery (∆)Arbitrary domains and data layoutPersistent communication structures (<strong>PETSc</strong> VecScatter)M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 156 / 166
IntegrationMeshTwo main integration paradigms for SieveParallelizaion of legacy code
IntegrationMeshTwo main integration paradigms for SieveParallelizaion of legacy codeModel existing data structures
IntegrationMeshTwo main integration paradigms for SieveParallelizaion of legacy codeModel existing data structuresExample: PyLith
IntegrationMeshTwo main integration paradigms for SieveParallelizaion of legacy codeModel existing data structuresExample: PyLithReorganization for new development
IntegrationMeshTwo main integration paradigms for SieveParallelizaion of legacy codeModel existing data structuresExample: PyLithReorganization for new developmentBetter describe common FEM operations
IntegrationMeshTwo main integration paradigms for SieveParallelizaion of legacy codeModel existing data structuresExample: PyLithReorganization for new developmentBetter describe common FEM operationsExample: EqSim/PyLith integrationM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 157 / 166
PyLith in ParallelMeshPyLith can now handle moderately sized meshes in parallel:Mesh courtesy of Carl Gable, LANLM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 158 / 166
Sieve and <strong>PETSc</strong>MeshSieve is the unstructured mesh component of <strong>PETSc</strong>Wrapper obeys the DM semanticsWorks with <strong>PETSc</strong> Viewers
Sieve and <strong>PETSc</strong>MeshSieve is the unstructured mesh component of <strong>PETSc</strong>Wrapper obeys the DM semanticsWorks with <strong>PETSc</strong> ViewersSieve can be installed separately (almost)FEniCS Project component
Sieve and <strong>PETSc</strong>MeshSieve is the unstructured mesh component of <strong>PETSc</strong>Wrapper obeys the DM semanticsWorks with <strong>PETSc</strong> ViewersSieve can be installed separately (almost)FEniCS Project componentIncludes support for global, contiguous vectorsGlobal and local numberingsAutomatic, persistent scattersM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 159 / 166
for(iterator e iter = elements−>begin(); e iter != elements−>end(); ++e iter)ElementGeometry(mesh, *e iter, v0, Jac, PETSC NULL, &detJ);PetscMemzero(elementVec, NUM BASIS*sizeof(PetscScalar));for(int q = 0; q < NUM QUADRATURE POINTS; q++) {xi = points[q*2+0] + 1.0;eta = points[q*2+1] + 1.0;x q = Jac[0]*xi + Jac[1]*eta + v0[0];y q = Jac[2]*xi + Jac[3]*eta + v0[1];for(i = 0; i < NUM BASIS; i++) {elementVec[i] += Basis[q*NUM BASIS+i]*f(x q, y q)*weights[q]*detJ;}}field−>updateAdd(*e iter, elementVec);}M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 160 / 166Rhs EvaluationMeshconst patch type patch = 0;const Obj& elements = topology−>heightStratum(patch, 0);PetscScalarelementVec[NUM BASIS];
Jacobian EvaluationMeshPetscScalar elementMat[NUM BASIS*NUM BASIS];for(iterator e iter = elements−>begin(); e iter != elements−>end(); ++e iter)ElementGeometry(mesh, *e iter, v0, Jac, Jinv, &detJ);PetscMemzero(elementMat, NUM BASIS*NUM BASIS*sizeof(PetscScalar));for(int q = 0; q < NUM QUADRATURE POINTS; q++) {for(i = 0; i < NUM BASIS; i++) {testDer[0] = Jinv[0]*BasisDers[(q*NUM BASIS+i)*2+0] + Jinv[2]*BasisDtestDer[1] = Jinv[1]*BasisDers[(q*NUM BASIS+i)*2+0] + Jinv[3]*BasisDfor(j = 0; j < NUM BASIS; j++) {basisDer[0] = Jinv[0]*BasisDers[(q*NUM BASIS+j)*2+0] + Jinv[2]*BasbasisDer[1] = Jinv[1]*BasisDers[(q*NUM BASIS+j)*2+0] + Jinv[3]*BaselementMat[i*NUM BASIS+j] +=(testDer[0]*basisDer[0] + testDer[1]*basisDer[1])*weights[q]*detJ;}}}updateOperator(jac, field, globalOrder, *e iter, elementMat, ADD VALUES);}M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 161 / 166
FIAT IntegrationDiscretizationFinite Element Integrator and Tabulator by Rob KirbyIn order to generate a quadrature routines, we need:A differential form to integrateAn element (usually a family and degree) using FIATA quadrature ruleWe then produceA C integration routinesA Python module wrapperOptional Ferari optimized routinesOptional element assembly loopFIAT is part of the FEniCS project, as is the <strong>PETSc</strong> Sieve moduleM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 162 / 166
Weak Form LanguagesA Language for Weak FormsIn collaboration withAnders Logg of SimulaRob Kirby of Texas TechWe have developed a small language for weak forms, based directly on anAST representation.<strong>The</strong> Fenics Form Compiler (FFC) processes each form algebraically,allowing some simplification and optimization, before passing it on to theintegration generation routines.M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 163 / 166
Input LanguageWeak Form LanguagesWe have a simple text language for input, incorporating:Arithmetic, +, -, *, /, ˆ, () abs(x)Coordinate functions, cos(x) exp(x)Continuum fields (known and unknown)Dual pairing, 〈 , 〉Matrix operations, trans(u) det(u) vec(u)Differential operators, grad(u) div(u) curl(u)We must use expression graphs for efficiency.M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 164 / 166
ExamplesWeak Form LanguagesPoisson Equation - Vector Poisson Equation- Linear Elasticity- Stokes Equation - + - + M. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 165 / 166
ReferencesConclusionsDocumentation: http://www.mcs.anl.gov/petsc/docs<strong>PETSc</strong> Users manualManual pagesMany hyperlinked examplesFAQ, Troubleshooting info, installation info, etc.Publications: http://www.mcs.anl.gov/petsc/publicationsResearch and publications that make use <strong>PETSc</strong>MPI Information: http://www.mpi-forum.orgUsing MPI (2nd Edition), by Gropp, Lusk, and SkjellumDomain Decomposition, by Smith, Bjorstad, and GroppM. Knepley (ANL) <strong>Tutorial</strong> <strong>ACTS</strong> ’06 166 / 166