
PETSc Tutorial - The ACTS Toolkit


PETSc Tutorial

PETSc Team
Presented by Matthew Knepley
Mathematics and Computer Science Division
Argonne National Laboratory

ACTS Workshop
Lawrence Berkeley National Laboratory
August 23, 2006

M. Knepley (ANL) Tutorial ACTS ’06


Units
1. Getting Started with PETSc
2. Common PETSc Usage
3. PETSc Essentials
4. PETSc Integration
5. Advanced PETSc
6. PETSc Extensibility
7. PETSc Optimization
8. Future Plans


Part I: Getting Started with PETSc


What is PETSc?

Unit Objectives
- Introduce the Portable Extensible Toolkit for Scientific Computation
- Retrieve, Configure, Build, and Run a PETSc Example
- Empower students to learn more about PETSc


What I Need From You
- Tell me if you do not understand
- Tell me if an example does not work
- Suggest better wording or figures
- Follow up on problems at petsc-maint@mcs.anl.gov


How Can We Help?
- Provide documentation
- Quickly answer questions
- Help install
- Guide large scale code development
- Answer email at petsc-maint@mcs.anl.gov


The Role of PETSc
Developing parallel, nontrivial PDE solvers that deliver high performance is still difficult and requires months (or even years) of concentrated effort.
PETSc is a toolkit that can ease these difficulties and reduce the development time, but it is not a black-box PDE solver, nor a silver bullet.


What is PETSc?
A freely available and supported research code
- Download from http://www.mcs.anl.gov/petsc
- Free for everyone, including industrial users
- Hyperlinked manual, examples, and manual pages for all routines
- Hundreds of tutorial-style examples
- Support via email: petsc-maint@mcs.anl.gov
- Usable from C, C++, Fortran 77/90, and Python


What is PETSc?
Portable to any parallel system supporting MPI, including:
- Tightly coupled systems: Cray T3E, SGI Origin, IBM SP, HP 9000, Sun Enterprise
- Loosely coupled systems, such as networks of workstations: Compaq, HP, IBM, SGI, Sun, PCs running Linux or Windows
PETSc History
- Begun September 1991
- Over 8,500 downloads since 1995 (version 2), currently 250 per month
PETSc Funding and Support
- Department of Energy: SciDAC, MICS Program
- National Science Foundation: CIG, CISE, Multidisciplinary Challenge Program


Timeline
(Figure) Number of active developers (axis 1 to 6) against the years 1991, 1993, 1995, 1996, 2000, 2001, 2003, 2005. Labels in order of appearance: PETSc 1 release, +Barry, +Bill, PETSc 2 release, +Lois, -Bill, +Satish, +Hong, +Kris, -Lois, +Matt, +Victor, -Kris.


What Can We Handle?
- PETSc has run problems with over 500 million unknowns
  http://www.scconference.org/sc2004/schedule/pdfs/pap111.pdf
- PETSc has run on over 6,000 processors efficiently
  ftp://info.mcs.anl.gov/pub/tech_reports/reports/P776.ps.Z
- PETSc applications have run at 2 Teraflops
  LANL PFLOTRAN code


Who uses and develops PETSc?

Who Uses PETSc?
- Computational Scientists: PyLith (TECTON), Underworld, Columbia group
- Algorithm Developers: Iterative methods and Preconditioning researchers
- Package Developers: SLEPc, TAO, MagPar, StGermain, DealII


The PETSc Team
Bill Gropp, Barry Smith, Satish Balay, Dinesh Kaushik, Kris Buschelman, Matt Knepley, Hong Zhang, Victor Eijkhout, Lois McInnes


How can I get PETSc?

Downloading PETSc
- The latest tarball is on the PETSc site
  ftp://ftp.mcs.anl.gov/pub/petsc/petsc.tar.gz
- We no longer distribute patches (everything is in the distribution)
- There is a Debian package
- There is a FreeBSD Port
- There is a Mercurial development repository


Cloning PETSc
- The full development repository is open to the public
  http://mercurial.mcs.anl.gov/petsc/petsc-dev
  http://mercurial.mcs.anl.gov/petsc/BuildSystem
- Why is this better?
  - You can clone to any release (or any specific ChangeSet)
  - You can easily rollback changes (or releases)
  - You can get fixes from us the same day


Unpacking PETSc
- Just clone the development repository
    hg clone http://mercurial.mcs.anl.gov/petsc/petsc-dev petsc-dev
    hg clone -rRelease-2.3.1 petsc-dev petsc-2.3.1
  or
- Unpack the tarball
    tar xzf petsc.tar.gz


Exercise 1
Download and Unpack PETSc!


How do I Configure PETSc?

Configuring PETSc
- Set $PETSC_DIR to the installation root directory
- Run the configuration utility
    $PETSC_DIR/config/configure.py
    $PETSC_DIR/config/configure.py --help
    $PETSC_DIR/config/configure.py --download-mpich
- There are many examples on the installation page
- Configuration files are placed in $PETSC_DIR/bmake/$PETSC_ARCH
- $PETSC_ARCH has a default if not specified


Configuring PETSc
- You can easily reconfigure with the same options
    ./bmake/$PETSC_ARCH/configure.py
- Can maintain several different configurations
    ./config/configure.py -PETSC_ARCH=linux-fast --with-debugging=0
- All configuration information is in configure.log
  ALWAYS send this file with bug reports


Automatic Downloads
- Starting in 2.2.1, some packages are automatically
  - Downloaded
  - Configured and Built (in $PETSC_DIR/externalpackages)
  - Installed in PETSc
- Currently works for
  - PETSc documentation utilities (Sowing, lgrind, c2html)
  - BLAS, LAPACK, BLACS, ScaLAPACK, PLAPACK
  - MPICH, MPE, LAM
  - ParMetis, Chaco, Jostle, Party, Scotch
  - MUMPS, Spooles, SuperLU, SuperLU_Dist, UMFPack
  - Prometheus, HYPRE, ML, SPAI
  - Sundials
  - Triangle, TetGen


Exercise 2
Configure the PETSc that you downloaded and unpacked.


How do I Build PETSc?

Building PETSc
- Uses recursive make starting in $PETSC_DIR
    cd $PETSC_DIR
    make
- Check build when done with make test
- Complete log for each build in make_log_$PETSC_ARCH
  ALWAYS send this with bug reports
- Can build multiple configurations
    PETSC_ARCH=linux-fast make
  Libraries are in $PETSC_DIR/lib/$PETSC_ARCH/
- Can also build a subtree
    cd src/snes; make
    cd src/snes; make ACTION=libfast tree


Exercise 3
Build the PETSc that you configured.


Exercise 4
Reconfigure PETSc to use ParMetis.
1. ./bmake/linux-gnu/configure.py -PETSC_ARCH=linux-parmetis -download-parmetis
2. PETSC_ARCH=linux-parmetis make
3. PETSC_ARCH=linux-parmetis make test


How do I run an example?

Running PETSc
- Try running PETSc examples first
    cd $PETSC_DIR/src/snes/examples/tutorials
- Build examples using make targets
    make ex5
- Run examples using the make target
    make runex5
- Can also run using MPI directly
    mpirun ./ex5 -snes_max_it 5
    mpiexec ./ex5 -snes_monitor


Using MPI
- The Message Passing Interface is:
  - a library for parallel communication
  - a system for launching parallel jobs (mpirun/mpiexec)
  - a community standard
- Launching jobs is easy
    mpiexec -np 4 ./ex5
- You should never have to make MPI calls when using PETSc
  - Almost never


MPI Concepts
- Communicator
  - A context (or scope) for parallel communication (“Who can I talk to”)
  - There are two defaults: yourself (PETSC_COMM_SELF), and everyone launched (PETSC_COMM_WORLD)
  - Can create new communicators by splitting existing ones
  - Every PETSc object has a communicator
- Point-to-point communication
  - Happens between two processes (like in MatMult())
- Reduction or scan operations
  - Happens among all processes (like in VecDot())


Alternative Memory Models
- Single process (address space) model
  - OpenMP and threads in general
  - Fortran 90/95 and compiler-discovered parallelism
  - System manages memory and (usually) thread scheduling
  - Named variables refer to the same storage
- Single name space model
  - HPF, UPC
  - Global Arrays
  - Titanium
  - Named variables refer to the coherent values (distribution is automatic)
- Distributed memory (shared nothing)
  - Message passing
  - Named variables in different processes are unrelated


Common Viewing Options
- Gives a text representation
    -vec_view
- Generally views subobjects too
    -snes_view
- Can visualize some objects
    -mat_view_draw
- Alternative formats
    -vec_view_binary, -vec_view_matlab, -vec_view_socket
- Sometimes provides extra information
    -mat_view_info, -mat_view_info_detailed


Common Monitoring Options
- Display the residual
    -ksp_monitor, graphically -ksp_xmonitor
- Can disable dynamically
    -ksp_cancelmonitors
- Does not display subsolvers
    -snes_monitor
- Can use the true residual
    -ksp_truemonitor
- Can display different subobjects
    -snes_vecmonitor, -snes_vecmonitor_update, -snes_vecmonitor_residual
    -ksp_gmres_krylov_monitor
- Can display the spectrum
    -ksp_singmonitor


Exercise 5
Run SNES Example 5 using some custom options.
1. cd $PETSC_DIR/src/snes/examples/tutorials
2. make ex5
3. mpiexec ./ex5 -snes_monitor -snes_view
4. mpiexec ./ex5 -snes_type tr -snes_monitor -snes_view
5. mpiexec ./ex5 -ksp_monitor -snes_monitor -snes_view
6. mpiexec ./ex5 -pc_type jacobi -ksp_monitor -snes_monitor -snes_view
7. mpiexec ./ex5 -ksp_type bicg -ksp_monitor -snes_monitor -snes_view


Exercise 6
Create a new code based upon SNES Example 5.
1. Create a new directory
     mkdir -p /home/knepley/proj/newsim/src
2. Copy the source
     cp ex5.c /home/knepley/proj/newsim/src
3. Create a PETSc makefile
   Add a link target
     ${CLINKER} -o $@ $^ ${PETSC_SNES_LIB}
     ${FLINKER} -o $@ $^ ${PETSC_FORTRAN_LIB} ${PETSC_SNES_LIB}
     include ${PETSC_DIR}/bmake/common/base


How do I get more help?

Getting More Help
- http://www.mcs.anl.gov/petsc
- Hyperlinked documentation
  - Manual
  - Manual pages for every method
  - HTML of all example code (linked to manual pages)
- FAQ
- Full support at petsc-maint@mcs.anl.gov
- High profile users
  - David Keyes
  - Rich Martineau
  - Richard Katz
  - Charles Williams


Part II: Common PETSc Usage


Debugging PETSc

Correctness Debugging
- Automatic generation of tracebacks
- Detecting memory corruption and leaks
- Optional user-defined error handlers


Interacting with the Debugger
- Launch the debugger
    -start_in_debugger [gdb,dbx,noxterm]
    -on_error_attach_debugger [gdb,dbx,noxterm]
- Attach the debugger only to some parallel processes
    -debugger_nodes 0,1
- Set the display (often necessary on a cluster)
    -display khan.mcs.anl.gov:0.0


Debugging Tips
- Putting a breakpoint in PetscError() can catch errors as they occur
- PETSc tracks memory overwrites at the beginning and end of arrays
  - The CHKMEMQ macro causes a check of all allocated memory
  - Track memory overwrites by bracketing them with CHKMEMQ
- PETSc checks for leaked memory
  - Use PetscMalloc() and PetscFree() for all allocation
  - Option -trmalloc will print unfreed memory on PetscFinalize()


Exercise 1
Use the debugger to find a SEGV
Locate a memory overwrite using CHKMEMQ.
- Get the example
    hg clone -r1.4 bk://mercurial.mcs.anl.gov/petsc/tutorialExercise1
- Build the example: make
- Run it (make run) and watch the fireworks
- Run it under the debugger (make debug) and correct the error
- Build it and run again (make ex1 run) to catch the memory overwrite
- Correct the error, build it and run again (make ex1 run)


Profiling PETSc

Performance Debugging
- PETSc has integrated profiling
  - Option -log_summary prints a report on PetscFinalize()
- PETSc allows user-defined events
  - Events report time, calls, flops, communication, etc.
  - Memory usage is tracked by object
- Profiling is separated into stages
  - Event statistics are aggregated by stage


Using Stages and Events
- Use PetscLogStageRegister() to create a new stage
  - Stages are identified by an integer handle
- Use PetscLogStagePush/Pop() to manage stages
  - Stages may be nested and will aggregate in a nested fashion
- Use PetscLogEventRegister() to create a new event
  - Events also have an associated class
- Use PetscLogEventBegin/End() to manage events
  - Events may also be nested and will aggregate in a nested fashion
- Can use PetscLogFlops() to log user flops


Adding A Logging Stage
    int stageNum;

    ierr = PetscLogStageRegister(&stageNum, "name");CHKERRQ(ierr);
    ierr = PetscLogStagePush(stageNum);CHKERRQ(ierr);
    /* Code to Monitor */
    ierr = PetscLogStagePop();CHKERRQ(ierr);


Adding A Logging Event
    static int USER_EVENT;

    ierr = PetscLogEventRegister(&USER_EVENT, "name", CLASS_COOKIE);CHKERRQ(ierr);
    ierr = PetscLogEventBegin(USER_EVENT,0,0,0,0);CHKERRQ(ierr);
    /* Code to Monitor */
    ierr = PetscLogFlops(user_event_flops);CHKERRQ(ierr);
    ierr = PetscLogEventEnd(USER_EVENT,0,0,0,0);CHKERRQ(ierr);


Adding A Logging Class
    static int CLASS_COOKIE;

    ierr = PetscLogClassRegister(&CLASS_COOKIE,"name");CHKERRQ(ierr);
- Cookie identifies a class uniquely
- Initialization must happen before any objects of this type are created


Matrix Memory Preallocation
- PETSc sparse matrices are dynamic data structures
  - can add additional nonzeros freely
- Dynamically adding many nonzeros
  - requires additional memory allocations
  - requires copies
  - can kill performance
- Memory preallocation provides
  - the freedom of dynamic data structures
  - good performance


Efficient Matrix Creation
- Create matrix with MatCreate()
- Set type with MatSetType()
- Determine the number of nonzeros in each row
  - loop over the grid for finite differences
  - loop over the elements for finite elements
  - need only local+ghost information
- Preallocate matrix
  - MatSeqAIJSetPreallocation()
  - MatMPIAIJSetPreallocation()


Indicating Expected Nonzeros: Sequential Sparse Matrices
MatSeqAIJSetPreallocation(Mat A, int nz, int nnz[])
- nz: expected number of nonzeros in any row
- nnz[i]: expected number of nonzeros in row i


Parallel Sparse Matrix
- Each process locally owns a submatrix of contiguous global rows
- Each submatrix consists of diagonal and off-diagonal parts
MatGetOwnershipRange(Mat A, int *start, int *end)
- start: first locally owned row of global matrix
- end-1: last locally owned row of global matrix


Indicating Expected Nonzeros: Parallel Sparse Matrices
MatMPIAIJSetPreallocation(Mat A, int dnz, int dnnz[], int onz, int onnz[])
- dnz: expected number of nonzeros in any row in the diagonal block
- dnnz[i]: expected number of nonzeros in row i in the diagonal block
- onz: expected number of nonzeros in any row in the offdiagonal portion
- onnz[i]: expected number of nonzeros in row i in the offdiagonal portion


Verifying Preallocation
- Use runtime option -info
- Output:
    [proc #] Matrix size: %d X %d; storage space: %d unneeded, %d used
    [proc #] Number of mallocs during MatSetValues( ) is %d


Exercise 2
Return to Part 1: Exercise 6 and add more profiling.
- Run it (make profile) and look at the profiling report
- Add a new stage for setup
- Add a new event for FormInitialGuess() and log the flops
- Run it again (make ex5 profile) and look at the profiling report


Part III: PETSc Essentials


PETSc principles and design

PETSc Structure


Flow Control for a PETSc Application
(Figure) Labels: Main Routine; PETSc: Timestepping Solvers (TS), Nonlinear Solvers (SNES), Linear Solvers (KSP), Preconditioners (PC); Application Initialization, Function Evaluation, Jacobian Evaluation, Postprocessing.


Levels of Abstraction In Mathematical Software
- Application-specific interface
  - Programmer manipulates objects associated with the application
- High-level mathematics interface
  - Programmer manipulates mathematical objects
    - Weak forms, boundary conditions, meshes
- Algorithmic and discrete mathematics interface
  - Programmer manipulates mathematical objects
    - Sparse matrices, nonlinear equations
  - Programmer manipulates algorithmic objects
    - Solvers
- Low-level computational kernels
  - BLAS-type operations, FFT


Object-Oriented Design
- Design based on operations you perform, not on the data in the object
- Example: A vector is
  - not a 1d array of numbers
  - an object allowing addition and scalar multiplication
- The efficient use of the computer is an added difficulty


Symmetry Principle
Interfaces to mutable data must be symmetric.
- Creation and query interfaces are paired
  - “No get without a set”
- Fairness
  - “If you can do it, your users will want to do it”
- Openness
  - “If you can do it, your users will want to undo it”


Empiricism Principle
Interfaces must allow easy testing and comparison.
- Swapping different implementations
  - “You will not be smart enough to pick the solver”
- Commonly violated in FE code
  - Elements are hard coded
- Also avoid assuming structure outside of the interface
  - Making continuous fields have discrete structure
  - Temptation to put metadata in different places


Experimentation is Essential!
Proof is not currently enough to examine solvers
- N. M. Nachtigal, S. C. Reddy, and L. N. Trefethen, How fast are nonsymmetric matrix iterations?, SIAM J. Matrix Anal. Appl., 13, pp. 778–795, 1992.
- Anne Greenbaum, Vlastimil Ptak, and Zdenek Strakos, Any Nonincreasing Convergence Curve is Possible for GMRES, SIAM J. Matrix Anal. Appl., 17 (3), pp. 465–469, 1996.


The PETSc Programming Model
- Goals
  - Portable, runs everywhere
  - High performance
  - Scalable parallelism
- Approach
  - Distributed memory (“shared-nothing”)
  - No special compiler
  - Access to data on remote machines through MPI
  - Hide within objects the details of the communication
  - User orchestrates communication at a higher abstract level


Collectivity
- MPI communicators (MPI_Comm) specify collectivity
  - Processes involved in a computation
- Constructors are collective over a communicator
  - VecCreate(MPI_Comm comm, Vec *x)
  - Use PETSC_COMM_WORLD for all processes and PETSC_COMM_SELF for one
- Some operations are collective, while others are not
  - collective: VecNorm()
  - not collective: VecGetLocalSize()
- Sequences of collective calls must be in the same order on each process


What is not in PETSc?
- Higher level representations of PDEs
  - Unstructured mesh generation and manipulation
  - Discretizations, DealII
  - PETSc-CS and Sundance
- Load balancing
- Sophisticated visualization capabilities
  - MayaVi
- Eigenvalues
  - SLEPc and SIP
- Optimization and sensitivity
  - TAO and Veltisto


Basic PetscObject Usage
Every object in PETSc supports a basic interface

Function                  Operation
Create()                  create the object
Get/SetName()             name the object
Get/SetType()             set the implementation type
Get/SetOptionsPrefix()    set the prefix for all options
SetFromOptions()          customize object from the command line
SetUp()                   perform other initialization
View()                    view the object
Destroy()                 cleanup object allocation

Also, all objects support the -help option.


Part IV: PETSc Integration


Initial Operations

Application Integration
- Be willing to experiment with algorithms
  - No optimality without interplay between physics and algorithmics
- Adopt flexible, extensible programming
  - Algorithms and data structures not hardwired
- Be willing to play with the real code
  - Toy models are rarely helpful
- If possible, profile before upgrading or seeking help
  - Automatic in PETSc


PETSc Integration
PETSc is a set of library interfaces
- We do not seize main()
- We do not control output
- We propagate errors from underlying packages
- We present the same interfaces in:
  - C
  - C++
  - F77
  - F90
  - Python
See Gropp in SIAM, OO Methods for Interop SciEng, ’99


Integration Stages
- Version Control
  - It is impossible to overemphasize
- Initialization
  - Linking to PETSc
- Profiling
  - Profile before changing
  - Also incorporate command line processing
- Linear Algebra
  - First PETSc data structures
- Solvers
  - Very easy after linear algebra is integrated


Initialization
- Call PetscInitialize()
  - Setup static data and services
  - Setup MPI if it is not already
- Call PetscFinalize()
  - Calculates logging summary
  - Shutdown and release resources
- Checks compile and link


Profiling
- Use -log_summary for a performance profile
  - Event timing
  - Event flops
  - Memory usage
  - MPI messages
- Call PetscLogStagePush() and PetscLogStagePop()
  - User can add new stages
- Call PetscLogEventBegin() and PetscLogEventEnd()
  - User can add new events


Command Line Processing
- Check for an option
  - PetscOptionsHasName()
- Retrieve a value
  - PetscOptionsGetInt(), PetscOptionsGetIntArray()
- Set a value
  - PetscOptionsSetValue()
- Clear, alias, reject, etc.


Vector Algebra

- What are PETSc vectors?
  - Fundamental objects for storing field solutions, right-hand sides, etc.
  - Each process locally owns a subvector of contiguous global data
- How do I create vectors?
  - VecCreate(MPI_Comm, Vec *)
  - VecSetSizes(Vec, int n, int N)
  - VecSetType(Vec, VecType typeName)
  - VecSetFromOptions(Vec)
    - Can set the type at runtime


A PETSc Vec
- Has a direct interface to the values
- Supports all vector space operations
  - VecDot(), VecNorm(), VecScale()
- Has unusual operations, e.g. VecSqrt(), VecWhichBetween()
- Communicates automatically during assembly
- Has customizable communication (scatters)


Creating a Vector
    Vec            x;
    PetscInt       N;
    PetscErrorCode ierr;

    ierr = PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL);CHKERRQ(ierr);
    /* Global size is read from the -N command line option */
    ierr = PetscOptionsGetInt(PETSC_NULL, "-N", &N, PETSC_NULL);CHKERRQ(ierr);
    ierr = VecCreate(PETSC_COMM_WORLD, &x);CHKERRQ(ierr);
    ierr = VecSetSizes(x, PETSC_DECIDE, N);CHKERRQ(ierr);
    ierr = VecSetType(x, "mpi");CHKERRQ(ierr);
    ierr = VecSetFromOptions(x);CHKERRQ(ierr);
    ierr = PetscFinalize();CHKERRQ(ierr);


Parallel Assembly: Vectors and Matrices
- Processes may set an arbitrary entry
  - Must use proper interface
- Entries need not be generated locally
  - Local meaning the process on which they are stored
- PETSc automatically moves data if necessary
  - Happens during the assembly phase


Vector Assembly
- A three step process
  - Each process sets or adds values
  - Begin communication to send values to the correct process
  - Complete the communication
- VecSetValues(Vec v, int n, int rows[], PetscScalar values[], mode)
  - mode is either INSERT_VALUES or ADD_VALUES
- Two phase assembly allows overlap of communication and computation
  - VecAssemblyBegin(Vec v)
  - VecAssemblyEnd(Vec v)


One Way to Set the Elements of a Vector
    ierr = VecGetSize(x, &N);CHKERRQ(ierr);
    ierr = MPI_Comm_rank(PETSC_COMM_WORLD, &rank);CHKERRQ(ierr);
    if (rank == 0) {
      for(i = 0, val = 0.0; i < N; i++, val += 10.0) {
        ierr = VecSetValues(x, 1, &i, &val, INSERT_VALUES);CHKERRQ(ierr);
      }
    }
    /* These routines ensure that the data is distributed to the other processes */
    ierr = VecAssemblyBegin(x);CHKERRQ(ierr);
    ierr = VecAssemblyEnd(x);CHKERRQ(ierr);


A Better Way to Set the Elements of a Vector
    ierr = VecGetOwnershipRange(x, &low, &high);CHKERRQ(ierr);
    for(i = low, val = low*10.0; i < high; i++, val += 10.0) {
      ierr = VecSetValues(x, 1, &i, &val, INSERT_VALUES);CHKERRQ(ierr);
    }
    /* These routines ensure that the data is distributed to the other processes */
    ierr = VecAssemblyBegin(x);CHKERRQ(ierr);
    ierr = VecAssemblyEnd(x);CHKERRQ(ierr);


Selected Vector Operations

Function Name                                    Operation
VecAXPY(Vec y, PetscScalar a, Vec x)             y = y + a ∗ x
VecAYPX(Vec y, PetscScalar a, Vec x)             y = x + a ∗ y
VecWAXPY(Vec w, PetscScalar a, Vec x, Vec y)     w = y + a ∗ x
VecScale(Vec x, PetscScalar a)                   x = a ∗ x
VecCopy(Vec x, Vec y)                            y = x
VecPointwiseMult(Vec w, Vec x, Vec y)            w_i = x_i ∗ y_i
VecMax(Vec x, PetscInt *idx, PetscReal *r)       r = max x_i
VecShift(Vec x, PetscScalar r)                   x_i = x_i + r
VecAbs(Vec x)                                    x_i = |x_i|
VecNorm(Vec x, NormType type, PetscReal *r)      r = ||x||


Working With Local Vectors
It is sometimes more efficient to directly access local storage of a Vec.
- PETSc allows you to access the local storage with
  - VecGetArray(Vec, double *[])
- You must return the array to PETSc when you finish
  - VecRestoreArray(Vec, double *[])
- Allows PETSc to handle data structure conversions
  - Commonly, these routines are inexpensive and do not involve a copy


Vector Algebra: VecGetArray in C

    Vec            v;
    PetscScalar    *array;
    PetscInt       n, i;
    PetscErrorCode ierr;

    ierr = VecGetArray(v, &array);CHKERRQ(ierr);
    ierr = VecGetLocalSize(v, &n);CHKERRQ(ierr);
    ierr = PetscSynchronizedPrintf(PETSC_COMM_WORLD,
             "First element of local array is %f\n", array[0]);CHKERRQ(ierr);
    ierr = PetscSynchronizedFlush(PETSC_COMM_WORLD);CHKERRQ(ierr);
    for(i = 0; i < n; i++) {
      array[i] += (PetscScalar) rank;
    }
    ierr = VecRestoreArray(v, &array);CHKERRQ(ierr);


Vector Algebra: VecGetArray in F77

    #include "petsc.h"
    #include "petscvec.h"
          Vec v
          PetscScalar array(1)
          PetscOffset offset
          PetscInt n, i
          PetscErrorCode ierr
          call VecGetArray(v, array, offset, ierr)
          call VecGetLocalSize(v, n, ierr)
          do i=1,n
            array(i+offset) = array(i+offset) + rank
          end do
          call VecRestoreArray(v, array, offset, ierr)


Vector Algebra: VecGetArray in F90

    #include "petsc.h"
    #include "petscvec.h"
          Vec v
          PetscScalar, pointer :: array(:)
          PetscInt n, i
          PetscErrorCode ierr
          call VecGetArrayF90(v, array, ierr)
          call VecGetLocalSize(v, n, ierr)
          do i=1,n
            array(i) = array(i) + rank
          end do
          call VecRestoreArrayF90(v, array, ierr)


Matrix Algebra: What are PETSc matrices?
Fundamental objects for storing stiffness matrices and Jacobians
Each process locally owns a contiguous set of rows
Supports many data types
  AIJ, Block AIJ, Symmetric AIJ, Block Diagonal, etc.
Supports structures for many packages
  MUMPS, Spooles, SuperLU, UMFPack, DSCPack


Matrix Algebra: How do I create matrices?
MatCreate(MPI_Comm, Mat *)
MatSetSizes(Mat, int m, int n, int M, int N)
MatSetType(Mat, MatType typeName)
MatSetFromOptions(Mat)
  Can set the type at runtime
MatSetValues(Mat,...)
  MUST be used, but does automatic communication


Matrix Algebra: Matrix Polymorphism
The PETSc Mat has a single user interface,
  Matrix assembly
    MatSetValues()
  Matrix-vector multiplication
    MatMult()
  Matrix viewing
    MatView()
but multiple underlying implementations.
  AIJ, Block AIJ, Symmetric Block AIJ,
  Dense
  Matrix-Free
  etc.
A matrix is defined by its interface, not by its data structure.


Matrix Algebra: Matrix Assembly
A three step process
  Each process sets or adds values
  Begin communication to send values to the correct process
  Complete the communication
MatSetValues(Mat m, m, rows[], n, cols[], values[], mode)
  mode is either INSERT_VALUES or ADD_VALUES
  Logically dense block of values
Two phase assembly allows overlap of communication and computation
  MatAssemblyBegin(Mat m, type)
  MatAssemblyEnd(Mat m, type)
  type is either MAT_FLUSH_ASSEMBLY or MAT_FINAL_ASSEMBLY
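"Logically dense block" means that values is read as an m-by-n row-major array whose entry (i, j) is destined for matrix location (rows[i], cols[j]), whatever the storage format behind the matrix. The sketch below mimics that indexing on a small dense array. The names `set_values`, `SetMode`, `MODE_INSERT`, and `MODE_ADD` are hypothetical stand-ins chosen for this illustration, not PETSc identifiers.

```c
#include <assert.h>

/* Hypothetical stand-ins for INSERT_VALUES and ADD_VALUES. */
typedef enum { MODE_INSERT, MODE_ADD } SetMode;

/* Put the m-by-n row-major block `values` into the dense row-major matrix A
   (with ncols columns): values[i*n + j] targets entry (rows[i], cols[j]). */
void set_values(double *A, int ncols, int m, const int rows[],
                int n, const int cols[], const double values[], SetMode mode)
{
  int i, j;

  assert(A && rows && cols && values);
  for (i = 0; i < m; i++) {
    for (j = 0; j < n; j++) {
      double *entry = &A[rows[i] * ncols + cols[j]];
      if (mode == MODE_ADD) *entry += values[i * n + j];  /* accumulate */
      else                  *entry  = values[i * n + j];  /* overwrite  */
    }
  }
}
```

Calling it twice with the same block doubles the entries in add mode and leaves them unchanged in insert mode, which is the distinction the mode argument controls.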


Matrix Algebra: One Way to Set the Elements of a Matrix
Simple 3-point stencil for 1D Laplacian

    values[0] = -1.0; values[1] = 2.0; values[2] = -1.0;
    if (rank == 0) { /* Only one process creates matrix */
      for(row = 0; row < N; row++) {
        cols[0] = row-1; cols[1] = row; cols[2] = row+1;
        if (row == 0) {
          ierr = MatSetValues(A, 1, &row, 2, &cols[1], &values[1], INSERT_VALUES);CHKERRQ(ierr);
        } else if (row == N-1) {
          ierr = MatSetValues(A, 1, &row, 2, cols, values, INSERT_VALUES);CHKERRQ(ierr);
        } else {
          ierr = MatSetValues(A, 1, &row, 3, cols, values, INSERT_VALUES);CHKERRQ(ierr);
        }
      }
    }
    ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);


Matrix Algebra: Parallel Sparse Matrix
Each process locally owns a submatrix of contiguous global rows
Each submatrix consists of diagonal and off-diagonal parts
MatGetOwnershipRange(Mat A, int *start, int *end)
  start: first locally owned row of global matrix
  end-1: last locally owned row of global matrix


Matrix Algebra: A Better Way to Set the Elements of a Matrix
Simple 3-point stencil for 1D Laplacian

    values[0] = -1.0; values[1] = 2.0; values[2] = -1.0;
    for(row = start; row < end; row++) {
      cols[0] = row-1; cols[1] = row; cols[2] = row+1;
      if (row == 0) {
        ierr = MatSetValues(A, 1, &row, 2, &cols[1], &values[1], INSERT_VALUES);CHKERRQ(ierr);
      } else if (row == N-1) {
        ierr = MatSetValues(A, 1, &row, 2, cols, values, INSERT_VALUES);CHKERRQ(ierr);
      } else {
        ierr = MatSetValues(A, 1, &row, 3, cols, values, INSERT_VALUES);CHKERRQ(ierr);
      }
    }
    ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
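The three branches in the loop only decide which part of the (-1, 2, -1) stencil fits inside the matrix for a given row. The sketch below isolates that decision in plain C so it can be checked without PETSc. The helper name `laplacian_row` is an assumption for this illustration; in the slide's loop, its outputs would correspond to the count, column, and value arguments handed to MatSetValues() for each locally owned row.

```c
#include <assert.h>

/* Hypothetical helper: fill cols[] and vals[] with the nonzeros of row `row`
   of the N-by-N 1D Laplacian (stencil -1, 2, -1) and return how many there
   are. Boundary rows drop the neighbor that falls outside the matrix. */
int laplacian_row(int row, int N, int cols[3], double vals[3])
{
  int k = 0;

  assert(N > 0 && row >= 0 && row < N);
  if (row > 0)     { cols[k] = row - 1; vals[k] = -1.0; k++; }  /* left neighbor  */
  cols[k] = row; vals[k] = 2.0; k++;                            /* diagonal       */
  if (row < N - 1) { cols[k] = row + 1; vals[k] = -1.0; k++; }  /* right neighbor */
  return k;
}
```

For N = 4, row 0 yields columns {0, 1} with values {2, -1}, and row 3 yields columns {2, 3} with values {-1, 2}, matching the first two branches of the slide code.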


Matrix Algebra: Why Are PETSc Matrices That Way?
No one data structure is appropriate for all problems
  Blocked and diagonal formats provide significant performance benefits
  PETSc has many formats and makes it easy to add new data structures
Assembly is difficult enough without worrying about partitioning
  PETSc provides parallel assembly routines
  Achieving high performance still requires making most operations local
  However, programs can be incrementally developed.
Matrix decomposition in contiguous chunks is simple
  Makes interoperation with other codes easier
  For other ordering, PETSc provides "Application Orderings" (AO)


Algebraic Solves: Solvers
Explicit:
  Field variables are updated using local neighbor information
Semi-implicit:
  Some subsets of variables are updated with global solves
  Others with direct local updates
Implicit:
  Most or all variables are updated in a single global solve


Algebraic Solves: Linear Solvers (Krylov Methods)
Using PETSc linear algebra, just add:
  KSPSetOperators(KSP ksp, Mat A, Mat M, MatStructure flag)
  KSPSolve(KSP ksp, Vec b, Vec x)
Can access subobjects
  KSPGetPC(KSP ksp, PC *pc)
Preconditioners must obey PETSc interface
  Basically just the KSP interface
Can change solver dynamically from the command line, -ksp_type


Algebraic Solves: Nonlinear Solvers (Newton Methods)
Using PETSc linear algebra, just add:
  SNESSetFunction(SNES snes, Vec r, residualFunc, void *ctx)
  SNESSetJacobian(SNES snes, Mat A, Mat M, jacFunc, void *ctx)
  SNESSolve(SNES snes, Vec b, Vec x)
Can access subobjects
  SNESGetKSP(SNES snes, KSP *ksp)
Can customize subobjects from the command line
  Set the subdomain preconditioner to ILU with -sub_pc_type ilu


Algebraic Solves: Basic Solver Usage
We will illustrate basic solver usage with SNES.
Use SNESSetFromOptions() so that everything is set dynamically
  Use -snes_type to set the type or take the default
Override the tolerances
  Use -snes_rtol and -snes_atol
View the solver to make sure you have the one you expect
  Use -snes_view
For debugging, monitor the residual decrease
  Use -snes_monitor
  Use -ksp_monitor to see the underlying linear solver


Algebraic Solves: 3rd Party Solvers in PETSc
1 Sequential LU
    ILUDT (SPARSEKIT2, Yousef Saad, U of MN)
    EUCLID & PILUT (Hypre, David Hysom, LLNL)
    ESSL (IBM)
    SuperLU (Jim Demmel and Sherry Li, LBNL)
    Matlab
    UMFPACK (Tim Davis, U. of Florida)
    LUSOL (MINOS, Michael Saunders, Stanford)
2 Parallel LU
    MUMPS (Patrick Amestoy, IRIT)
    SPOOLES (Cleve Ashcroft, Boeing)
    SuperLU_Dist (Jim Demmel and Sherry Li, LBNL)
3 Parallel Cholesky
    DSCPACK (Padma Raghavan, Penn. State)
4 XYTlib - parallel direct solver (Paul Fischer and Henry Tufo, ANL)


Algebraic Solves: 3rd Party Preconditioners in PETSc
1 Parallel ICC
    BlockSolve95 (Mark Jones and Paul Plassman, ANL)
2 Parallel ILU
    BlockSolve95 (Mark Jones and Paul Plassman, ANL)
3 Parallel Sparse Approximate Inverse
    Parasails (Hypre, Edmund Chow, LLNL)
    SPAI 3.0 (Marcus Grote and Barnard, NYU)
4 Sequential Algebraic Multigrid
    RAMG (John Ruge and Klaus Steuben, GMD)
    SAMG (Klaus Steuben, GMD)
5 Parallel Algebraic Multigrid
    Prometheus (Mark Adams, PPPL)
    BoomerAMG (Hypre, LLNL)
    ML (Trilinos, Ray Tuminaro and Jonathan Hu, SNL)


More Abstractions: Higher Level Abstractions
The PETSc DA class is a topology interface.
  Structured grid interface
    Fixed simple topology
  Supports stencils, communication, reordering
    No idea of operators
  Nice for simple finite differences
The PETSc DM class is a hierarchy interface.
  Supports multigrid
    DMMG combines it with the MG preconditioner
  Abstracts the logic of multilevel methods


More Abstractions: 3 Ways To Use PETSc
User manages all topology (just use Vec and Mat)
PETSc manages single topology (use DA)
PETSc manages a hierarchy (use DM)


Part V: Advanced PETSc


Nonlinear Equations: SNES Callbacks
The SNES interface is based upon callback functions
  SNESSetFunction()
  SNESSetJacobian()
When PETSc needs to evaluate the nonlinear residual F(x), the solver calls the user's function inside the application.
The user function gets application state through the ctx variable. PETSc never sees application data.


Nonlinear Equations: SNES Function
The user-provided function which calculates the nonlinear residual has signature
  PetscErrorCode (*func)(SNES snes, Vec x, Vec r, void *ctx)
x: The current solution
r: The residual
ctx: The user context passed to SNESSetFunction()
  Use this to pass application information, e.g. physical constants


Nonlinear Equations: SNES Jacobian
The user-provided function which calculates the Jacobian has signature
  PetscErrorCode (*func)(SNES snes, Vec x, Mat *J, Mat *M, MatStructure *flag, void *ctx)
x: The current solution
J: The Jacobian
M: The Jacobian preconditioning matrix (possibly J itself)
ctx: The user context passed to SNESSetJacobian()
  Use this to pass application information, e.g. physical constants
Possible MatStructure values are:
  SAME_NONZERO_PATTERN, DIFFERENT_NONZERO_PATTERN, ...
Alternatively, you can use
  a built-in sparse finite difference approximation
  automatic differentiation
    AD support via ADIC/ADIFOR (P. Hovland and B. Norris from ANL)
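The callback-plus-context pattern can be tried out in miniature without PETSc. In the sketch below, a scalar Newton iteration plays the role of the solver: it sees only two function pointers and an opaque ctx pointer, while the application keeps its constant in its own structure. All names here (`ScalarCallback`, `NewtonCtx`, `residual`, `jacobian`, `newton_solve`) are hypothetical and chosen for this illustration; this is not how SNES is implemented.

```c
#include <assert.h>

/* Hypothetical callback type: evaluate something at x, reaching application
   state only through the opaque ctx pointer. */
typedef double (*ScalarCallback)(double x, void *ctx);

/* Application side: state the solver never inspects. */
typedef struct { double c; } NewtonCtx;

/* Residual F(x) = x^2 - c, with c taken from the user context. */
double residual(double x, void *ctx)
{
  NewtonCtx *user = (NewtonCtx *) ctx;
  return x * x - user->c;
}

/* Jacobian F'(x) = 2x; the context is not needed here. */
double jacobian(double x, void *ctx)
{
  (void) ctx;
  return 2.0 * x;
}

/* Solver side: scalar Newton iteration that only sees the callbacks.
   Stops when |F(x)| <= tol or after max_it steps; returns the last iterate. */
double newton_solve(ScalarCallback F, ScalarCallback J, void *ctx,
                    double x, double tol, int max_it)
{
  int it;

  assert(F && J);
  for (it = 0; it < max_it; it++) {
    double r = F(x, ctx);
    if ((r < 0.0 ? -r : r) <= tol) break;
    x -= r / J(x, ctx);   /* Newton update: x <- x - F(x) / F'(x) */
  }
  return x;
}
```

Starting from x = 1 with c = 2, the iteration converges to the square root of 2. Changing c requires no change to the solver, which is the point of routing application data through ctx.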


Nonlinear Equations: SNES Variants
Line search strategies
Trust region approaches
Pseudo-transient continuation
Matrix-free variants


Nonlinear Equations: Finite Difference Jacobians
PETSc can compute and explicitly store a Jacobian via 1st-order FD
  Dense
    Activated by -snes_fd
    Computed by SNESDefaultComputeJacobian()
  Sparse via colorings
    Coloring is created by MatFDColoringCreate()
    Computed by SNESDefaultComputeJacobianColor()
Can also use Matrix-free Newton-Krylov via 1st-order FD
  Activated by -snes_mf without preconditioning
  Activated by -snes_mf_operator with user-defined preconditioning
    Uses preconditioning matrix from SNESSetJacobian()
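A first-order finite difference Jacobian approximates column j by perturbing variable j and differencing the residual: J(i, j) ≈ (F_i(x + h e_j) - F_i(x)) / h. The dense variant can be sketched in plain C as follows. The two-variable residual, the fixed step h, and all names (`ResidualFn`, `example_residual`, `fd_jacobian`) are assumptions made for this illustration and do not reproduce PETSc's step-size selection.

```c
#include <assert.h>

#define NVAR 2

/* Hypothetical residual callback type for a system of NVAR equations. */
typedef void (*ResidualFn)(const double x[NVAR], double f[NVAR], void *ctx);

/* Example residual (an assumption for illustration):
   F(x) = (x0^2 + x1, 3*x0 - x1). */
void example_residual(const double x[NVAR], double f[NVAR], void *ctx)
{
  (void) ctx;
  f[0] = x[0] * x[0] + x[1];
  f[1] = 3.0 * x[0] - x[1];
}

/* Dense Jacobian by 1st-order forward differences:
   J[i][j] ~ (F_i(x + h*e_j) - F_i(x)) / h, one extra residual per column. */
void fd_jacobian(ResidualFn F, const double x[NVAR], double h, void *ctx,
                 double J[NVAR][NVAR])
{
  double f0[NVAR], f1[NVAR], xp[NVAR];
  int i, j;

  assert(F && h > 0.0);
  F(x, f0, ctx);                          /* base residual F(x) */
  for (j = 0; j < NVAR; j++) {
    for (i = 0; i < NVAR; i++) xp[i] = x[i];
    xp[j] += h;                           /* perturb one variable */
    F(xp, f1, ctx);
    for (i = 0; i < NVAR; i++) J[i][j] = (f1[i] - f0[i]) / h;
  }
}
```

The dense version costs one residual evaluation per column. Coloring reduces that cost by perturbing several structurally independent columns in a single evaluation.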


Nonlinear Equations: SNES Example: Driven Cavity
Velocity-vorticity formulation
Flow driven by lid and/or buoyancy
Logically regular grid
  Parallelized with DA
Finite difference discretization
Authored by David Keyes
$PETSC_DIR/src/snes/examples/tutorials/ex19.c


Nonlinear Equations: Driven Cavity Application Context

    typedef struct {
      /*----- basic application data -----*/
      double     lid_velocity;     /* Velocity of the lid */
      double     prandtl, grashof; /* Prandtl and Grashof numbers */
      int        mx, my;           /* Grid points in x and y */
      int        mc;               /* Degrees of freedom per node */
      PetscTruth draw_contours;    /* Flag for drawing contours */
      /*----- parallel data -----*/
      MPI_Comm   comm;             /* Communicator */
      DA         da;               /* Distributed array */
      Vec        localX, localF;   /* Local ghosted solution and residual */
    } AppCtx;

$PETSC_DIR/src/snes/examples/tutorials/ex19.c


Nonlinear Equations: Driven Cavity Residual Evaluation

    PetscErrorCode DrivenCavityFunction(SNES snes, Vec X, Vec F, void *ptr)
    {
      AppCtx         *user = (AppCtx *) ptr;
      int            istart, iend, jstart, jend; /* local starting and ending grid points */
      PetscScalar    *f;                         /* local vector data */
      PetscReal      grashof = user->grashof;
      PetscReal      prandtl = user->prandtl;
      PetscErrorCode ierr;

      /* Not Shown: Code to communicate nonlocal ghost point data (scatters) */
      ierr = VecGetArray(F, &f); CHKERRQ(ierr);
      /* Not Shown: Code to compute local function components */
      ierr = VecRestoreArray(F, &f); CHKERRQ(ierr);
      return 0;
    }

$PETSC_DIR/src/snes/examples/tutorials/ex19.c


Nonlinear Equations: Better Driven Cavity Residual Evaluation

    PetscErrorCode DrivenCavityFuncLocal(DALocalInfo *info, Field **x, Field **f, void *ptr)
    {
      /* Not Shown: Handle boundaries */
      /* Compute over the interior points */
      for(j = info->ys; j < info->ys+info->ym; j++) {
        for(i = info->xs; i < info->xs+info->xm; i++) {
          /* Not Shown: convective coefficients for upwinding */
          /* U velocity */
          u         = x[j][i].u;
          uxx       = (2.0*u - x[j][i-1].u - x[j][i+1].u)*hydhx;
          uyy       = (2.0*u - x[j-1][i].u - x[j+1][i].u)*hxdhy;
          f[j][i].u = uxx + uyy - .5*(x[j+1][i].omega-x[j-1][i].omega)*hx;
          /* Not Shown: V velocity, Omega, Temperature */
        }
      }
    }

$PETSC_DIR/src/snes/examples/tutorials/ex19.c


Structured Grids: What is a DA?
DA is a topology interface handling parallel data layout on structured grids
Handles local and global indices
  DAGetGlobalIndices() and DAGetAO()
Provides local and global vectors
  DAGetGlobalVector() and DAGetLocalVector()
Handles ghost value coherence
  DAGlobalToLocalBegin()/DAGlobalToLocalEnd() and DALocalToGlobal()


Structured Grids: Creating a DA
DACreate1d(comm, DAPeriodicType wrap, M, dof, s, lm[], DA *da)
wrap: Specifies periodicity
  DA_NONPERIODIC or DA_XPERIODIC
M: Number of grid points in x-direction
dof: Degrees of freedom per node
s: The stencil width
lm: Alternative array of local sizes
  Use PETSC_NULL for the default


Structured Grids: Creating a DA
DACreate2d(comm, wrap, type, M, N, m, n, dof, s, lm[], ln[], DA *da)
wrap: Specifies periodicity
  DA_NONPERIODIC, DA_XPERIODIC, DA_YPERIODIC, or DA_XYPERIODIC
type: Specifies stencil
  DA_STENCIL_BOX or DA_STENCIL_STAR
M/N: Number of grid points in x/y-direction
m/n: Number of processes in x/y-direction
dof: Degrees of freedom per node
s: The stencil width
lm/ln: Alternative array of local sizes
  Use PETSC_NULL for the default


Structured Grids: Ghost Values
To evaluate a local function f(x), each process requires
  its local portion of the vector x
  its ghost values, bordering portions of x owned by neighboring processes
[Figure: grid patch with legend "Local Node" and "Ghost Node"]


Structured Grids: DA Global Numberings

Natural numbering

  Proc 2       Proc 3
  25 26 27 | 28 29
  20 21 22 | 23 24
  15 16 17 | 18 19
  ---------+------
  10 11 12 | 13 14
   5  6  7 |  8  9
   0  1  2 |  3  4
  Proc 0       Proc 1

PETSc numbering

  Proc 2       Proc 3
  21 22 23 | 28 29
  18 19 20 | 26 27
  15 16 17 | 24 25
  ---------+------
   6  7  8 | 13 14
   3  4  5 | 11 12
   0  1  2 |  9 10
  Proc 0       Proc 1
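The two numberings in this figure are related by a simple rule: PETSc numbers each process's points contiguously, process by process, while the natural numbering runs row-major across the whole grid. The plain-C sketch below reproduces the figure's 5 × 6 grid on a 2 × 2 process layout. The helper name `natural_to_petsc` and its argument layout are assumptions for this illustration; in PETSc itself this conversion is available through an AO object.

```c
#include <assert.h>

/* Hypothetical helper: convert a natural index (row-major over the whole
   grid, M points per row) to a PETSc-style global index in which each
   process's points are numbered contiguously. lx[p] / ly[q] give the number
   of grid points owned by process column p / process row q (px columns,
   py rows), and processes are ranked row-major: rank = q*px + p. */
int natural_to_petsc(int natural, int M, int px, const int lx[],
                     int py, const int ly[])
{
  int i = natural % M, j = natural / M;   /* grid coordinates */
  int p = 0, q = 0, xs = 0, ys = 0, offset = 0, r, c;

  assert(natural >= 0 && M > 0 && px > 0 && py > 0);
  while (i >= xs + lx[p]) { xs += lx[p]; p++; }   /* owning process column */
  while (j >= ys + ly[q]) { ys += ly[q]; q++; }   /* owning process row    */
  assert(p < px && q < py);
  /* Count the points owned by all lower-ranked processes. */
  for (r = 0; r < q; r++)
    for (c = 0; c < px; c++) offset += lx[c] * ly[r];
  for (c = 0; c < p; c++) offset += lx[c] * ly[q];
  /* Row-major position inside the owning process's patch. */
  return offset + (j - ys) * lx[p] + (i - xs);
}
```

With lx = {3, 2} and ly = {3, 3}, natural index 3 (the first point owned by Proc 1) maps to 9 and natural index 23 maps to 26, as in the figure.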


Structured Grids: DA Global vs. Local Numbering
Global: Each vertex belongs to a unique process and has a unique id
Local: Numbering includes ghost vertices from neighboring processes

Local numbering (as seen by Proc 0; X marks points it does not store)

  Proc 2       Proc 3
   X  X  X |  X  X
   X  X  X |  X  X
  12 13 14 | 15  X
  ---------+------
   8  9 10 | 11  X
   4  5  6 |  7  X
   0  1  2 |  3  X
  Proc 0       Proc 1

Global numbering

  Proc 2       Proc 3
  21 22 23 | 28 29
  18 19 20 | 26 27
  15 16 17 | 24 25
  ---------+------
   6  7  8 | 13 14
   3  4  5 | 11 12
   0  1  2 |  9 10
  Proc 0       Proc 1
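In the local numbering, a process indexes its ghosted patch row-major, so a local index converts to grid coordinates with one division and one remainder, and a point is a ghost exactly when it lies outside the owned box. The sketch below encodes Proc 0's patch from the figure, which owns a 3 × 3 box inside a 4 × 4 ghosted box. The `Patch` structure and both helpers are hypothetical. The field names xs, ys, xm, ym mirror the info-> fields used in the residual loops on other slides, and gxs, gys, gxm, gym are assumed analogues for the ghosted box.

```c
#include <assert.h>

/* Hypothetical description of one process's patch: the owned box starts at
   (xs, ys) with size xm-by-ym; the ghosted box starts at (gxs, gys) with
   size gxm-by-gym. */
typedef struct { int xs, ys, xm, ym, gxs, gys, gxm, gym; } Patch;

/* Convert a local (ghosted, row-major) index to grid coordinates (i, j). */
void local_to_grid(const Patch *p, int local, int *i, int *j)
{
  assert(p && local >= 0 && local < p->gxm * p->gym);
  *i = p->gxs + local % p->gxm;
  *j = p->gys + local / p->gxm;
}

/* Return 1 if the local index refers to a ghost point, 0 if it is owned. */
int is_ghost(const Patch *p, int local)
{
  int i, j;

  local_to_grid(p, local, &i, &j);
  return i < p->xs || i >= p->xs + p->xm || j < p->ys || j >= p->ys + p->ym;
}
```

For Proc 0, local indices 3, 7, 11, and 12 through 15 are ghosts (the right column and top row of the 4 × 4 patch), while local index 5 is the owned point at grid coordinates (1, 1).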


Structured Grids: DA Vectors
The DA object contains only layout (topology) information
  All field data is contained in PETSc Vecs
Global vectors are parallel
  Each process stores a unique local portion
  DACreateGlobalVector(DA da, Vec *gvec)
Local vectors are sequential (and usually temporary)
  Each process stores its local portion plus ghost values
  DACreateLocalVector(DA da, Vec *lvec)
  includes ghost values!


Structured Grids: Updating Ghosts
Two-step process enables overlapping computation and communication
DAGlobalToLocalBegin(da, gvec, mode, lvec)
  gvec provides the data
  mode is either INSERT_VALUES or ADD_VALUES
  lvec holds the local and ghost values
DAGlobalToLocalEnd(da, gvec, mode, lvec)
  Finishes the communication
The process can be reversed with DALocalToGlobal().


Structured Grids: DA Local Function
The user-provided function which calculates the nonlinear residual in 2D has signature
  PetscErrorCode (*lfunc)(DALocalInfo *info, PetscScalar **x, PetscScalar **r, void *ctx)
info: All layout and numbering information
x: The current solution
  Notice that it is a multidimensional array
r: The residual
ctx: The user context passed to DASetLocalFunction()
The local DA function is activated by calling
  SNESSetFunction(snes, r, SNESDAFormFunction, ctx)


Structured Grids: DA Local Jacobian
The user-provided function which calculates the Jacobian in 2D has signature
  PetscErrorCode (*lfunc)(DALocalInfo *info, PetscScalar **x, Mat J, void *ctx)
info: All layout and numbering information
x: The current solution
J: The Jacobian
ctx: The user context passed to DASetLocalJacobian()
The local DA Jacobian is activated by calling
  SNESSetJacobian(snes, J, J, SNESDAComputeJacobian, ctx)


Structured Grids: DA Stencils
Both the box stencil and star stencil are available.
[Figure: two panels, "Box Stencil" and "Star Stencil", each showing ghost regions across neighboring processes]


Structured Grids: Setting Values for Finite Differences
PETSc provides
  MatSetValuesStencil(Mat A, m, MatStencil idxm[], n, MatStencil idxn[], values[], mode)
Each row or column is actually a MatStencil
  This specifies grid coordinates and a component if necessary
  For unstructured grids, one can imagine these being vertices
The values are the same logically dense block in rows and columns


Structured Grids: Mapping Between Global Numberings
Natural global numbering
  Convenient for visualization, boundary conditions, etc.
Convert between global numbering schemes using AO
  DAGetAO(DA da, AO *ao)
Handled automatically by some utilities (e.g., VecView()) for DA vectors


Structured Grids: DA Example: Bratu
Create SNES and DA
Use DASetLocalFunction() and DASetLocalJacobian() to set user callbacks
Use DAGetMatrix() to get DA matrix for SNES
Use SNESDAFormFunction() and SNESDAComputeJacobian() for SNES callback
  Could also use FormFunctionMatlab()
  Could also use SNESDefaultComputeJacobian()
$PETSC_DIR/src/snes/examples/tutorials/ex5.c


Structured Grids: DA Example: Bratu

    PetscErrorCode LocalFunc(DALocalInfo *info, PetscScalar **x, PetscScalar **f, void *ctx)
    {
      for(j = info->ys; j < info->ys + info->ym; j++) {
        for(i = info->xs; i < info->xs + info->xm; i++) {
          if (i == 0 || j == 0 || i == info->mx-1 || j == info->my-1) {
            f[j][i] = x[j][i];
          } else {
            u       = x[j][i];
            u_xx    = -(x[j][i+1] - 2.0*u + x[j][i-1])*(hy/hx);
            u_yy    = -(x[j+1][i] - 2.0*u + x[j-1][i])*(hx/hy);
            f[j][i] = u_xx + u_yy - hx*hy*lambda*PetscExpScalar(u);
          }
        }
      }
    }

$PETSC_DIR/src/snes/examples/tutorials/ex5.c


Structured Grids: DA Example: Bratu

    int LocalJac(DALocalInfo *info, PetscScalar **x, Mat jac, void *ctx)
    {
      for(j = info->ys; j < info->ys + info->ym; j++) {
        for(i = info->xs; i < info->xs + info->xm; i++) {
          row.j = j; row.i = i;
          if (i == 0 || j == 0 || i == info->mx-1 || j == info->my-1) {
            v[0] = 1.0;
            ierr = MatSetValuesStencil(jac, 1, &row, 1, &row, v, INSERT_VALUES);CHKERRQ(ierr);
          } else {
            v[0] = -(hx/hy); col[0].j = j-1; col[0].i = i;
            v[1] = -(hy/hx); col[1].j = j;   col[1].i = i-1;
            /* Diagonal entry: the column is the row's own grid point */
            v[2] = 2.0*(hy/hx+hx/hy) - hx*hy*lambda*PetscExpScalar(x[j][i]);
            col[2].j = j; col[2].i = i;
            v[3] = -(hy/hx); col[3].j = j;   col[3].i = i+1;
            v[4] = -(hx/hy); col[4].j = j+1; col[4].i = i;
            ierr = MatSetValuesStencil(jac, 1, &row, 5, col, v, INSERT_VALUES);CHKERRQ(ierr);
          }
        }
      }
    }

$PETSC_DIR/src/snes/examples/tutorials/ex5.c


Part VI: PETSc Extensibility


Extending the Build: Extending Configure
See python/PETSc/packages/*.py for examples
Add module with class Configure derived from config.base.Configure
  Can also derive from PETSc.package.Package
  Implement configure() and configureHelp()
Customize PETSc through the make system
  addDefine()
  addTypedef(), addPrototype()
  addMakeMacro(), addMakeRule()
  Deprecated addSubstitution()


Extending the Build: Linking to PETSc
Nothing but the libraries
  User can custom link
Using only PETSc build variables
  Include bmake/common/variables
Also use PETSc build rules
  Include bmake/common/base
Also makes available 3rd party packages


Extending Classes: Layering over PETSc
SLEPc, TAO, and MagPar
Infrastructure and linear algebra
Use PETSc object structure
  Dynamic dispatch
Use dynamic linking facilities
  Runtime type selection
Use debugging and profiling tools
  Memory management, runtime type checking


Extending Classes: Adding an Implementation
See src/ksp/pc/impls/jacobi/jacobi.c
Implement the interface methods
  For Jacobi, PCSetUp(), PCApply(),...
Define a constructor
  Allocate and initialize the class structure
  Fill in the function table
  Must have C linkage
Register the constructor
  See src/ksp/ksp/interface/dlregis.c
  Maps a string (class name) to the constructor
  Usually uses PetscFListAdd()


Extending Classes: Adding a New Wrapper
See src/ts/impls/implicit/pvode/petscpvode.c
Just like an Implementation
  Methods dispatch to 3rd party software
Need to alter local makefile
  Add a requirespackage line
  Add include variable to CPPFLAGS
Usually requires configure additions


Extending Classes: Adding a New Subtype
See src/mat/impls/aij/seq/umfpack/umfpack.c
Have to virtualize methods by hand
Define a constructor
  Change type name first to correct name
  Call MatSetType() for base type
  Replace (and save) overridden methods
  Construct any specific data
Must also define a conversion to the base type
  Only called in destructor right now


Extending Classes: Adding a New Type
See src/ksp/ksp/kspimpl.h
Define a methods structure (interface)
  A list of function pointers
Define a type structure
  First member is PETSCHEADER(struct Ops)
  Possibly other data members common to the type
  A void *data for implementation structures


Extending Classes: Adding a New Type
See src/ksp/ksp/interface/dlregis.c
Define a package initializer (PetscDLLibraryRegister)
  Called when the DLL is loaded
  Also called from generic create if linking statically
  Registers classes and events (see below)
  Registers any default implementation constructors
  Sets up any info or summary exclusions


Extending Classes: Adding a New Type
See src/ksp/ksp/interface/itcreate.c
Define a generic create
  Call package initializer if linking statically
  Call PetscHeaderCreate()
  Initialize any default data members
Define a setType() method
  Call the destructor of any current implementation
  Call the constructor of the given implementation
  Set the type name


Extending Classes: Adding a New Type
Things Swept Under The Rug
  Need setFromOptions(), which allows implementation selection at runtime
  Have to manage the database of registered constructors
  View and destroy functions are handled a little oddly for historical reasons


Part VII: PETSc Optimization


Marc Garbey’s Problem: Problem Domain
[Figure: problem domain with regions labeled ρ = 1 and ρ = 100]


Marc Garbey’s Problem: Problem Statement
Inhomogeneous Laplacian in 2D. Modeled by the partial differential equation
  $\nabla \cdot (\rho \nabla u) = f$ on $\Omega = [0, 1] \times [0, 1]$,
with forcing function
  $f(x, y) = e^{-(1-x)^2/\nu} \, e^{-(1-y)^2/\nu}$,
with Dirichlet boundary conditions
  $u = f(x, y)$ on $\partial\Omega$,
or homogeneous Neumann boundary conditions
  $\hat{n} \cdot \nabla u = 0$ on $\partial\Omega$.


Marc Garbey’s Problem: Revision 1.1
For the initial try, we modify a common PETSc DMMG skeleton:
  Use a simple FD 5-point stencil discretization
  Use a structured grid (DA)
  Use a hierarchical method (DMMG)
  Only implement Dirichlet BC (simple masking)


Marc Garbey’s Problem: Revision 1.2
Now utilize some more PETSc features:
  Add UserContext structure to hold ν and the BC type
    Need to set the context at each DM level
  Add Neumann BC using a MatNullSpace
    Used to project onto the orthogonal complement
    KSPSetNullSpace()
  Set parameters from the command line
    PetscOptionsBegin(), PetscOptionsEnd()
    PetscOptionsScalar(), PetscOptionsString()
    By hand, PetscOptionsGetScalar(), PetscOptionsGetString()
  Fixed scaling for anisotropic grids


Marc Garbey’s Problem: Revision 1.6
Barry fixed the example to converge nicely:
  Set nullspace on all DM levels
    Actually set in the smoother (KSP)
    Same idea as the user context
    Now completely handled by DMMGSetNullSpace()
  Remove the null space component of the rhs
    MatNullSpaceRemove()
    Usually handled by the model
  Add a shift to the coarse grid LU for Neumann BC
    System is singular so augment with the identity
    One extra step if coarse solve is redundant
  Fix weighting for center point of Neumann condition
    Depends on the number of missing points
  Also use PetscOptionsElist() to set BC
    Can provide a nice listbox using the GUI


Marc Garbey’s Problem: DMMG Grids
The user specifies the coarse grid, and then DMMG successively refines it.
In our problem, we begin with a 3 × 3 grid
  We LU factor a 9 × 9 matrix
By level 10, we have a 1025 × 1025 grid
  Our final solution has 1,050,625 unknowns
Set the initial grid using -da_grid_x and -da_grid_y
Set the number of levels using -dmmg_nlevels
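The grid sizes quoted here follow from repeatedly halving the mesh spacing: a grid with n points per side becomes one with 2n - 1 points per side. The short C sketch below reproduces the arithmetic, assuming that refinement rule, level 1 as the coarse grid, and one unknown per grid point; the function names are hypothetical.

```c
#include <assert.h>

/* Points per side on a given level (level 1 is the coarse grid), assuming
   each refinement halves the mesh spacing: n -> 2*n - 1. */
long points_per_side(long coarse, int level)
{
  int k;

  assert(coarse >= 2 && level >= 1);
  for (k = 1; k < level; k++) coarse = 2 * coarse - 1;
  return coarse;
}

/* Unknowns on a square 2D grid with one degree of freedom per point. */
long unknowns(long coarse, int level)
{
  long n = points_per_side(coarse, level);
  return n * n;
}
```

Starting from 3 points per side, level 10 has 1025 points per side and 1025 × 1025 = 1,050,625 unknowns, and the coarse level has the 9 unknowns that are LU factored.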


Marc Garbey’s Problem: Mesh Independence
The iteration number should be independent of the mesh size, or the number of levels.

Levels    Unknowns    KSP Iterations
2         25          2
4         289         2
6         4225        3
8         66049       2
10        1050625     2

We have used Dirichlet BC above


Marc Garbey’s Problem: Viewing the DA
-da_view outputs an ASCII description
-da_view_draw displays the grid
  First the grid itself is drawn
  Global variable numbers are then provided
  Finally ghost variable numbers are shown for error checking
-vec_view_draw draws a contour plot for DA vectors
  The contour grid can be shown with -draw_contour_grid


Marc Garbey’s Problem: Forcing Function
The rhs of our linear system drives the solution:
[Figure: plot of the forcing function]


Marc Garbey’s Problem: Dirichlet Solution


Marc Garbey’s Problem: Neumann Solution


Marc Garbey’s Problem: Multigrid Options
Choose V-cycle or W-cycle using -pc_mg_cycles
Can set the iteration method using -pc_mg_type
  MULTIPLICATIVE, ADDITIVE, FULL, KASKADE
Choose the number of steps in both the up and down smoothers
  -pc_mg_smoothup, -pc_mg_smoothdown
The coarse problem has prefix mg_coarse
  -mg_coarse_pc_type, -mg_coarse_ksp_maxit
Each level k has prefix mg_levels_k
  -mg_levels_1_ksp_type, -mg_levels_2_pc_ilu_fill
Can automatically form coarse operators with the Galerkin process
  -pc_mg_galerkin
  DMMG provides these automatically by interpolation


Part VIII: Future Plans


What is New?
New classes and tools will be added to the existing PETSc framework:
  Unstructured mesh generation, refinement, and coarsening
  Unstructured multilevel algorithms
  A high-level language for specification of weak forms (with FFC)
  Automatic generation of arbitrary finite elements (with FIAT)
  Platform-independent system for configure, build, and distribution


PETSc Parts
To make the new functionality easier for users to understand, PETSc will be divided conceptually into two parts:
  PETSc-AS will contain the components related to algebraic solvers, which is the current PETSc distribution.
  PETSc-CS will contain the support for continuum problems phrased in weak form. These modules will make use of the PETSc-AS modules in our implementation, but this is not strictly necessary.
However, the framework will remain integrated:
  Single release version, version control, and build system
  Each module is self-contained (install only what is necessary)


Build: BuildSystem
- Pure Python and freely available
- Currently handles configure and build
  - Still needs install and distribution
- Handles generated code through ASE
  - Can use md5, timestamp, diff, etc.
- Handles shared libraries for many architectures
  - Linux, Mac OS X, Windows


Mesh: Mesh Capabilities
- 2D Delaunay generation and refinement
  - Triangle
- 3D Delaunay generation and refinement
  - TetGen
- 3D Delaunay generation, coarsening, and refinement
  - Allows quadratic Bezier elements
  - Allows efficient, dynamic mesh update
  - TUMBLE


Mesh: How does Sieve Work?
Application codes should only rarely use low-level operations.
- create, distribute
  - Can create from a file or boundary
- refine, coarsen
  - Can refine or coarsen no matter the mesh source
- overlap, fusion
  - Allow fine-grained parallelism and enable Field operations
  - Implemented by low-level Sieve operations
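As a rough picture of what a low-level Sieve operation looks like, the sketch below stores a mesh as arrows between points (vertex to edge, edge to cell) and answers a height-stratum query like the `heightStratum` call used in the assembly code later in this deck. The class `TinySieve`, its `cone` and `support` accessors, and the helper `twoTriangles` are hypothetical names for illustration, not the Sieve API.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical miniature of a sieve: points are integers, arrows encode covering.
class TinySieve {
public:
  // Record that point `from` covers point `to` (e.g. an edge covers a cell).
  void addArrow(int from, int to) {
    assert(from != to);
    cone_[to].push_back(from);
    support_[from].push_back(to);
    points_.insert(from);
    points_.insert(to);
  }
  // Points with an arrow into p (for a cell: its edges).
  const std::vector<int>& cone(int p) const { return lookup(cone_, p); }
  // Points p has an arrow into (for an edge: the cells containing it).
  const std::vector<int>& support(int p) const { return lookup(support_, p); }
  // Height 0 means nothing is built on top of p, which is true of cells.
  int height(int p) const {
    int h = 0;
    for (int q : support(p)) h = std::max(h, 1 + height(q));
    return h;
  }
  // All points of a given height, in increasing point order.
  std::vector<int> heightStratum(int h) const {
    std::vector<int> out;
    for (int p : points_)
      if (height(p) == h) out.push_back(p);
    return out;
  }

private:
  static const std::vector<int>& lookup(const std::map<int, std::vector<int>>& m, int p) {
    static const std::vector<int> empty;
    auto it = m.find(p);
    return it == m.end() ? empty : it->second;
  }
  std::map<int, std::vector<int>> cone_, support_;
  std::set<int> points_;
};

// Two triangles (cells 0, 1) sharing edge 7; vertices are 2..5, edges 6..10.
TinySieve twoTriangles() {
  TinySieve s;
  const int edges[5][3] = {{6, 2, 3}, {7, 3, 4}, {8, 4, 2}, {9, 3, 5}, {10, 5, 4}};
  for (const auto& e : edges) {
    s.addArrow(e[1], e[0]);
    s.addArrow(e[2], e[0]);
  }
  const int cells[6][2] = {{6, 0}, {7, 0}, {8, 0}, {7, 1}, {9, 1}, {10, 1}};
  for (const auto& c : cells) s.addArrow(c[0], c[1]);
  return s;
}
```

In this toy mesh the height-0 stratum is the two cells, and the support of the shared edge 7 is both cells, which is the kind of adjacency query that higher-level operations such as overlap construction would build on.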


Mesh: How does Field Work?
Application codes should use Fields for any data organized over a mesh.
- setChart, setFiberDimension
  - Set up arbitrary data layout over a mesh
- restrict, update
  - Control data flow between levels of a hierarchy
  - Heart of FEM, Multigrid, Domain Decomposition
- distributeSection, distributeSieve
  - Persistent communication structures for irregular data
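The layout idea behind `setChart` and `setFiberDimension` can be sketched in a few lines: each mesh point in a range gets a number of unknowns, and a prefix sum turns those counts into offsets into one contiguous array. The class below is a hypothetical illustration, not the Sieve Field class; `allocate`, `getOffset`, `restrictPoint`, and `updateAdd` are names assumed here (the last echoes the `updateAdd` call in the deck's assembly code).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical miniature of a field: per-point storage in one flat array.
class TinySection {
public:
  // Declare the half-open range [pStart, pEnd) of mesh points that carry data.
  void setChart(int pStart, int pEnd) {
    assert(pStart <= pEnd);
    pStart_ = pStart;
    dims_.assign(static_cast<std::size_t>(pEnd - pStart), 0);
    offsets_.clear();
    values_.clear();
  }
  // Number of values stored on point p (0 is allowed).
  void setFiberDimension(int p, int dim) {
    assert(dim >= 0);
    dims_.at(index(p)) = dim;
  }
  // Prefix-sum the dimensions into offsets and allocate zeroed storage.
  void allocate() {
    offsets_.assign(dims_.size() + 1, 0);
    for (std::size_t i = 0; i < dims_.size(); ++i) offsets_[i + 1] = offsets_[i] + dims_[i];
    values_.assign(static_cast<std::size_t>(offsets_.back()), 0.0);
  }
  int getOffset(int p) const { return offsets_.at(index(p)); }
  std::size_t size() const { return values_.size(); }
  // Copy out the values attached to point p.
  std::vector<double> restrictPoint(int p) const {
    const std::size_t i = index(p);
    return std::vector<double>(values_.begin() + offsets_.at(i), values_.begin() + offsets_.at(i + 1));
  }
  // Overwrite the values attached to point p.
  void update(int p, const std::vector<double>& v) { write(p, v, false); }
  // Accumulate into the values attached to point p.
  void updateAdd(int p, const std::vector<double>& v) { write(p, v, true); }

private:
  std::size_t index(int p) const { return static_cast<std::size_t>(p - pStart_); }
  void write(int p, const std::vector<double>& v, bool add) {
    const std::size_t i = index(p);
    assert(static_cast<int>(v.size()) == dims_.at(i));
    for (std::size_t k = 0; k < v.size(); ++k) {
      double& slot = values_[static_cast<std::size_t>(offsets_[i]) + k];
      slot = add ? slot + v[k] : v[k];
    }
  }
  int pStart_ = 0;
  std::vector<int> dims_, offsets_;
  std::vector<double> values_;
};
```

Because points may have different dimensions, including zero, the same structure can describe vertex-based, cell-based, or mixed layouts without changing the code that reads and writes values.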


Mesh: Advantages
Sieve/Field operations enable algorithms which are:
- Dimension independent
  - Same partitioning code works in 1D, 2D, and 3D
- Shape independent
  - Same assembly code works for tets and quads
- Discretization independent
  - Same assembly code works for any finite element
  - Only variation is in local integration routine (tied to weak form)
- Transparent to parallelism


Mesh: What is Sieve Good For?
- Parallelization
- Hierarchical, unstructured data layout
  - Solution fields, esp. higher order, adaptive, hybrid
  - Auxiliary fields, e.g. viscosity, energy density
- Complex topologies
  - Sphere, torus
  - Fault systems
  - Block material models
- Complex communication patterns
  - Automatic discovery (∆)
  - Arbitrary domains and data layout
  - Persistent communication structures (PETSc VecScatter)


Mesh: Integration
Two main integration paradigms for Sieve:
- Parallelization of legacy code
  - Model existing data structures
  - Example: PyLith
- Reorganization for new development
  - Better describe common FEM operations
  - Example: EqSim/PyLith integration


Mesh: PyLith in Parallel
PyLith can now handle moderately sized meshes in parallel.
[Figure: mesh courtesy of Carl Gable, LANL]


Mesh: Sieve and PETSc
- Sieve is the unstructured mesh component of PETSc
  - Wrapper obeys the DM semantics
  - Works with PETSc Viewers
- Sieve can be installed separately (almost)
  - FEniCS Project component
- Includes support for global, contiguous vectors
  - Global and local numberings
  - Automatic, persistent scatters


Mesh: Rhs Evaluation

    const patch_type patch = 0;
    const Obj& elements = topology->heightStratum(patch, 0);
    PetscScalar elementVec[NUM_BASIS];

    // Loop over all cells (height 0 points) of the mesh
    for(iterator e_iter = elements->begin(); e_iter != elements->end(); ++e_iter) {
      // Affine map for this cell: origin v0, Jacobian Jac, determinant detJ
      ElementGeometry(mesh, *e_iter, v0, Jac, PETSC_NULL, &detJ);
      PetscMemzero(elementVec, NUM_BASIS*sizeof(PetscScalar));
      for(int q = 0; q < NUM_QUADRATURE_POINTS; q++) {
        // Map the reference quadrature point to real coordinates
        xi = points[q*2+0] + 1.0;
        eta = points[q*2+1] + 1.0;
        x_q = Jac[0]*xi + Jac[1]*eta + v0[0];
        y_q = Jac[2]*xi + Jac[3]*eta + v0[1];
        for(i = 0; i < NUM_BASIS; i++) {
          elementVec[i] += Basis[q*NUM_BASIS+i]*f(x_q, y_q)*weights[q]*detJ;
        }
      }
      // Accumulate the element vector into the field
      field->updateAdd(*e_iter, elementVec);
    }


Mesh: Jacobian Evaluation

    PetscScalar elementMat[NUM_BASIS*NUM_BASIS];

    for(iterator e_iter = elements->begin(); e_iter != elements->end(); ++e_iter) {
      ElementGeometry(mesh, *e_iter, v0, Jac, Jinv, &detJ);
      PetscMemzero(elementMat, NUM_BASIS*NUM_BASIS*sizeof(PetscScalar));
      for(int q = 0; q < NUM_QUADRATURE_POINTS; q++) {
        for(i = 0; i < NUM_BASIS; i++) {
          testDer[0] = Jinv[0]*BasisDers[(q*NUM_BASIS+i)*2+0] + Jinv[2]*BasisD
          testDer[1] = Jinv[1]*BasisDers[(q*NUM_BASIS+i)*2+0] + Jinv[3]*BasisD
          for(j = 0; j < NUM_BASIS; j++) {
            basisDer[0] = Jinv[0]*BasisDers[(q*NUM_BASIS+j)*2+0] + Jinv[2]*Bas
            basisDer[1] = Jinv[1]*BasisDers[(q*NUM_BASIS+j)*2+0] + Jinv[3]*Bas
            elementMat[i*NUM_BASIS+j] += (testDer[0]*basisDer[0] + testDer[1]*basisDer[1])*weights[q]*detJ;
          }
        }
      }
      updateOperator(jac, field, globalOrder, *e_iter, elementMat, ADD_VALUES);
    }
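Both assembly loops rely on an `ElementGeometry` call whose body is not shown. The sketch below is one plausible version for a single 2D triangle, under stated assumptions: the reference triangle has vertices (-1,-1), (1,-1), (-1,1), which matches the `+ 1.0` shift applied to the quadrature points above, and `Jac` and `Jinv` are stored row-major. The function names, the flat `coords` argument, and the one-point quadrature helper are hypothetical; the real routine takes a mesh and a cell.

```cpp
#include <cassert>
#include <cmath>

// coords = {x0, y0, x1, y1, x2, y2}, vertices in counterclockwise order.
// A reference point (r, s) maps to v0 + Jac * (r + 1, s + 1).
void elementGeometry(const double coords[6], double v0[2], double Jac[4],
                     double Jinv[4], double* detJ) {
  v0[0] = coords[0];
  v0[1] = coords[1];
  // The reference edges have length 2, hence the factor 0.5.
  Jac[0] = 0.5 * (coords[2] - coords[0]);
  Jac[1] = 0.5 * (coords[4] - coords[0]);
  Jac[2] = 0.5 * (coords[3] - coords[1]);
  Jac[3] = 0.5 * (coords[5] - coords[1]);
  *detJ = Jac[0] * Jac[3] - Jac[1] * Jac[2];
  assert(*detJ > 0.0);  // a degenerate or clockwise triangle is rejected
  if (Jinv != nullptr) {  // the inverse is optional, as in the Rhs loop
    Jinv[0] = Jac[3] / *detJ;
    Jinv[1] = -Jac[1] / *detJ;
    Jinv[2] = -Jac[2] / *detJ;
    Jinv[3] = Jac[0] / *detJ;
  }
}

// Integrate f over the triangle with a one-point rule: the centroid
// (-1/3, -1/3) with weight 2, the area of the reference triangle.
double integrateOnePoint(double (*f)(double, double), const double coords[6]) {
  double v0[2], Jac[4], detJ;
  elementGeometry(coords, v0, Jac, nullptr, &detJ);
  const double xi = -1.0 / 3.0 + 1.0;
  const double eta = -1.0 / 3.0 + 1.0;
  const double x_q = Jac[0] * xi + Jac[1] * eta + v0[0];
  const double y_q = Jac[2] * xi + Jac[3] * eta + v0[1];
  return f(x_q, y_q) * 2.0 * detJ;
}
```

With this convention `detJ` is half the triangle's area, so a quadrature rule whose weights sum to 2 reproduces the area, and the one-point rule integrates linear functions exactly.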


Discretization: FIAT Integration
FInite element Automatic Tabulator by Rob Kirby
In order to generate a quadrature routine, we need:
- A differential form to integrate
- An element (usually a family and degree) using FIAT
- A quadrature rule
We then produce:
- A C integration routine
- A Python module wrapper
- Optional Ferari optimized routines
- Optional element assembly loop
FIAT is part of the FEniCS project, as is the PETSc Sieve module.


Weak Form Languages: A Language for Weak Forms
In collaboration with
- Anders Logg of Simula
- Rob Kirby of Texas Tech
we have developed a small language for weak forms, based directly on an AST representation.
The FEniCS Form Compiler (FFC) processes each form algebraically, allowing some simplification and optimization, before passing it on to the integration generation routines.


Weak Form Languages: Input Language
We have a simple text language for input, incorporating:
- Arithmetic: +, -, *, /, ^, (), abs(x)
- Coordinate functions: cos(x), exp(x)
- Continuum fields (known and unknown)
- Dual pairing: 〈 , 〉
- Matrix operations: trans(u), det(u), vec(u)
- Differential operators: grad(u), div(u), curl(u)
We must use expression graphs for efficiency.


Weak Form Languages: Examples
- Poisson Equation
- Vector Poisson Equation
- Linear Elasticity
- Stokes Equation
[weak-form expressions not preserved]
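For orientation, the textbook weak form of the Poisson problem $-\Delta u = f$ can be written with the dual pairing and the gradient operator from the input language. This is a sketch assuming a test function $v$ and homogeneous Dirichlet boundary conditions; it is not a transcription of the original slide.

```latex
\langle \nabla v, \nabla u \rangle - \langle v, f \rangle = 0
\quad \text{for all test functions } v
```

In the text input language this would presumably combine `grad(u)`, `grad(v)`, and the dual pairing, though the exact syntax is not shown here.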


Conclusions: References
- Documentation: http://www.mcs.anl.gov/petsc/docs
  - PETSc Users manual
  - Manual pages
  - Many hyperlinked examples
  - FAQ, Troubleshooting info, installation info, etc.
- Publications: http://www.mcs.anl.gov/petsc/publications
  - Research and publications that make use of PETSc
- MPI Information: http://www.mpi-forum.org
- Using MPI (2nd Edition), by Gropp, Lusk, and Skjellum
- Domain Decomposition, by Smith, Bjorstad, and Gropp
