NERSC 2010 Annual Report
National Energy Research Scientific Computing Center
2010 Annual Report

Ernest Orlando Lawrence Berkeley National Laboratory
1 Cyclotron Road, Berkeley, CA 94720-8148

This work was supported by the Director, Office of Science, Office of Advanced Scientific Computing Research of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.
Table of Contents

The Year in Perspective
Research News
Overlooked Alternatives: New research restores photoisomerization and thermoelectrics to the roster of promising alternative energy sources
A Goldilocks Catalyst: Calculations show why a nanocluster may be “just right” for recycling carbon dioxide by converting it to methanol
Down with Carbon Dioxide: Detailed computational models help predict the performance of underground carbon sequestration projects
Touching Chaos: Simulations reveal mechanism behind a plasma instability that can damage fusion reactors
Winds of Climate Change: Cyclones in the warm Pliocene era may have implications for future climate
3D Is the Key: Simulated supernova explosions happen easier and faster when third dimension is added
Surface Mineral: Innovative simulations reveal that Earth’s silica is predominantly superficial
Soggy Origami: Nanotech research on graphene is blossoming at NERSC
An Antimatter Hypernucleus: Discovery opens the door to new dimensions of antimatter
Dynameomics: A new database helps scientists discover how proteins do the work of life
NISE Program Encourages Innovative Research
NERSC Users’ Awards and Honors
NERSC Annual Report 2010-2011

The NERSC Center
Hopper Powers Petascale Science
Other New Systems and Upgrades
Carver and Magellan Clusters
Euclid Analytics Server and Dirac GPU Testbed
NERSC Global Filesystem and HPSS Tape Archive Upgrades
Innovations to Increase NERSC’s Energy Efficiency
Carver Installation Saves Energy, Space, Water
Monitoring Energy Across the Center to Improve Efficiency
Improving HPC System Usability with External Services
Providing New Computational Models
Magellan Cloud Computing Testbed and Evaluation
Developing Best Practices in Multicore Programming
Computational Science and Engineering Petascale Initiative
GPU Testbed and Evaluation
Improving User Productivity
Expanding the Science Gateway Infrastructure
Speeding Up Turnaround Time for the Palomar Transient Factory
NERSC and JGI Join Forces for Genomics Computing
Improving VisIt Performance for AMR Applications
Optimizing HPSS Transfers
Uncovering Job Failures at Scale
Improving Network Transfer Rates
Software Support
Maintaining User Satisfaction
Leadership Changes
Outreach and Training
Research and Development by NERSC Staff
Appendices
Appendix A: NERSC Policy Board
Appendix B: NERSC Client Statistics
Appendix C: NERSC Users Group Executive Committee
Appendix D: Office of Advanced Scientific Computing Research
Appendix E: Advanced Scientific Computing Advisory Committee
Appendix F: Acronyms and Abbreviations
The Year in Perspective
Even while NERSC is settling into the petascale era, we are preparing for the exascale era.

For both the NERSC user community and the center staff, 2010 was truly a transformational year, with an unprecedented number of new systems and upgrades, highlighted by NERSC’s entry into petascale computing. As the primary computing center for the Department of Energy’s Office of Science (DOE SC), our mission is to anticipate and support the computational science needs, in both systems and services, of DOE SC’s six constituent offices. The science we support ranges from basic to mission-focused, from fundamental breakthroughs to practical applications. DOE’s science mission currently includes a renewed emphasis on new energy sources, energy production and storage, and understanding the impacts of energy use on both regional and global climate systems, along with mitigation of those impacts through techniques such as carbon sequestration. A number of major improvements during the past year helped NERSC provide the critical support to meet these research needs.

The most notable improvement was the two-phase installation of Hopper, our newest supercomputer. Hopper Phase 1, a Cray XT5, was installed in October 2009 and entered production use in March 2010. Even with dedicated testing and maintenance times, utilization of Hopper-1 from December 15 to March 1 reached 90%, and significant scientific results were produced even during the testing period.

Hopper Phase 2, a petascale Cray XE6, was fully installed in September 2010 and opened for early use and testing late in the year. NERSC users responded in their usual fashion to this opportunity to run their codes at higher scale while at the same time helping us ensure the machine could meet the growing demands of our user community. Hopper’s potential was evidenced by the 1.05 petaflop/s performance the system achieved running Linpack, making Hopper No. 5 on the November 2010 edition of the TOP500 list of the world’s fastest computers. More importantly, the machine performs representative DOE applications even faster than anticipated due to the low-overhead Cray Gemini network and the fast shared memory communication within the AMD compute nodes.

Hopper represents an important step in a long-term trend of massive growth in on-chip parallelism and a shift away from a pure message passing (MPI) programming model. Hopper has 24-core nodes, in contrast to the dual-core nodes in the original Franklin system, a Cray XT4 system installed only three years earlier.
While the number of cores per node grew by an order of magnitude, the memory per node did not. As a result, it is important to share data structures as much as possible within nodes using some kind of shared memory programming. NERSC is taking a proactive role in evaluating new models and guiding the transition of the user community. In collaboration with Cray, Berkeley Lab’s Computational Research Division, and representative users, NERSC has established a Center of Excellence for Multicore Programming to research programming models for Hopper and beyond. MPI alone can exhaust the available memory on Hopper, but initial research shows that a hybrid MPI + OpenMP model saves significant memory and achieves faster runtime for some applications.

Other new systems and upgrades in 2010 included installation of the Carver and Magellan clusters, the Euclid analytics server, and the Dirac GPU testbed; and upgrades of the NERSC Global Filesystem, the tape archive, and the power supply to the Oakland Scientific Facility. All of these systems and upgrades are described in the NERSC Center section of this report. But it’s worth noting here that these installations required major efforts not only from the Systems Department staff, who planned and coordinated them, but also from the Services Department staff, who are responsible for software installation and user training. The challenge in deploying innovative, maximum-scale systems is always in making them available, usable, and failure tolerant. With their cross-group teamwork, resourcefulness, and commitment, NERSC’s staff excels at meeting those challenges.

In the broader computing community, interest in and use of cloud computing grew significantly. NERSC used the Magellan system as a testbed to evaluate the viability of cloud computing to meet a portion of DOE’s scientific computing demands for small-scale parallel jobs.
The project generated a great deal of interest in the community, judging by the number of media inquiries, speaking invitations, and industry attention Magellan has attracted. As part of a joint effort with Argonne National Laboratory and the Computational Research Division, NERSC is using Magellan to evaluate the costs, benefits, capabilities, and downsides of cloud computing. All of the cost advantages of centralization exist for scientific users as well as businesses using commercial clouds, but science applications require different types of systems and scheduling. The LBNL research showed that even modest-scale scientific workloads slow down significantly (by as much as a factor of 50) on standard cloud configurations, and the slowdowns are worse with more communication-intensive problems and as the parallelism within a job scales up. Even ignoring these performance concerns, the cost of commercial clouds is prohibitive: up to 20 cents per core hour for clouds configured with high speed networking and parallel job schedulers, compared to a total NERSC cost of around 2 cents per core hour, including the cost of NERSC’s scientific consulting services. Still, the cloud computing analysis also revealed some important advantages of the commercial setting in the ability to run tailored software stacks and to obtain computing on demand with very low wait times.

Magellan also captured the attention of DOE’s Joint Genome Institute (JGI), located 20 miles away in Walnut Creek, California. In March 2010, JGI became an early user of Magellan when the genomics center had a sudden need for increased computing resources.
In less than three days, NERSC and JGI staff provisioned and configured a 120-node cluster in Magellan to match the computing environment available on JGI’s local compute clusters, where the data resides. At the same time, staff at both centers collaborated with ESnet network engineers to deploy, in just 24 hours, a dedicated 9 Gbps virtual circuit between Magellan and JGI over ESnet’s Science Data Network (SDN). In April, NERSC and JGI formalized their partnership, with NERSC managing JGI’s six-person systems staff to integrate, operate, and
Research News

With 4,294 users in 2010 from universities, national laboratories, and industry, NERSC supports the largest and most diverse research community of any computing facility within the DOE complex. In their annual allocation renewals, users reported 1,864 refereed publications (accepted or submitted) enabled at least in part by NERSC resources in 2010. These scientific accomplishments are too numerous to cover in this report, but several representative examples are discussed in this section.

Research related to energy and climate included developing a rechargeable heat battery, turning waste heat into electricity, converting carbon dioxide into fuel, resolving uncertainties about carbon sequestration, explaining plasma instability in fusion experiments, and looking to Earth’s distant past to predict our future climate.

Other research accomplishments included new insight into the ignition of core-collapse supernovae, a surprising discovery about the composition of Earth’s lower mantle, controlling the shape and conductivity of graphene nanostructures, exploring a new region of strange antimatter, and developing an important community resource for biological research: a database of protein folding pathways.

Early results from a few NERSC Initiative for Scientific Exploration (NISE) projects are also included below, along with a list of awards earned by NERSC users.
Overlooked Alternatives

New research restores photoisomerization and thermoelectrics to the roster of promising alternative energy sources

Alternative energy research might sound like a search for new ideas, but sometimes it involves putting a new spin on old ideas that have been abandoned or neglected. That’s the case with two recent research projects of Jeffrey Grossman, an associate professor in the Department of Materials Science and Engineering at the Massachusetts Institute of Technology (MIT). Both projects involved collaborations with experimental groups at the University of California at Berkeley, where Grossman led a computational nanoscience research group before joining MIT in 2009, and Lawrence Berkeley National Laboratory.

“The focus of my research group at MIT is on the application and development of cutting-edge simulation tools to understand, predict, and design novel materials with applications in energy conversion, energy storage, thermal transport, surface phenomena, and synthesis,” Grossman says.

His recent collaborations involved two methods of energy conversion and storage that most people have never heard of:

1. Photoisomerization, a thermochemical approach to capturing and storing the sun’s energy which could be used to create a rechargeable heat battery.
2. Thermoelectrics, the conversion of temperature differences into an electric current, which could turn vast quantities of waste heat into usable energy.

Project: Quantum Simulations of Nanoscale Solar Cells
PI: Jeffrey Grossman, Massachusetts Institute of Technology
Senior Investigators: Joo-Hyoung Lee and Vardha Srinivasan, MIT, and Yosuke Kanai, LLNL
Funding: BES, NSF, LBNL, MIT, UCB
Computing Resources: NERSC, Teragrid, LLNL

A Rechargeable Heat Battery

Broadly speaking, there have been two approaches to capturing the sun’s energy: photovoltaics, which turn the sunlight into electricity, or solar-thermal systems, which concentrate the sun’s heat and use it to boil water to turn a turbine, or use the heat directly for hot water or home heating. But there is another approach whose potential was seen decades ago, but which was sidelined because nobody found a way to harness it in a practical and economical way.

This is the thermochemical approach, in which solar energy is captured in the configuration of certain molecules which can then release the energy on demand to produce usable heat. And unlike conventional solar-thermal systems, which require very effective insulation and even then gradually let the heat leak away, the heat-storing chemicals could remain stable for years.

Researchers explored this type of solar thermal fuel in the 1970s, but there were big challenges: Nobody could find a chemical that could reliably and reversibly switch between two states, absorbing sunlight to go into one state and then releasing heat when it reverted to the first state. Such a compound was finally discovered in 1996 by researchers from the University of California at Berkeley and Lawrence Berkeley National Laboratory; but the compound, called fulvalene diruthenium (tetracarbonyl), included ruthenium, a rare and expensive element, so it was impractical for widespread energy storage. Moreover, no one understood how fulvalene diruthenium worked, which hindered efforts to find a cheaper variant.

Recently Peter Vollhardt of UC Berkeley and Berkeley Lab’s Chemical Sciences division (the discoverer of the molecule) has begun construction of a working device as proof of principle. However, a basic understanding of what makes this particular molecule special was still missing. In order to tackle this question, Vollhardt teamed up with Grossman, who did quantum mechanical calculations at NERSC to find out exactly how the molecule accomplishes its energy storage and release.

Specifically, the theorists used the Quantum ESPRESSO code to perform density functional theory (DFT) calculations on 256 processors of the Franklin Cray XT4 supercomputer at NERSC, with additional computation at Lawrence Livermore National Laboratory.

The calculations at NERSC showed that the fulvalene diruthenium molecule, when it absorbs sunlight, undergoes a structural transformation called photoisomerization, putting the molecule into a higher-energy or charged state where it can remain stable indefinitely. Then, triggered by a small addition of heat or a catalyst, it snaps back to its original shape, releasing heat in the process (Figure 1).
But the team found that the process is a bit more complicated than that.

“It turns out there’s an intermediate step that plays a major role,” said Grossman. In this intermediate step, the molecule forms a semistable configuration partway between the two previously known states (Figure 2). “That was unexpected,” he said. The two-step process helps explain why the molecule is so stable, why the process is easily reversible, and also why substituting other elements for ruthenium has not worked so far.

In effect, explained Grossman, this makes it possible to produce a “rechargeable heat battery” that can repeatedly store and release heat gathered from sunlight or other sources. In principle, Grossman said, a fuel made from fulvalene diruthenium, when its stored heat is released, “can get as hot as 200 degrees C, plenty hot enough to heat your home, or even to run an engine to produce electricity.”

Figure 1. A molecule of fulvalene diruthenium, seen in diagram, changes its configuration when it absorbs heat, and later releases heat when it snaps back to its original shape. This series shows the reaction pathway from the high-energy state back to the original state. Image: J. Grossman.

Compared to other approaches to solar energy, he said, “it takes many of the advantages of solar-thermal energy, but stores the heat in the form of a fuel. It’s reversible, and it’s stable over a long term. You can use it where you want, on demand. You could put the fuel in the sun, charge it up, then use the heat, and place the same fuel back in the sun to recharge.”

Figure 2. DFT calculations revealed an unexpected second barrier along the reaction pathway from the charged state to the original state. The two-step process helps explain the molecule’s stability, and the relative barrier heights play a crucial role in its functionality. [Energy diagram labels: Initial Molecule (Stored Heat), Intermediate State (Means Two Barriers), Heat Release, Original State.]

In addition to Grossman, the theoretical work was carried out by Yosuke Kanai of Lawrence Livermore National Laboratory and Varadharajan Srinivasan of MIT, and supported by complementary experiments by Steven Meier and Vollhardt of UC Berkeley. Their report on the work, which was funded in part by the National Science Foundation, by the Sustainable Products and Solutions Program at UC Berkeley, and by an MIT Energy Initiative seed grant, was published on October 20, 2010, in the journal Angewandte Chemie International Edition.¹

The problem of ruthenium’s rarity and cost still remains as “a dealbreaker,” Grossman said, but now that the fundamental mechanism of how the molecule works is understood, it should be easier to find other materials that exhibit the same behavior. This molecule “is the wrong material, but it shows it can be done,” he said.

The next step, he said, is to use a combination of simulation, chemical intuition, and databases of tens of millions of known molecules to look for other candidates that have structural similarities and might exhibit the same behavior.
“It’s my firm belief that as we understand what makes this material tick, we’ll find that there will be other materials” that will work the same way, Grossman said.

Grossman plans to collaborate with Daniel Nocera, a professor of energy and chemistry at MIT, to tackle such questions, applying the principles learned from this analysis in order to design new, inexpensive materials that exhibit this same reversible process. The tight coupling between computational materials design and experimental synthesis and validation, he said, should further accelerate the discovery of promising new candidate solar thermal fuels.

Soon after the Angewandte Chemie paper was published, Roman Boulatov, assistant professor of chemistry at the University of Illinois at Urbana-Champaign, said of this research that “its greatest accomplishment is to overcome significant challenges in quantum-chemical modeling of the reaction,” thus enabling the design of new types of molecules that could be used for energy storage. But he added that other challenges remain: “Two other critical questions would have to be solved by other means, however. One, how easy is it to synthesize the best candidates? Second, what is a possible catalyst to trigger the release of the stored energy?”

¹ Y. Kanai, V. Srinivasan, S. K. Meier, K. P. C. Vollhardt, and J. C. Grossman, “Mechanism of thermal reversal of the (fulvalene)tetracarbonyldiruthenium photoisomerization: Toward molecular solar–thermal energy storage,” Angewandte Chemie Int. Ed. 49, 8926 (2010).
Recent work in Vollhardt’s laboratory has provided answers to both questions. Vollhardt and other Berkeley Lab experimentalists, including Rachel Segalman and Arun Majumdar (now director of the U.S. Department of Energy’s Advanced Research Projects Agency-Energy, or ARPA-E), have successfully constructed a microscale working device of fulvalene diruthenium and have applied for a U.S. patent. Using both theoretical and experimental results, they are working to maximize the device’s efficiency and demonstrate that thermochemical energy storage is workable in a real system.

Turning Waste Heat to Electricity

Automobiles, industrial facilities, and power plants all produce waste heat, and a lot of it. In a coal-fired power plant, for example, as much as 60 percent of the coal’s energy disappears into thin air. One estimate places the worldwide waste heat recovery market at one trillion dollars, with the potential to offset as much as 500 million metric tons of carbon per year.

One solution to waste heat has been an expensive boiler-turbine system; but for many companies and utilities, it does not recover enough waste heat to make economic sense. Bismuth telluride, a semiconductor material, has shown promise by employing a thermoelectric principle called the Seebeck effect to convert heat into an electric current, but it’s also problematic: toxic, scarce, and expensive, with limited efficiency.

What’s the solution? Junqiao Wu, a materials scientist at UC Berkeley and Berkeley Lab, believes the answer lies in finding a new material with spectacular thermoelectric properties that can efficiently and economically convert heat into electricity.
(Thermoelectrics is the conversion of temperature differences in a solid, that is, differences in the amplitude of atomic vibration, into an electric current.)

Computations performed on NERSC’s Cray XT4, Franklin, by Wu’s collaborators Joo-Hyoung Lee and Grossman of MIT, showed that the introduction of oxygen impurities into a unique class of semiconductors known as highly mismatched alloys (HMAs) can substantially enhance the thermoelectric performance of these materials without the usual degradation in electric conductivity. The team published a paper on these results in the journal Physical Review Letters.²

² Joo-Hyoung Lee, Junqiao Wu, and Jeffrey C. Grossman, “Enhancing the thermoelectric power factor with highly mismatched isoelectronic doping,” Physical Review Letters 104, 016602 (2010).

“We are predicting a range of inexpensive, abundant, non-toxic materials in which the band structure can be widely tuned for maximal thermoelectric efficiency,” says Wu. “Specifically, we’ve shown that the hybridization of electronic wave functions of alloy constituents in HMAs makes it possible to enhance thermopower without much reduction of electric conductivity, which is not the case for conventional thermoelectric materials.”

In 1821, the physicist Thomas Johann Seebeck observed that a temperature difference between two ends of a metal bar created an electrical current in between, with the voltage being directly proportional to the temperature difference. This phenomenon became known as the Seebeck thermoelectric effect, and it holds great promise for capturing and converting into electricity some of the vast amounts of heat now being lost in the turbine-driven production of electrical power. For this lost heat to be reclaimed, however, thermoelectric efficiency must be significantly improved.

“Good thermoelectric materials should have high thermopower, high electric conductivity, and low thermal conductivity,” says Wu. “Enhancement in thermoelectric performance can be achieved by reducing thermal conductivity through nanostructuring. However, increasing performance by increasing thermopower has proven difficult because an increase in thermopower has typically come at the cost of a decrease in electric conductivity.”

Wu knew from earlier research that a good thermoelectric material needed to have a certain type of density of states, which is a mathematical description of distribution of energy levels within a semiconductor. “If the density of states is flat or gradual, the material won’t have very good thermoelectric properties,” Wu says. “You need it to be spiky or peaky.”

Wu also knew that a specialized type of semiconductor called highly mismatched alloys (HMAs) could be very peaky because of their hybridization, the result of forcing together two materials that don’t want to mix atomically, akin to mixing water and oil. HMAs are formed from alloys that are highly mismatched in terms of electronegativity, which is a measurement of their ability to attract electrons. Their properties can be dramatically altered with only a small amount of doping. Wu hypothesized that mixing two semiconductors, zinc selenide and zinc oxide, into an HMA, would produce a peaky density of states.

In their theoretical work, Lee, Wu, and Grossman discovered that this type of electronic structure engineering can be greatly beneficial for thermoelectricity. Working with the semiconductor zinc selenide, they simulated the introduction of two dilute concentrations of oxygen atoms (3.125 and 6.25 percent respectively) to create model HMAs (Figure 3). In both cases, the oxygen impurities were shown to induce peaks in the electronic density of states above the conduction band minimum. It was also shown that charge densities near the density of state peaks were substantially attracted toward the highly electronegative oxygen atoms (Figure 4).

Figure 3. Contour plots showing electronic density of states in HMAs created from zinc selenide by the addition of (a) 3.125 percent oxygen atoms, and (b) 6.25 percent oxygen. The zinc and selenium atoms are shown in light blue and orange. Oxygen atoms (dark blue) are surrounded by high electronic density regions. Image: J.-H. Lee, J. Wu, and J. Grossman. [Color scale range: (a) 0.000 to 0.04996; (b) 0.000 to 0.07756.]

Figure 4. The projected density of states (DOS) of ZnSe₁₋ₓOₓ with x = 3.125%. Image: J.-H. Lee, J. Wu, and J. Grossman. [Axes: DOS (states/eV-cell) versus E − E_F (eV); curves: O s, O p, O d, Zn s, Zn p, Zn d.]

The researchers found that for each of the simulation scenarios, the impurity-induced peaks in the electronic density of states resulted in a sharp increase of both thermopower and electric conductivity compared to oxygen-free zinc selenide. The increases were by factors of 30 and 180 respectively.

“Furthermore, this effect is found to be absent when the impurity electronegativity matches the host that it substitutes,” Wu says. “These results suggest that highly electronegativity-mismatched alloys can be designed for high performance thermoelectric applications.” Wu and his research group are now working to actually synthesize HMAs for physical testing in the laboratory.

In addition to capturing energy that is now being wasted, Wu believes that HMA-based thermoelectrics can also be used for solid state cooling, in which a thermoelectric device is used to cool other devices or materials. “Thermoelectric coolers have advantages over conventional refrigeration technology in that they have no moving parts, need little maintenance, and work at a much smaller spatial scale,” Wu says.
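The thermoelectric quantities discussed in this section can be tied together with a small numerical sketch. It is an illustration only, not the researchers' calculation: it assumes the textbook relations V = S·ΔT for the Seebeck voltage and PF = S²·σ for the thermoelectric power factor (the quantity named in the title of the Physical Review Letters paper). The function names, the 32-site supercell, and the example Seebeck coefficient are hypothetical choices made for illustration. The article reports the factors of 30 and 180 separately and does not state a combined figure, so the product computed here is only what those two factors would imply if both applied under the same conditions.

```python
def seebeck_voltage(seebeck_v_per_k: float, delta_t_k: float) -> float:
    """Open-circuit voltage V = S * dT, assuming a constant Seebeck coefficient S."""
    return seebeck_v_per_k * delta_t_k


def power_factor(seebeck_v_per_k: float, conductivity_s_per_m: float) -> float:
    """Thermoelectric power factor PF = S^2 * sigma, in W/(m*K^2)."""
    return seebeck_v_per_k ** 2 * conductivity_s_per_m


def power_factor_gain(thermopower_gain: float, conductivity_gain: float) -> float:
    """Ratio PF_new / PF_old when S and sigma are each scaled by the given factors."""
    return thermopower_gain ** 2 * conductivity_gain


def dopant_percent(n_dopant: int, n_sites: int) -> float:
    """Percentage of substitution sites occupied by dopant atoms."""
    return 100.0 * n_dopant / n_sites


# Enhancement factors quoted in the text for thermopower and electric conductivity.
THERMOPOWER_GAIN = 30
CONDUCTIVITY_GAIN = 180

if __name__ == "__main__":
    # Implied power-factor ratio, ASSUMING both quoted factors hold simultaneously.
    print("implied power factor gain:", power_factor_gain(THERMOPOWER_GAIN, CONDUCTIVITY_GAIN))

    # The quoted oxygen concentrations match 1 and 2 substitutions out of 32 sites;
    # the 32-site supercell is an assumption, not stated in the article.
    print("1 of 32 sites:", dopant_percent(1, 32), "percent")
    print("2 of 32 sites:", dopant_percent(2, 32), "percent")

    # Hypothetical example: S = 200 microvolts per kelvin across a 100 K difference.
    print("Seebeck voltage (V):", seebeck_voltage(200e-6, 100.0))
```

Because thermopower enters the power factor squared, a gain in thermopower counts for more than an equal gain in conductivity, which is one way to read Wu's emphasis on a peaky density of states.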
A Goldilocks Catalyst
Calculations show why a nanocluster may be “just right” for recycling carbon dioxide by converting it to methanol

Carbon dioxide (CO₂) emissions from fossil fuel combustion are major contributors to global warming. Since CO₂ comes from fuel, why can’t we recycle it back into fuel rather than releasing it into the atmosphere? An economical way to synthesize liquid fuel from CO₂ would help mitigate global warming while reducing our need to burn fossil fuels. Catalysts that induce chemical reactions could be the key to recycling carbon dioxide, but they need to have just the right activity level: too much, and the wrong chemicals are produced; too little, and they don’t do their job at all.

Recycling CO₂ into alcohols such as methanol (CH₃OH) or ethanol (C₂H₅OH) by adding hydrogen would be sustainable if the hydrogen were produced by a renewable technology such as solar-thermal water splitting, which separates the hydrogen and oxygen in water. But synthesizing fuel by CO₂ hydrogenation requires a catalyst, and the most successful catalysts so far have been rhodium compounds. With a spot price averaging more than $4000 an ounce over the last five years, rhodium is too expensive to be practical on a large scale.

Molybdenum (“moly”) sulfide is a more affordable catalyst, and experiments have identified it as a possible candidate for CO₂ hydrogenation; but bulk moly disulfide (MoS₂) only produces methane. Other metals can be added to the catalyst to produce alcohols, but even then, the output is too low to make the process commercially viable.

Project: Computational Studies of Supported Nanoparticles in Heterogeneous Catalysis
PI: Ping Liu, Brookhaven National Laboratory
Senior Investigators: YongMan Choi, BNL
Funding: BES, BNL
Computing Resources: NERSC, BNL

Nanoparticles, however, often have higher catalytic activity than bulk chemicals.
Designing moly sulfide nanoparticles optimized for alcohol synthesis would depend on finding answers to basic questions about the chemical reaction mechanisms.

Many of those questions were answered in a paper featured on the cover of the March 25, 2010 special issue of the Journal of Physical Chemistry A devoted to “Green Chemistry in Energy Production” (Figure 5). In this paper, Brookhaven National Laboratory (BNL) chemist Ping Liu, with collaborators YongMan Choi, Yixiong Yang, and Michael G. White from BNL and the State University of New York (SUNY) Stony Brook, used density functional theory (DFT) calculations conducted at BNL and NERSC to investigate methanol synthesis using a moly sulfide nanocluster (Mo₆S₈) as the catalyst instead of the bulk moly sulfide powder used in previous experiments.³
“This study might seem like pure research, but it has significant implications for real catalysis,” Liu says. “If you can recycle CO₂ and produce some useful fuels, that would be very significant. And that’s what DOE has always asked people to do, to find some new alternatives for future fuel that will eliminate the environmental footprint.”

The reason bulk moly disulfide only produces methane (CH₄) is, ironically, that it is too active a catalyst. “Catalysts are very tricky materials,” Liu says. “You don’t want it too active, which will break the CO bond and only form methane, an undesired product for liquid fuel production. If the catalyst is too inert, it can’t dissociate CO₂ or H₂, and do the trick. So a good catalyst should have the right activity to do this catalysis.”

Figure 5. The top left, bottom left, and top right images illustrate aspects of methanol synthesis on a moly sulfide nanocluster. (The other two images are from different studies.) Image: © 2010 American Chemical Society.

From DFT calculations run using the VASP code on 16 to 32 cores of NERSC’s Franklin system, Liu and colleagues discovered that the Mo₆S₈ nanocluster has catalytic properties very different from bulk MoS₂. This is not surprising, given the different geometries of the molecules: the cluster has a more complex, cagelike structure (Figure 6).
The cluster is not as reactive as the bulk (it cannot break the carbon–oxygen bond in CO₂), and therefore it cannot produce methane. But, using a completely different reaction pathway (Figure 7), the cluster does have the ability to produce methanol.

“In this case, the nanocluster is not as active as the bulk in terms of bonding; but it has higher activity, or you could say higher selectivity, to synthesize methanol from CO₂ and H₂,” Liu explains.

Figure 7 shows the details of the reaction pathway from carbon dioxide and hydrogen to methanol and water, as catalyzed by the Mo₆S₈ cluster, including the intermediate states, energetics, and geometries of the transition states, all of which were revealed for the first time by this research.

This detailed understanding is important for optimizing the catalyst’s efficiency. For example, lowering the energy barrier in the CO to HCO reaction would allow increased production of methanol. Adding a C–C bond could result in synthesis of ethanol, which is usable in many existing engines. Both of these improvements might be achieved by adding another metal to the cluster, a possibility that Liu and her theorist colleagues are currently working on in collaboration with Mike White, an experimentalist.

The goal of catalyst research is to find a “Goldilocks” catalyst: one that does neither too much nor too little, but is “just right” for a particular application. By revealing why a specific catalyst enables a specific reaction, studies like this one provide the basic understanding needed to design the catalysts we need for sustainable energy production.

Figure 6. Optimized edge surface of (a) MoS₂ and (b) Mo₆S₈ cluster (Mo = cyan, S = yellow). Image: Liu et al., 2010. [Panel labels: S-edge, Mo-edge.]

³ Ping Liu, YongMan Choi, Yixiong Yang, and Michael G. White, “Methanol synthesis from H₂ and CO₂ on a Mo₆S₈ cluster: A density functional study,” J. Phys. Chem. A 114, 3888 (2010).
Figure 7. Optimized potential energy diagram for methanol synthesis from CO₂ and H₂ on a Mo₆S₈ cluster. The vertical axis is energy (eV). Species labeled in the diagram: CO₂ + 3H₂, *CO₂, *HOCO, *CO + *H₂O, *CO, *HCO, *H₂CO, *H₃CO, *H₃COH, and H₃COH. Thin bars represent the reactants, products, and intermediates; thick bars stand for the transition states. The geometries of the transition states involved in the reaction are also included (Mo = cyan, S = yellow, C = gray, O = red, H = white). Image: Liu et al., 2010.
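To make the link between a lower energy barrier and increased production concrete, here is a minimal sketch based on the standard Arrhenius relation, k = A·exp(−Ea / kB·T). It is not taken from the study: the barrier reduction (0.1 eV) and temperature (500 K) are hypothetical values chosen only for illustration, and the prefactor A is assumed to stay the same.

```python
import math

# Boltzmann constant in eV/K.
K_B_EV_PER_K = 8.617333262e-5


def rate_speedup(barrier_drop_ev: float, temperature_k: float) -> float:
    """Factor by which an Arrhenius rate grows when the barrier falls.

    Assumes k = A * exp(-Ea / (kB * T)) with an unchanged prefactor A,
    so the ratio of rates depends only on the change in Ea.
    """
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    return math.exp(barrier_drop_ev / (K_B_EV_PER_K * temperature_k))


# Hypothetical numbers, not values reported by Liu and colleagues.
speedup = rate_speedup(barrier_drop_ev=0.1, temperature_k=500.0)
print(f"Rate increases by a factor of about {speedup:.1f}")
```

Under these assumptions, even a modest reduction in the barrier of the slowest step changes the rate by roughly an order of magnitude, which is why identifying that step matters for catalyst design.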
Down with Carbon Dioxide
Detailed computational models help predict the performance of underground carbon sequestration projects

Despite progress in clean energy, Americans will continue to rely on fossil fuels for years to come. In fact, coal-, oil- and natural gas-fired power plants will generate 69 percent of U.S. electricity as late as 2035, according to the U.S. Energy Information Administration.

Such sobering projections have sparked a wide range of proposals for keeping the resulting carbon dioxide (CO₂) out of the atmosphere, where it traps heat and contributes to global warming. Berkeley Lab scientists are using computer simulations to evaluate one promising idea: pump it into saltwater reservoirs deep underground.

Underground, or geologic, carbon sequestration “will be a key tool in reducing atmospheric CO₂,” says George Pau, a Luis W. Alvarez Postdoctoral Fellow with Berkeley Lab’s Center for Computational Sciences and Engineering (CCSE). “By providing better characterizations of the processes involved, we can more accurately predict the performance of carbon sequestration projects, including the storage capacity and long-term security of a potential site.”

Taking advantage of NERSC’s massively parallel computing capacity, CCSE researchers led by John Bell have created the most detailed models yet of the mixing processes that occur at the interface of the sequestered carbon dioxide and brine.⁴ These simulations, including the first-ever three-dimensional ones, will help scientists better predict the success of this kind of sequestration project.

⁴ G. S. H. Pau, J. B. Bell, K. Pruess, A. S. Almgren, M. J. Lijewski, and K. Zhang, “High resolution simulation and characterization of density-driven flow in CO₂ storage in saline aquifers,” Advances in Water Resources 33, 443 (2010).

Project: Simulation and Analysis of Reacting Flows
PI: John B. Bell, Lawrence Berkeley National Laboratory
Senior Investigators: Ann S. Almgren, Marcus S. Day, LBNL
Funding: ASCR, BES
Computing Resources: NERSC, LBNL

Selecting a Site

Carbon sequestration ideas run a wide gamut, from mixing CO₂ into cement to pumping it into the deepest ocean waters. However, geologic sequestration projects are promising enough that some are already under way (Figure 8). In this type of sequestration, CO₂ from natural gas-, oil-, or coal-fired power
plants is pressure-injected deep into stable geologic formations. “The natural first picks [for these sites] are depleted oil and gas reservoirs,” says Karsten Pruess, a hydrologist with Berkeley Lab’s Earth Sciences Division and a co-author of the study.

However, the problem with this “put-it-back-where-it-comes-from” notion is that combustion generates about three times more CO₂, by volume, than the fossil fuels burned. “So, even if all these old reservoirs were good targets (and not all are), the capacity available isn’t nearly enough,” Pruess concludes.

Saline aquifers offer a promising alternative. The brine that permeates these porous rock formations is unusable for crops or drinking, but given enough time, it can lock away some CO₂ in the form of a weak acid. Meanwhile, the solid-rock lid that prevents saline from welling up to the surface (the caprock) also traps injected CO₂ underground.

Figure 8. Geologic sequestration in saline aquifers (3) is shown in this illustration alongside other geologic sequestration ideas. Image: Australian Cooperative Research Centre for Greenhouse Gas Technologies (CO2CRC).

Geological Storage Options for CO₂
1 Depleted oil and gas reservoirs
2 Use of CO₂ in enhanced oil recovery
3 Deep unused saline water-saturated reservoir rocks
4 Deep unmineable coal seams
5 Use of CO₂ in enhanced coal bed methane recovery
6 Other suggested options (basalts, oil shales, cavities)

According to Pruess, when CO₂ is first injected into an aquifer, it forms a separate layer, a bubble of sorts, above the saline. Over time, some CO₂ will diffuse into the brine, causing the brine density to increase slightly. As more of the CO₂ diffuses into the water, the resulting layer of carbonic acid (now less buoyant than the water below it) sinks. This churns up fresh brine, and a convective process kicks in
Figure 8 legend: produced oil or gas; injected CO₂; stored CO₂. © CO2CRC
which dramatically increases the rate of CO₂ uptake, compared to diffusion alone (Figures 9 and 10). This process can take hundreds, even thousands of years, but scientists haven’t fully quantified the factors that influence and trigger convection.

“The processes by which CO₂ dissolves into the brine are of serious consequence,” says Pau. “If you can determine the rate this will occur, then you can more accurately determine how much CO₂ can be stored in a given aquifer.”

Because scientists can’t wait hundreds of years for answers, they are using NERSC supercomputers to run mathematical models that simulate the processes involved. Geological models that break the problem into fixed-size pieces cannot reveal phenomena needed to understand sequestration because they either have insufficient detail or require too much computation to be practical. “About 100 meters resolution is the best we can do,” Pruess says. Because convection effects start at scales smaller than a millimeter, they are typically invisible to these geological models.

Cracking the Sequestration Code

That’s why Berkeley Lab’s CCSE and Pruess collaborated to develop a new simulation code. The code combines a computing technique called adaptive mesh refinement (AMR) with high performance parallel computing to create high-resolution simulations.
The team’s simulations used 2 million processor-hours, running on up to 2,048 cores simultaneously on NERSC’s Cray XT4 system, Franklin.

Called Porous Media AMR, or PMAMR, the code concentrates computing power on more active areas of a simulation by breaking them into finer parts (higher resolution) while calculating less active, less critical portions in coarser parts (lower resolution). The size of the chunks, and thus the resolution, automatically shifts as the model changes, ensuring that the most active, critical processes are also the most highly resolved. This ability to automatically shift focus makes AMR a powerful tool for modeling phenomena that start small and grow over time, says Pruess.

The first PMAMR models were of modest size, on the scale of meters. But the eventual goal is to be able to use the characteristics of an aquifer (such as the size of the pores in the sandstone, the amount of salts in the water, or the composition of the reservoir rocks) to model a given site and predict how much CO₂ it can accommodate.

3D simulations of carbon sequestration are a first step toward predicting how much CO₂ an aquifer can hold.

Figure 9. As CO₂ diffuses into the brine, the top, more concentrated layers (indicated in red) begin to sink, triggering a convective process that churns up fresh brine and speeds the salt water’s CO₂ uptake. The arrows in the frame at the right indicate the direction of flow. Understanding convection onset is critical to understanding how much CO₂ a given aquifer can store. Image: George Pau.
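The refinement idea described above can be illustrated with a toy one-dimensional example. This is only a sketch of the general principle and not the PMAMR code: the function names, the refinement criterion (the jump in a field across a cell), and the tanh-shaped concentration front are all assumptions made for illustration.

```python
import math


def refine_mesh(field, left, right, n_coarse, threshold, max_level):
    """Return a list of (x_left, x_right, level) cells covering [left, right].

    A cell is split in two whenever the field changes by more than
    `threshold` across it, up to `max_level` levels of refinement.
    """
    width = (right - left) / n_coarse
    cells = [(left + i * width, left + (i + 1) * width, 0)
             for i in range(n_coarse)]
    changed = True
    while changed:
        changed = False
        refined = []
        for x0, x1, level in cells:
            jump = abs(field(x1) - field(x0))
            if jump > threshold and level < max_level:
                mid = 0.5 * (x0 + x1)
                refined.append((x0, mid, level + 1))
                refined.append((mid, x1, level + 1))
                changed = True
            else:
                refined.append((x0, x1, level))
        cells = refined
    return cells


def sharp_front(x):
    # Hypothetical concentration profile with a steep front at x = 0.5.
    return 0.5 * (1.0 + math.tanh((x - 0.5) / 0.01))


cells = refine_mesh(sharp_front, 0.0, 1.0,
                    n_coarse=8, threshold=0.1, max_level=5)
finest = max(level for _, _, level in cells)
print(len(cells), "cells; finest level", finest)
```

In this sketch the cells far from the front stay coarse, while the cells straddling the front are subdivided repeatedly. A production AMR code works on two- or three-dimensional patches and regrids as the solution evolves, but the resource trade-off is the same.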
PMAMR simulations might also be used to help geologists more accurately track and predict the migration of hazardous wastes underground or to recover more oil from existing wells. And the team’s three-dimensional simulations could help scientists uncover details not apparent in two-dimensional models.

The next challenge the team faces involves extending and integrating their models with large-scale models. “If you look at how much CO₂ is made by a large coal-fired power plant, and you want to inject all that underground for 30 years, you end up with a CO₂ plume on the order of 10 km in dimension,” says Pruess. This free CO₂ will continue to migrate outward from the injection site and, if the aquifer has a slope, could eventually reach drinking water aquifers or escape at outcrops hundreds of kilometers away. “We are very interested in the long-term fate of this thing,” says Pruess.

Figure 10. These three-dimensional simulation snapshots show the interface between sequestered CO₂ and brine over time. Contours show CO₂; vectors show velocity. The sequence, starting at the upper left frame and proceeding counterclockwise, shows a dramatic evolution in the size of the convection cells with time. Image: George Pau.
Simulations reveal mechanism behind a plasma instability that can damage fusion reactors

If humans could harness nuclear fusion, the process that powers stars like our Sun, the world could have an inexhaustible energy source. In theory, scientists could produce a steady stream of fusion energy on Earth by heating up two types of hydrogen atoms (deuterium and tritium) to more than 100 million degrees centigrade until they become a gaseous stew of electrically charged particles, called plasma; then use powerful magnets to confine these particles until they fuse together, releasing energy in the process.

Although researchers have achieved magnetic fusion in experiments, they still do not understand the behavior of plasma well enough to generate a sustainable flow of energy. They hope the international ITER experiment, currently under construction in southern France, will provide the necessary insights to make fusion an economically viable alternative energy source. ITER, an industrial-scale reactor, will attempt to create 500 megawatts of fusion power for several-minute stretches. Much current research is focused on answering fundamental questions that will allow engineers to optimize ITER’s design.

Project: 3D Extended MHD Simulation of Fusion Plasmas
PI: Stephen Jardin, Princeton Plasma Physics Laboratory
Senior Investigators: Joshua Breslau, Jin Chen, Guoyong Fu, Wonchull Park, PPPL; Henry R. Strauss, New York University; Linda Sugiyama, Massachusetts Institute of Technology
Funding: FES, SciDAC
Computing Resources: NERSC

The DOE’s Scientific Discovery through Advanced Computing (SciDAC) Center for Extended Magnetohydrodynamic (MHD) Modeling, led by Steve Jardin of the Princeton Plasma Physics Laboratory, has played a vital role in developing computer codes for fusion modeling.
Recently, the collaboration created an extension to the Multilevel 3D (M3D) computer code that allows researchers to simulate what happens when charged particles are ejected from a hot plasma stew and splatter on the walls of a tokamak device, the doughnut-shaped “pot” used to magnetically contain plasmas. According to Linda Sugiyama of the Massachusetts Institute of Technology, who led the code extension development, these simulations are vital for ensuring the safety and success of upcoming magnetic confinement fusion experiments like ITER.

Modeling Fusion to Ensure Safety

Because hot plasma is extremely dangerous, it is imperative that researchers understand plasma instability so that they can properly confine it. For fusion to occur, charged particles must be heated to more than 100 million degrees centigrade. At these searing temperatures, the material is too hot to contain with most earthly materials. In tokamaks, the plasma is shaped into a torus,
or doughnut shape, and held in place within a vacuum by powerful magnetic fields, so that the plasma does not touch the surface of the containment vessel.

Tokamak plasmas typically have one or two magnetic “X-points” (so called because the magnetic field lines form an X shape) at the top or bottom of the plasma surface. X-points help promote efficient fusion: by channeling stray particles along the magnetic field lines to divertors designed to handle them, the X-points help control the plasma shape and prevent instabilities.

Until they don’t.

For three decades, tokamak researchers have observed periodic disruptive instabilities, called edge localized modes (ELMs), on the edge of the plasma. ELMs allow blobs of plasma to break out of the magnetic confinement through the protective vacuum and splatter onto the tokamak wall before the plasma stabilizes and resumes its original shape. The number of unstable pulses affects how much plasma is thrown onto the walls, which in turn determines the extent of wall damage.

The origins and dynamics of ELMs (or of stable plasma edges, for that matter) have long resisted theoretical explanation. A breakthrough came when Sugiyama and a colleague, Henry Strauss of the Courant Institute of Mathematical Sciences at New York University, used the M3D extension on NERSC computers to simulate the development of the magnetic fields on a tokamak plasma surface. They reported their results in an article featured on the cover of the June 2010 issue of Physics of Plasmas (Figure 11).⁵

Figure 11. This ELM research was featured on the cover of the June 2010 issue of Physics of Plasmas. Image: ©2010, American Institute of Physics.

“Studies of nonlinear dynamics show that if you have X-points on the plasma boundary, which is standard for high temperature fusion plasmas, particles can fly out and hit the walls. But no one had ever seen what actually happens inside the plasma if you let it go,” says Sugiyama.
“This is the first time that we have seen a stochastic magnetic tangle, a structure well known from Hamiltonian chaos theory, generated by the plasma itself. Its existence also means that we will have to rethink some of our basic ideas about confined plasmas.”

⁵ L. E. Sugiyama and H. R. Strauss, “Magnetic X-points, edge localized modes, and stochasticity,” Physics of Plasmas 17, 062505 (2010).

What Sugiyama and Strauss discovered is that ELMs constitute a new class of nonlinear plasma instability involving the mutual influence of growing plasma structures and the magnetic field. A minor instability anywhere on a plasma boundary that possesses an X-point can perturb the magnetic field, causing it to become chaotic. As the plasma instability grows, magnetic chaos increases, allowing the plasma bulges to penetrate deep into the plasma and also to break out of the boundary surface, resulting in a loss of plasma (Figures 12 and 13). This demonstration that the plasma motion can couple to a chaotic magnetic field may help explain the large range of ELM and ELM-free edge behaviors in fusion plasmas, as well as clarify some properties of magnetic confinement.

Sugiyama notes that the primary causes of plasma instability vary in the different plasma regions of the tokamak, including the core, edge, and surrounding vacuum. To understand the most dangerous of these instabilities, computational physicists model each plasma region on short time and space scales. The plasma regions are strongly coupled through the magnetic and electric fields, and researchers must carry out an integrated simulation that shows how all the components interact in the big picture, over long time scales.

“Modeling plasma instabilities is computationally challenging because these processes occur on widely differing scales and have unique complexities,” says Sugiyama. “This phenomenon couldn’t be investigated without computers. Compared to other research areas, these simulations are not computationally
large. The beauty of using NERSC is that I can run my medium-size jobs for a long time until I generate all the time steps I need to see the entire process accurately.”

She notes that the longest job ran on 360 processors for 300 hours on NERSC’s Cray XT4 Franklin system. She also ran numerous other jobs on the facility’s Franklin and DaVinci systems using anywhere from 432 to 768 processors for about 200 CPU hours.

“I greatly appreciate the NERSC policy of supporting the work required to scale up from small jobs to large jobs, such as generating input files and visualizing the results,” says Sugiyama. “The center’s user consultants and analytics group were crucial to getting these results.”

Her ongoing research has two major goals. One is to verify the simulation model against experimental observations. “This is proving surprisingly difficult,” Sugiyama says, “primarily because the instability is extremely fast compared to most experimental measurements (changes occur over a few microseconds), and it is very difficult to measure most things occurring inside a hot dense plasma, since material probes are destroyed immediately. We have good contacts with all the major U.S. experiments and some international ones, but have been waiting for good data.”

Her second goal is to characterize the ELM threshold in a predictive fashion for future fusion plasmas, including new experiments such as ITER. This involves studying small ELMs and ELM-stable plasmas, where there are other, smaller MHD instabilities. Demonstrating the predictive value of this ELM model, besides being a significant theoretical achievement, will help bring practical fusion energy one step closer to realization.

Figure 12. Temperature surface near the plasma edge shows helical perturbation aligned with the magnetic field. Image: L. E. Sugiyama and the M3D Team.

Figure 13 (panels a through e). This image shows a simulation of the initial stage of a large edge instability in the DIII-D tokamak. The top row shows contours of the plasma temperature in a cross section of the torus with its central axis to the left. The bottom row shows corresponding density. The vacuum region between plasma and wall is grey; plasma expands rapidly into this region and hits the outer wall, then gradually subsides back to its original shape. Image: L. E. Sugiyama and H. R. Strauss.
Winds of Climate Change
Cyclones in the warm Pliocene era may have implications for future climate

Scientists searching for clues to Earth’s future climate are turning to its dim past, the Pliocene epoch. “We’re interested in the Pliocene because it’s the closest analog we have in the past for what could happen in our future,” says Chris Brierley, a Yale climate researcher computing at NERSC.

Brierley and Yale colleague Alexey Fedorov are using simulations to help unravel a mystery that has bedeviled climatologists for years: Why was the Pliocene, under conditions similar to today, so much hotter? Their most recent work suggests that tropical cyclones (also called hurricanes or typhoons) may have played a crucial role.⁶

Working with Kerry Emanuel of MIT, the Yale researchers ran computer simulations that revealed twice as many tropical cyclones during the Pliocene; storms that lasted two to three days longer on average (compared to today); and, unlike today, cyclones that occurred across the entire tropical Pacific Ocean (Figure 14). This wide-ranging distribution of storms disrupted a key ocean circulation pattern, warming seas and creating atmospheric conditions that encouraged more cyclones.

This positive feedback loop, they posit, is key to explaining the Pliocene’s higher temperatures and persistent El Niño-like conditions.
Their findings, featured on the cover of the journal Nature (Figure 15), could have implications for the planet’s future as global temperatures continue to rise due to climate change.

Project: Tropical Cyclones and the Shallow Pacific Overturning Circulation
PI: Alexey Fedorov, Yale University
Senior Investigator: Christopher Brierley, Yale University
Funding: BER, NSF, David and Lucile Packard Foundation
Computing Resources: NERSC

What’s the Pliocene Got to Do with It?

The Pliocene Epoch (the period between 5 and 3 million years ago) is especially interesting to climatologists because conditions then were similar to those of today’s Earth, including the size and position of land masses, the levels of sunlight, and, crucially, atmospheric concentrations of the greenhouse gas carbon dioxide (CO₂). “In geological terms, it was basically the same,” Brierley says.

⁶ Alexey V. Fedorov, Christopher M. Brierley, and Kerry Emanuel, “Tropical cyclones and permanent El Niño in the early Pliocene epoch,” Nature 463, 1066 (2010).
Yet the Pliocene was puzzlingly hotter. Global mean temperatures ranged up to 4 °C (about 8 °F) higher than today; sea levels were 25 meters higher; and sea ice was all but absent. The poles heated up, yet temperatures in the tropics did not exceed modern-day levels. And the planet was locked into a continuous El Niño-like state, as opposed to the on/off oscillation experienced today.

Fedorov and Brierley suspected cyclones played a role. “We knew from a previous study that additional vertical mixing [in the ocean] can cause warming in the Niño region” of the eastern tropical Pacific, says Brierley. “Hurricanes were one of the few sources of mixing that we could envisage changing,” he explains.

Cyclones Stir the Pot

To test their idea, the team first simulated atmospheric flows for modern Earth and for the Pliocene (see sidebar). Because most other factors were basically the same, the main difference between the two was the Pliocene’s higher sea surface temperatures.

They then randomly seeded vortices and tracked the results. “We found there were twice as many tropical cyclones during this period [versus modern times], and that they lasted longer,” two to three days longer on average, says Brierley.

Hurricane paths shifted, too (Figure 16). “Rather than starting in the far western Pacific, hitting Japan, then heading back out into the northern Pacific, they were all over the place,” striking all along two bands north and south of the equator, says Brierley.

The next step was to integrate the extra ocean-churning action of all those hurricanes into a more complete climate model. To do that, the team applied an idealized pattern of hurricane ocean-mixing to a fully coupled climate model, one that includes ocean, sea ice, land cover, and atmosphere.

Today, cold water originating off the coasts of California and Chile skirts the region of tropical cyclone activity on its way to the Equator, where it surfaces in a “cold tongue” that stretches west off the coast of South America. El Niño events, which affect precipitation and temperature patterns worldwide, are characterized by an uptick in the temperature of the cold tongue. It only took about 50 model years for the cold tongue to begin warming in the researchers’ Pliocene model. After 200 years, El Niño-like warming was persistent (Figure 17).

Brierley, Fedorov, and Emanuel suggest tropical cyclones were to blame, at least in part. More numerous and more widely distributed across the globe, the cyclones disrupted the cold tongue, raising ocean temperatures and leading to changes in the atmosphere that favored the formation of even more tropical storms. The cycle repeated, helping to sustain the Pliocene’s higher overall temperatures.

“We’re suggesting that one of the reasons for the differences [between the Pliocene and today] is this feedback loop that isn’t operating in our present climate,” says Brierley.

Figure 14. Simulated hurricane tracks under modern conditions (left) and Pliocene (right) overlay sea surface temperatures created by a fully coupled climate model. Images: Chris M. Brierley.

Figure 15. These simulations were featured on the cover of the 25 February 2010 issue of Nature. Image ©2010 Macmillan Publishers Limited.
Figure 16. Simulated hurricane tracks in (a) modern climate and (b) early Pliocene climate, plotted by longitude and latitude (degrees), illustrate a two-year subsample from 10,000 simulated tropical cyclones. Colors indicate strength from tropical depression (blue) to category-5 hurricane (red). Images: Chris M. Brierley.

Powering Climate Models

Calculating the complex, multifaceted interactions of land, sea, atmosphere, and ice requires specialized and massive computing power and storage. NERSC’s Cray XT4, Franklin, and the NERSC High Performance Storage System (HPSS) both played important roles in Fedorov and Brierley’s enquiries.

For this investigation, the team used 125,000 processor hours on Franklin to run fully coupled climate models. “The current resources that we have locally achieve a rate of about three model years per day, whereas Franklin can achieve around 25 years per day,” says Brierley. Models run on Franklin in just a month would have taken six months or more locally. “It’s not just the number and speed of individual processors, but the speed at which they can pass information” that makes it feasible to run simulations spanning hundreds of years, Brierley says.

The team was also able to get their models up and running quickly because of Franklin’s pre-optimized, pre-installed climate modeling code. “I only had to set up the problem to do the science, rather than dealing with all the trouble of getting the code running first,” says Brierley.

Because of NERSC’s mass storage facility, HPSS, “I was able to store the vast amounts of data generated and then only take the fields I need at the time to a local machine to run visualization software,” Brierley says.
Safely stored, the datasets can be used again in the future, so scientists do not have to repeat computationally expensive runs.
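The throughput figures Brierley quotes in this sidebar can be turned into wall-clock estimates with a few lines of arithmetic. The sketch below uses only the two rates from the text (about 3 and about 25 model years per day). The 200-model-year run length and the 30-day month are illustrative assumptions, and the function name is hypothetical.

```python
def wall_clock_days(model_years: float, model_years_per_day: float) -> float:
    """Days of wall-clock time needed to simulate `model_years`."""
    return model_years / model_years_per_day


LOCAL_RATE = 3.0      # model years per day on local resources (from the text)
FRANKLIN_RATE = 25.0  # model years per day on Franklin (from the text)

speedup = FRANKLIN_RATE / LOCAL_RATE

# Illustrative run length: 200 model years (an assumption for this sketch).
run_length = 200.0
local_days = wall_clock_days(run_length, LOCAL_RATE)
franklin_days = wall_clock_days(run_length, FRANKLIN_RATE)

# Model years completed in a 30-day month on Franklin, and the time
# the same work would take at the local rate.
month_of_work = FRANKLIN_RATE * 30.0
local_equivalent_days = wall_clock_days(month_of_work, LOCAL_RATE)

print(f"Speedup: about {speedup:.1f}x")
print(f"{run_length:.0f} model years: {local_days:.0f} days locally, "
      f"{franklin_days:.0f} days on Franklin")
print(f"One month on Franklin is about {local_equivalent_days:.0f} days locally")
```

At these rates, a month of Franklin time corresponds to roughly eight months locally, which is consistent with the “six months or more” quoted above.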
Will History Repeat?

The researchers don’t know if modern-day climate will eventually develop the same feedback loop, because they aren’t sure what set it off in the first place. “There must have been a trigger, some sort of tipping point, but we don’t yet know what that may have been,” Brierley explains.

Fedorov cautions against drawing direct comparisons between our future climate and this past epoch. The current scientific consensus projects that the number of intense hurricanes will increase under global warming, but the overall number will actually fall, he says. “However, unless we understand the causes of these differences, we will not be sure whether our projections are correct,” explains Fedorov. “Changes in the frequency and distribution of these storms could be a significant component of future climate conditions.”

For their next step, the team is refining and rerunning their models at NERSC. “We’ve done all this assuming that you can average the behavior of hurricanes into one annual mean signal,” says Brierley. The team is working on more realistic hurricane mixing, “where you have a very strong forcing for a short amount of time followed by a long quiescence,” says Brierley. “We need to know if this makes a significant difference in our models.”

Figure 17. Sea surface temperature changes in the tropical Pacific as simulated by the coupled model are shown here. In the top frame is a baseline of temperatures using pre-industrial CO₂ levels. The center frame shows the amount of warming (over baseline) under Pliocene CO₂ levels with the water-churning effects of intense hurricane activity. The bottom frame shows the effects of higher CO₂ levels alone.
After 200 model years, the Pliocene scenario (center) shows strong warming in the eastern equatorial Pacific, the signature of El Niño-like conditions. Images: Chris M. Brierley.

Tropical cyclones disrupted the “cold tongue” in the Pacific Ocean, raising ocean temperatures and leading to changes in the atmosphere that favored the formation of even more tropical storms.
3D Is the Key
Simulated supernova explosions happen easier and faster when third dimension is added

For scientists, supernovae are true superstars: massive explosions of huge, dying stars that shine light on the shape and fate of the universe. Recorded observations of supernovae stretch back thousands of years, but only in the past 50 years have researchers been able to attempt to understand what’s really happening inside a supernova via computer modeling. These simulations, even when done in only one or two dimensions, can lead to new information and help address longstanding problems in astrophysics.

Now researchers have found a new way to make three-dimensional computer simulations of “core collapse” supernova explosions (one of two major types). Writing in the Sept. 1, 2010 issue of the Astrophysical Journal, Jason Nordhaus and Adam Burrows of Princeton University, and Ann Almgren and John Bell of Lawrence Berkeley National Laboratory report that the new simulations are beginning to match the massive blowouts astronomers have witnessed when gigantic stars die.⁷

Project: SciDAC Computational Astrophysics Consortium
PI: Stan Woosley, University of California, Santa Cruz
Senior Investigators: Adam Burrows, Princeton University; Peter Nugent, John Bell, and Ann Almgren, Lawrence Berkeley National Laboratory; Michael Zingale, State University of New York, Stony Brook; Louis Howell and Robert Hoffman, Lawrence Livermore National Laboratory; Alex Heger, University of Minnesota; Daniel Kasen, University of California, Santa Cruz

The new simulations are based on the idea that the collapsing star is not spherical but distinctly asymmetrical and affected by a host of instabilities in the volatile mix surrounding its core (Figure 18).
The simulations were made possible by bigger and faster supercomputers and a new code that can take advantage of their power.

The researchers performed these 3D simulations with approximately 4 million computer processor hours on 8,000 to 16,000 cores of NERSC’s Cray XT4 Franklin system, using a new multidimensional radiation/hydrodynamic code called CASTRO, the development of which was led by Almgren and Bell of Berkeley Lab’s Center for Computational Sciences and Engineering.

Funding: HEP, SciDAC, NSF
Computing Resources: NERSC, TIGRESS, NICS, TACC, ALCF

⁷ J. Nordhaus, A. Burrows, A. Almgren, and J. Bell, “Dimension as a key to the neutrino mechanism of core-collapse supernova explosions,” Astrophysical Journal 720, 694 (2010).

In the past, simulated explosions represented in one and two dimensions often stalled, leading scientists to conclude that their understanding of the physics was incorrect or incomplete. Using the new CASTRO code on supercomputers many times more powerful than their predecessors, this team was able to create
supernova simulations in 3D that were ~50% easier to explode than in 1D, and exploded significantly faster than in 2D. This dimensional effect is much larger than that of a variety of other physical effects such as nuclear burning, inelastic scattering, and general relativity.

“It may well prove to be the case that the fundamental impediment to progress in supernova theory over the last few decades has not been lack of physical detail, but lack of access to codes and computers with which to properly simulate the collapse phenomenon in 3D,” the team wrote. “This could explain the agonizingly slow march since the 1960s toward demonstrating a robust mechanism of explosion.”

Burrows, a professor of astrophysical sciences at Princeton, commented: “I think this is a big jump in our understanding of how these things can explode. In principle, if you could go inside the supernovae to their centers, this is what you might see.”

Figure 18. Isoentropy surface of the explosion of a massive star. The snapshot is taken ~50 milliseconds after the explosion. The multi-dimensional turbulent character of the blast is manifest. Image: J. Nordhaus, A. Burrows, A. Almgren, and J. Bell.

Birth of a Supernova

Supernovae are the primary source of heavy elements in the cosmos. Most result from the death of single stars much more massive than the sun.

As a star ages, it exhausts the hydrogen and helium fuel at its core. With still enough mass and pressure to fuse carbon and produce other heavier elements, it gradually becomes layered like an onion with the bulkiest tiers at its center. Once its core exceeds a certain mass, it begins to implode. In the squeeze, the core heats up and grows even denser.

“Imagine taking something as massive as the sun, then compacting it to something the size of the Earth,” Burrows said. “Then imagine that collapsing to something the size of Princeton.”

What comes next is even more mysterious.

At some point, the implosion reverses.
Astrophysicists call it “the bounce.” The core material stiffens up, acting like what Burrows calls a “spherical piston,” emitting a shock wave of energy. Neutrinos, which are inert particles, are emitted too. The shock wave and the neutrinos are invisible.

Then there is a massive and very visible explosion, and the star’s outer layers are ejected into space (Figure 19). This stage is what observers see as the supernova. What’s left behind is an ultra-dense object called a neutron star. Sometimes, when an ultramassive star dies, a black hole is created instead.

Scientists have a sense of the steps leading to the explosion, but there is no consensus about the mechanism of the bounce. Part of the difficulty is that no one can see what is happening on the inside of a star. During this phase, the star looks undisturbed. Then, suddenly, a blast wave erupts on the surface. Scientists don’t know what occurs to make the central region of the star instantly unstable. The emission of neutrinos is believed to be related, but no one is sure how or why.
“We don’t know what the mechanism of explosion is,” Burrows said. “As a theorist who wants to get to root causes, this is a natural problem to explore.”

Multiple Scientific Approaches

The scientific visualization employed by the research team is an interdisciplinary effort combining astrophysics, applied mathematics, and computer science. The endeavor produces a representation, through computer-generated images, of three-dimensional phenomena. In general, researchers employ visualization techniques with the aim of making realistic renderings of quantitative information including surfaces, volumes, and light sources. Time is often an important component too, allowing researchers to create movies showing a simulated process in motion.

To do their work, Burrows and his colleagues came up with mathematical values representing the energetic behaviors of stars by using mathematical representations of fluids in motion, the same partial differential equations solved by geophysicists for climate modeling and weather forecasting. To solve these complex equations and simulate what happens inside a dying star, the team used the CASTRO code, which took into account factors that changed over time, including fluid density, temperature, pressure, gravitational acceleration, and velocity. The calculations took months to process on supercomputers at NERSC, Princeton, and other centers.

The simulations are not ends unto themselves, Burrows noted. Part of the learning process is viewing the simulations and connecting them to real observations. In this case, the most recent simulations are uncannily similar to the explosive behavior of stars in their death throes witnessed by scientists. In addition, scientists often learn from simulations and see behaviors they had not expected.

“Visualization is crucial,” Burrows said. “Otherwise, all you have is merely a jumble of numbers. Visualization via stills and movies conjures the entire phenomenon and brings home what has happened. It also allows one to diagnose the dynamics, so that the event is not only visualized, but understood.”

To help them visualize these results, the team relied on Hank Childs of the NERSC Analytics Team. Nordhaus noted that Childs played an important role in helping the team create 3D renderings of these supernovae.

“The Franklin supercomputer at NERSC gives us the most bang for the buck,” said Nordhaus. “This system not only has enough cores to run our 3D simulations, but our code scales extremely well on this platform. Having the code developers, John Bell and Ann Almgren, nearby to help us scale our simulations on the supercomputer is an added bonus.”

Figure 19. Evolutionary sequence of blast morphology for four different times after the bounce (0.156, 0.201, 0.289, and 0.422 seconds). The scale is more than 2000 km on a side. (Panels are labeled Time = 0.156 s, 0.201 s, 0.290 s, and 0.422 s, each with a 2000-kilometer scale bar.) Image: J. Nordhaus, A. Burrows, A. Almgren, and J. Bell.
Innovative simulations reveal that Earth’s silica is predominantly superficial

Silica is one of the most common minerals on Earth. Not only does it make up two-thirds of our planet’s crust, it is also used to create a variety of materials from glass to ceramics, computer chips, and fiber optic cables. Yet new quantum mechanics results generated by a team of physicists from Ohio State University (OSU) and elsewhere show that this mineral only populates our planet superficially; in other words, silica is relatively uncommon deep within the Earth.

Using several of the largest supercomputers in the nation, including NERSC’s Cray XT4 Franklin system, the team simulated the behavior of silica in high-temperature, high-pressure environments that are particularly difficult to study in a lab. These details may one day help scientists predict complex geological processes like earthquakes and volcanic eruptions. Their results were published in the May 25, 2010 edition of the Proceedings of the National Academy of Sciences (PNAS).⁸

“Silica is one of the simplest and most common minerals, but we still don’t understand everything about it. A better understanding of silica on a quantum-mechanical level would be useful to earth science, and potentially to industry as well,” says Kevin Driver, an OSU graduate student who was corresponding author of the paper. “Silica adopts many different structural forms at different temperatures and pressures, not all of which are easy to study in the lab.”

Project: Modeling Dynamically and Spatially Complex Materials
PI: John Wilkins, Ohio State University

Over the past century, seismology and high-pressure laboratory experiments have revealed a great deal about the general structure and composition of the Earth. For example, such work has shown that the planet’s interior structure exists in three layers called the crust, mantle, and core (Figure 20).
The outer two layers, the mantle and the crust, are largely made up of silicates, minerals containing silicon and oxygen. Still, the detailed structure and composition of the deepest parts of the mantle remain unclear. These details are important for complex geodynamical modeling that may one day predict large-scale events, such as earthquakes and volcanic eruptions.

Funding: BES, NSF
Computing Resources: NERSC, NCAR, TeraGrid, NCSA, OSC, CCNI

⁸ K. P. Driver, R. E. Cohen, Zhigang Wu, B. Militzer, P. López Ríos, M. D. Towler, R. J. Needs, and J. W. Wilkins, “Quantum Monte Carlo computations of phase stability, equations of state, and elasticity of high-pressure silica,” PNAS 107, 9519 (2010).
Figure 20. Cross-section of the Earth. Labeled layers: crust, mantle, liquid outer core, and solid inner core.

Driver notes that even the role that the simplest silicate, silica, plays in the Earth’s mantle is not well understood. “Say you’re standing on a beach, looking out over the ocean. The sand under your feet is made of quartz, a form of silica containing one silicon atom surrounded by four oxygen atoms. But in millions of years, as the oceanic plate below becomes subducted and sinks beneath the Earth’s crust, the structure of the silica changes dramatically,” he says.

As pressure increases with depth, the silica molecules crowd closer together, and the silicon atoms start coming into contact with more oxygen atoms from neighboring molecules. Several structural transitions occur: in low-pressure forms each silicon atom is surrounded by four oxygen atoms, and in higher-pressure forms by six (Figure 21). With even more pressure, the structure collapses into a very dense form of the mineral, which scientists call the alpha-lead dioxide form.

Driver notes that it is this form of silica that likely resides deep within the Earth, in the lower part of the mantle, just above the planet’s core. When scientists try to interpret seismic signals from that depth, they have no direct way of knowing what form of silica they are dealing with. They must rely on high-pressure experiments and computer simulations to constrain the possibilities. Driver and colleagues use a particularly high-accuracy, quantum mechanical simulation method to study the thermodynamic and elastic properties of different silica forms, and then compare the results to the seismic data.

In PNAS, Driver, his advisor John Wilkins, and their coauthors describe how they used a quantum mechanical method to design computer algorithms that would simulate the silica structures. When they did, they found that the behavior of the dense, alpha-lead dioxide form of silica did not match up with any global seismic signal detected in the lower mantle. This result indicates that the lower mantle is relatively devoid of silica, except perhaps in localized areas where oceanic plates have subducted.

“As you might imagine, experiments performed at pressures near those of Earth’s core can be very challenging. By using highly accurate quantum mechanical simulations, we can offer reliable insight that goes beyond the scope of the laboratory,” says Driver.

A New Application of Quantum Monte Carlo

The team’s work was one of the first to show that quantum Monte Carlo (QMC) methods could be used to study complex minerals.
QMC does not rely on density functional calculations, in which approximations can produce inaccurate results in these kinds of studies; instead, QMC explicitly treats the electrons and their interactions via a stochastic solution of Schrödinger’s equation. Although QMC algorithms have been used for over half a century to simulate atoms, molecules, gases, and simple solids, Driver notes that applying them to a complex solid like silica was simply too labor- and computer-intensive until recently.

“In total, we used the equivalent of six million CPU hours to model four different states of silica. Three million of those CPU hours involved using NERSC’s Franklin system to calculate a shear elastic constant for silica with the QMC method. This is the first time it had ever been done,” said Driver. The Franklin results allowed him to measure how silica deforms at different temperatures and pressures.

“From computing hardware to the consulting staff, the resources at NERSC are really excellent,” says Driver. “The size and speed of the center’s machines is something that I don’t normally have access to at other places.”

Wilkins comments, “This work demonstrates both the superb contributions a single graduate student can make, and that the quantum Monte Carlo method can compute nearly every property of a mineral over a wide range of pressures and temperatures. The study will stimulate a broader use of quantum Monte Carlo worldwide to address vital problems.”

Wilkins and his colleagues expect that quantum Monte Carlo will be used more often in materials science in the future, as the next generation of computers goes online.

Coauthors on the paper included Ronald Cohen of the Carnegie Institution of Washington; Zhigang Wu of the Colorado School of Mines; Burkhard Militzer of the University of California, Berkeley; and Pablo López Ríos, Michael Towler, and Richard Needs of the University of Cambridge.

Figure 21. The structure of a six-fold coordinated silica phase called stishovite, a prototype for many more complex mantle minerals. Stishovite is commonly created in diamond anvil cells and often found naturally where meteorites have slammed into Earth. Driver and coauthors computed the shear elastic constant softening of stishovite under pressure with quantum Monte Carlo using over 3 million CPU hours on Franklin. Image: Kevin Driver.
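To give a feel for what a “stochastic solution of Schrödinger’s equation” means, here is a minimal sketch of variational Monte Carlo, one simple member of the QMC family. It is a toy model and not the team’s calculation: it treats a single particle in a one-dimensional harmonic well (in units where mass, ħ, and frequency are 1) rather than the electrons in silica. The trial wavefunction exp(-alpha·x²), the function names, the step size, and the sample count are all assumptions made for illustration.

```python
import math
import random


def local_energy(x: float, alpha: float) -> float:
    """Local energy (H psi) / psi for psi(x) = exp(-alpha * x**2),
    with H = -1/2 d^2/dx^2 + 1/2 x^2 (toy harmonic oscillator)."""
    return alpha + x * x * (0.5 - 2.0 * alpha * alpha)


def vmc_energy(alpha: float, n_steps: int = 50_000,
               step: float = 1.0, seed: int = 0) -> float:
    """Estimate the energy of the trial wavefunction by Metropolis sampling
    of |psi|^2 and averaging the local energy over the samples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    x = 0.0
    total = 0.0
    for _ in range(n_steps):
        trial = x + rng.uniform(-step, step)
        # Accept the move with probability |psi(trial)|^2 / |psi(x)|^2.
        if rng.random() < math.exp(-2.0 * alpha * (trial * trial - x * x)):
            x = trial
        total += local_energy(x, alpha)
    return total / n_steps


if __name__ == "__main__":
    for a in (0.3, 0.5, 0.8):
        print(f"alpha = {a:.1f}  energy estimate = {vmc_energy(a):.4f}")
```

For this toy problem the exact ground state corresponds to alpha = 0.5, where the local energy is constant and the estimate has no statistical noise; other values of alpha give higher energies, which is the variational principle that such methods exploit. Production QMC studies of solids involve many interacting electrons and far more elaborate wavefunctions, which is why they consume millions of CPU hours.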
Nanotech research on graphene is blossoming at NERSC

In a 1959 lecture titled “There’s Plenty of Room at the Bottom,” physicist and future Nobel laureate Richard Feynman predicted that humans would create increasingly smaller and more powerful machines by tapping into the quantum physics that drives atoms and molecules. Nearly 50 years later this prediction is realized in the field of materials nanoscience, where some researchers are designing materials atom by atom to build devices with unprecedented capabilities.

In fact, assistant professor Petr Král and his team of graduate students from the University of Illinois at Chicago (UIC) are making headway in this arena by using supercomputers at NERSC to investigate the molecular and atomic properties of graphene, a single-atom layer of carbon densely packed in a honeycomb pattern. Discovered in 2004, graphene is the most conductive material known to humankind and could be used to make the next generation of transistors, computer memory chips, and high-capacity batteries.

Project: Multiscale Modeling of Nanofluidic Systems
PI: Petr Král, University of Illinois at Chicago
Funding: BES, NSF
Computing Resources: NERSC

Manipulating the “Miracle Material” with Water

Electrons blaze through graphene faster than in any other conductor, making it ideal for use in a variety of high-tech devices. With 200 times the breaking strength of steel, graphene is also stronger and stiffer than a diamond. The application potential for this material is so promising that the two scientists who discovered it, Andre Geim and Konstantin Novoselov, won the 2010 Nobel Prize in Physics.

But before this “miracle material” can be widely implemented in new technologies, researchers must figure out how to control its shape. That’s exactly what Král’s research team did.
In an article published in Nano Letters⁹ and highlighted in Nature’s “News and Views” section,¹⁰ Král and his former UIC graduate students Niladri Patra and Boyang Wang reported that nanodroplets of water could be used to shape sheets of graphene. Using the scalable NAMD code to study molecular dynamics and SIESTA to calculate electron structure on NERSC’s Franklin system, they found that nanodroplets of water remove the energy barrier that prevents graphene from folding.

Because atoms exposed on the surface of materials tend to be less stable than those buried within, placing a water droplet on graphene (which is just a single layer of carbon atoms) will cause the material to deform so that the number of exposed atoms is minimized and the system’s energy is reduced. Although spontaneous folding is not a characteristic unique to wet graphene, the computational models also showed that researchers could control how the material deforms. Being able to easily shape graphene is essential to its application in future technologies.

Figure 22. A droplet of water (pink) causes the “petals” of a flower-shaped graphene ribbon (green) about 13 nanometers across to fold up around it. Image: N. Patra, B. Wang, and P. Král.

According to Král, a graphene sheet will naturally wrap around a water droplet. By positioning nanodroplets of water on graphene with guided microscopic needles, scientists can make the carbon sheet roll, bend, and slide; the material can also be manipulated into a variety of complex forms like capsules, sandwiches, knots, and rings, to serve as the foundation for nanodevices.

The end product depends on the graphene’s initial shape and the water droplet’s diameter. For example, placing a droplet of water in the center of the “petals” of a flower-shaped graphene ribbon causes the “flower” to close (Figure 22). A droplet placed on one end of a graphene ribbon causes the free end to wrap around it, thus folding the ribbon in half (Figure 23). Another study, which will appear in a forthcoming issue of ACS Nano, shows that placing a graphene ribbon on a nanotube causes the ribbon to wrap itself around the tube, even in solvents (Figure 24).¹¹

Figure 23. Placing a droplet of water on one end of a graphene ribbon causes the free end to wrap around it, thus folding the ribbon in half. Image: N. Patra, B. Wang, and P. Král.

Figure 24. Placing a graphene nanoribbon on a nanotube causes it to fold on the nanotube surface, even in the presence of nonpolar solution. Image: N. Patra, Y. Song and P. Král.

“Until these results it wasn’t thought that we could controllably fold these structures,” says Král. “Now, thanks to NERSC, we know how to shape graphene using weak forces between nanodroplets of water.”

Controlling graphene’s shape and conductivity is essential to its application in future technologies.

Manipulating the Conductivity of Graphene

In addition to shaping graphene, Král and another UIC graduate student, Artem Baskin, used large-scale density functional theory ab initio calculations at NERSC to confirm a belief long held by scientists: that they could manipulate the conductivity of graphene by strategically inserting holes into it.¹² The simulations revealed that the arrangement of these pores and their size could be used to control graphene’s conductivity, and that changing the distance between these pores can cause the material to switch from metal to semiconducting porous graphene (Figure 25).

In this project they explored the atomic characteristics of porous graphene-based materials, like nanoribbons, superlattices, and nanotubes. Based on their results, Baskin and Král propose a phenomenological model that can be used to predict the band structure features of porous nanocarbons without performing the time-consuming quantum mechanical calculations. The duo successfully tested this model on a number of systems and recently submitted a paper with this recommendation to Physical Review Letters.

Figure 25. These graphs show the energies of electronic states (E − E_f, in eV) as a function of their momentum (k, in units of π/a). When the curves touch each other in the center, the graphene is a metal; otherwise it is a semiconductor. The graphs on the right side show densities of states (DOS, in arbitrary units). AGNR is armchair graphene nanoribbon and ZGNR is a zigzag. Band structure is shown for: (a) pristine 11-AGNR, (b) 11-AGNR with centered pore, (c) 11-AGNR with shifted pore, (e) pristine 10-ZGNR, (f) 10-ZGNR with centered pore, (g) 10-ZGNR with shifted pore. The notation 11- or 10- here means the width of the ribbons, namely, 11-AGNR and 10-ZGNR are nanoribbons of 5 honeycomb carbon rings width, as shown in the insets of (d) and (h). Image: A. Baskin and P. Král.

The team used NERSC resources for algorithm testing as well as modeling. “NERSC resources have been indispensable for us,” says Král. “Our projects require long multiprocessor calculations, and access to the center’s computation time was very important for progressing our research in a timely manner.”

⁹ Niladri Patra, Boyang Wang, and Petr Král, “Nanodroplet activated and guided folding of graphene nanostructures,” Nano Letters 9, 3766 (2009).
¹⁰ Vincent H. Crespi, “Soggy origami,” Nature 462, 858 (2009).
¹¹ N. Patra, Y. Song, and P. Král, “Self-assembly of graphene nanostructures on nanotubes,” ACS Nano (in press).
¹² A. Baskin and P. Král, “Electronic structures of porous nanocarbons,” Physical Review Letters (submitted).
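The reading rule given in the Figure 25 caption (touching curves mean a metal, separated curves mean a semiconductor) can be expressed as a few lines of code. This is only an illustrative sketch: the function names, the tolerance, and the sample energies are hypothetical and are not taken from Baskin and Král’s calculations. Energies are assumed to be measured relative to the Fermi level, in eV, at a common set of k points.

```python
def band_gap(valence, conduction):
    """Gap in eV between the highest valence-band energy and the lowest
    conduction-band energy; zero if the bands touch or overlap."""
    return max(0.0, min(conduction) - max(valence))


def classify(valence, conduction, tol=1e-3):
    """Label a band structure as 'metal' when the gap is within `tol` eV
    of zero, and as 'semiconductor' otherwise."""
    return "metal" if band_gap(valence, conduction) <= tol else "semiconductor"


if __name__ == "__main__":
    # Hypothetical energies (eV, relative to the Fermi level).
    touching_valence = [-1.0, -0.5, 0.0]
    touching_conduction = [0.0, 0.4, 1.2]
    gapped_valence = [-1.1, -0.6, -0.3]
    gapped_conduction = [0.3, 0.7, 1.4]

    print(classify(touching_valence, touching_conduction))
    print(classify(gapped_valence, gapped_conduction))
```

A real analysis would work from the full computed band structure and density of states rather than a handful of sampled energies, but the decision itself reduces to whether a gap opens at the Fermi level.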
Discovery opens the door to new dimensions of antimatter

The strangest form of antimatter ever seen has been discovered by the STAR experiment at the Relativistic Heavy Ion Collider (RHIC) at DOE’s Brookhaven National Laboratory in New York.¹³ The new particle goes by the jawbreaking name “anti-hyper-triton.”

Translated, that means a nucleus of antihydrogen containing one antiproton and one antineutron, plus one heavy relative of the antineutron, an antilambda hyperon. It’s that massive antilambda that makes the newly discovered antinucleus “hyper.”

“STAR is the only experiment that could have found an antimatter hypernucleus,” says Nu Xu of Berkeley Lab’s Nuclear Science Division, the spokesperson for the STAR experiment. “We’ve been looking for them ever since RHIC began operations. The discovery opens the door on new dimensions of antimatter, which will help astrophysicists trace back the story of matter to the very first millionths of a second after the big bang.”

Project: STAR Detector Simulations and Data Analysis
PI: Grazyna Odyniec, Lawrence Berkeley National Laboratory
Senior Investigators: Nu Xu, LBNL; James Dunlop, Brookhaven National Laboratory; Olga Barannikova, University of Illinois, Chicago; Bernd Surrow, Massachusetts Institute of Technology
Funding: NP, HEP, NSF, Sloan Foundation, CNRS/IN2P3, STFC, EPSRC, FAPESP–CNPq, MESRF, NNSFC, CAS, MoST, MoE, GA, MSMT, FOM, NWO, DAE, DST, CSIR, PMSHE, KRF, MSES, RMST, ROSATOM
Computing Resources: NERSC, RCF, OSG

RHIC was designed to collide heavy nuclei like gold together at very high energies. The fireballs that blossom in the collisions are so hot and dense that the protons and neutrons in the nuclei are shattered and there is massive production of partons (Figure 26).

For a brief instant there exists a quark-gluon plasma that models the first moments of the early universe, in which quarks (and gluons, which strongly bind quarks together at lower temperatures) are suddenly free to move independently.
But as the quark-gluon plasma cools, its constituents recombine in a variety of ways.

The commonest quarks, which are also the lightest, are the ups and downs; they make up the protons and neutrons in the nuclei of all the familiar ordinary atoms around us. Heavier particles also condense out of the quark-gluon plasma, including the lambda, made of an up quark, a down quark, and a massive strange quark.

¹³ The STAR Collaboration, “Observation of an antimatter hypernucleus,” Science 328, 58 (2010).
Figure 26. In a single collision of gold nuclei at RHIC, many hundreds of particles are emitted, most created from the quantum vacuum via the conversion of energy into mass in accordance with Einstein’s famous equation E = mc². The particles leave telltale tracks in the STAR detector (shown here from the end and side). Scientists analyzed about a hundred million collisions to spot the new antinuclei, identified via their characteristic decay into a light isotope of antihelium and a positive pi-meson. Altogether, 70 examples of the new antinucleus were found. Image: STAR Collaboration.

The New Region of Strange Antimatter

The standard Periodic Table of Elements is arranged according to the number of protons, which determine each element’s chemical properties. Physicists use a more complex, three-dimensional chart to also convey information on the number of neutrons, which may change in different isotopes of the same element, and a quantum number known as “strangeness,” which depends on the presence of strange quarks (Figure 27). Nuclei containing one or more strange quarks are called hypernuclei.

For all ordinary matter, with no strange quarks, the strangeness value is zero and the chart is flat. Hypernuclei appear above the plane of the chart. The new discovery of strange antimatter with an antistrange quark (an antihypernucleus) marks the first entry below the plane.

“Ordinary” hypernuclei, in which a neutron or two are replaced by lambdas or other hyperons, have been made in particle accelerators since the 1950s, not long after the first lambda particles were identified in cosmic ray debris in 1947.

Quarks and antiquarks exist in equal numbers in the quark-gluon plasma, so the particles that condense from a cooling quark-gluon plasma come not only in ordinary-matter forms, but in their oppositely charged antimatter forms as well.

Antiprotons and antineutrons were discovered at Berkeley Lab’s Bevatron in the 1950s. A few antihydrogen atoms with a single antiproton for a nucleus have even been made, but no heavier forms of antihydrogen. A hydrogen nucleus with one proton and two neutrons is called a triton. The STAR discovery is the first time anyone has observed an antihypertriton.

What made the discovery possible was the central element in the STAR experiment, its large time projection chamber (TPC). The TPC concept was invented at Berkeley Lab by David Nygren; STAR’s TPC was built there and shipped to Brookhaven aboard a giant C-5A cargo plane.

“What a time projection chamber does is allow three-dimensional imaging of the tracks of every particle and its decay products in a single gold-gold collision,” Nu Xu explains. “The tracks are curved by a uniform magnetic field, which allows their original mass and energy to be determined by following them back to their origin. We can reconstruct the ‘topology of decay’ of a chosen event and identify every particle that it produces.”

So far the STAR collaboration has identified about 70 of the strange new antihypertritons, along with about 160 ordinary hypertritons, from 100 million collisions.
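The link between track curvature and momentum that Nu Xu alludes to can be stated compactly. What follows is the standard textbook relation for a charged particle moving in a uniform magnetic field, not a formula quoted from the STAR analysis. The symbols are introduced here for illustration: p_T is the momentum component perpendicular to the field, q = ze the particle’s charge, B the field strength, and r the radius of the curved track.

```latex
p_T = |q|\,B\,r
\qquad\Longrightarrow\qquad
p_T\,[\mathrm{GeV}/c] \;\approx\; 0.3\,|z|\,B\,[\mathrm{T}]\,r\,[\mathrm{m}]
```

As a purely illustrative case, assuming a field of 0.5 T and a track radius of 2 m, a singly charged particle such as a pion would carry p_T ≈ 0.3 GeV/c, while a doubly charged nucleus following a track of the same radius would carry about twice that. Curvature by itself fixes only momentum per unit charge, so identifying the particle species also depends on further measurements along the track.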
As more data is collected, the STAR researchers expect to find even heavier hypernuclei and antimatter hypernuclei, including alpha particles, which are the nuclei of helium.

Data Simulation, Storage, and Analysis

To identify this antihypertriton, physicists used supercomputers at NERSC and other research centers to painstakingly sift through the debris of some 100 million collisions. The team also used NERSC’s PDSF system to simulate detector response. These results allowed them to see that all of the charged particles within the collision debris left their mark by ionizing the gas inside RHIC’s time projection chamber, while the antihypertritons revealed themselves through a unique decay signature: the two tracks left by a charged pion and an antihelium-3 nucleus, the latter being heavy and so losing energy rapidly with distance in the gas.

“These simulations were vital to helping us optimize search conditions such as topology of the decay configuration,” says Zhangbu Xu, a physicist at Brookhaven who is one of the STAR collaborators. “By embedding imaginary antimatter in a real collision and optimizing the simulations for the best selection conditions, we were able to find a majority of those embedded particles.”

As a STAR Tier 1 site, the PDSF stores a portion of the raw data from the experiment and all of the analysis-ready data files. The STAR databases at Brookhaven are mirrored to PDSF to provide direct access for PDSF users. STAR currently maintains around 800 TB of data in NERSC’s HPSS archive.

Figure 27. The diagram above is known as the 3D chart of the nuclides. The familiar Periodic Table arranges the elements according to their atomic number, Z, which determines the chemical properties of each element. Physicists are also concerned with the N axis, which gives the number of neutrons in the nucleus. The third axis represents strangeness, S, which is zero for all naturally occurring matter, but could be non-zero in the core of collapsed stars. Antinuclei lie at negative Z and N in the above chart, and the newly discovered antinucleus (red) now extends the 3D chart into the new region of strange antimatter. Image: STAR Collaboration.

Great Expectations

The STAR detector will soon be upgraded, and as the prospects for determining precise ratios for the production of various species of strange hypernuclei and their antimatter counterparts increase, so does the expectation that STAR will be able to shed new light on fundamental questions of physics, astrophysics, and cosmology, including the composition of collapsed stars and the ratio of matter and antimatter in the early universe.

Why there is more matter than antimatter in the universe is one of the major unsolved problems in science. Solving it could tell us why human beings, indeed why anything at all, exists today.

“The strangeness value could be nonzero in the core of collapsed stars,” says Jinhui Chen of the STAR collaboration, a postdoctoral researcher at Kent State University and staff scientist at the Shanghai Institute of Applied Physics, “so the present measurements at RHIC will help us distinguish between models that describe these exotic states of matter.”

“Understanding precisely how and why there’s a predominance of matter over antimatter remains a major unsolved problem of physics,” says Zhangbu Xu. “A solution will require measurements of subtle deviations from perfect symmetry between matter and antimatter, and there are good prospects for future antimatter measurements at RHIC to address this key issue.”

Subtle deviations from perfect symmetry will help explain why there is more matter than antimatter.
A new database helps scientists discover how proteins do the work of life

All organisms from the largest mammals to the smallest microbes are composed of proteins: long, non-branching chains of amino acids. In the human body, proteins play both structural and functional roles, making up muscle tissue and organs, facilitating bone growth, and regulating hormones, in addition to various other tasks.

Scientists know that the key to a protein’s biological function depends on the unique, stable, and precisely ordered three-dimensional shape that it assumes just microseconds after the amino acid chain forms. Biologists call this the “native state,” and suspect that incorrectly folded proteins are responsible for illnesses like Alzheimer’s disease, cystic fibrosis, and numerous cancers.

Over the years, NMR spectroscopy and X-ray crystallography experiments have provided a lot of information about the native state structure of most protein folds, but the scientific rules that determine their functions and cause them to fold remain a mystery. Researchers expect that a better understanding of these rules will pave the way for breakthroughs in medical applications, drug design, and biophysics.

This is where Valerie Daggett, professor of Bioengineering at the University of Washington, Seattle, believes supercomputers can help. Over the last six years, Daggett used a total of 12.4 million processor hours at NERSC to perform molecular dynameomics simulations. This technique combines molecular dynamics (the study of molecular and atomic motions) with proteomics (the study of protein structure and function) to characterize the folding and unfolding pathways of proteins.
The culmination of this work has been incorporated into a public database, http://www.dynameomics.org/.

“We already have several examples of dynamic behavior linked to function that was discovered through molecular dynamic simulations and not at all evident in the static average experimental structures,” Daggett says in a June 2010 Nature Methods article.¹⁴

Project: Molecular Dynameomics
PI: Valerie Daggett, University of Washington, Seattle
Funding: BER, NIH, Microsoft
Computing Resources: NERSC

¹⁴ Allison Doerr, “A database of dynamics,” Nature Methods 7, 426 (2010).
Unraveling Proteins with Computer Simulations

Although the Protein Data Bank (PDB) of experimentally derived protein structures has been vital to many important scientific discoveries, these data do not tell the whole story of protein function. These structures are simply static snapshots of proteins in their native, or folded, state. In reality, proteins are constantly moving, and these motions are critical to their function. This is why Daggett’s team wanted to create a complementary database of molecular dynamics structures for representatives of all protein folds, including their unfolding pathways.

The Dynameomics Data Warehouse provides an organizing framework, a repository, and a variety of access interfaces for simulation and analysis data (Figure 28).

Figure 28. Overview of the generation and use of molecular dynameomics simulation data, organized through the Dynameomics Data Warehouse. The repository includes a Structural Library of Intrinsic Residue Propensities (SLIRP). Image: Daggett Group.

“With the simulation data stored in an easily queryable structured repository that can be linked to other sources of biological and experimental data, current scientific questions can be addressed in ways that were previously impossible or extremely cumbersome,” writes Marc van der Kamp, a postdoctoral researcher in Daggett’s group, along with a dozen co-authors in a paper featured on the cover of the April 14, 2010 issue of Structure (Figure 29).²

A protein will easily unfold, or denature, when enough heat is applied. This unfolding process follows the same transitions and intermediate structures as folding, except in reverse. To simulate this process computationally, Daggett’s team uses the “in lucem Molecular Dynamics” (ilmm) code, which was developed by David Beck, Darwin Alonso, and Daggett, to solve Newton’s equations of motion for every atom in the protein molecule and surrounding solvent. These simulations map the conformational changes of a protein with atomic-level detail, which can then be tested with experiments.

To determine the trajectories of protein unfolding, Daggett’s team simulates each protein at least six times: once to explore the molecular dynamics of the native state for at least 51 nanoseconds at a temperature of 298 K (77° F), and five more times to study the dynamics of the protein at 498 K (437° F), with two simulations of at least 51 ns and three short simulations of at least 2 ns each.

Daggett’s group applies a careful strategy to select diverse protein folds to analyze. Using three existing protein fold classification systems (SCOP, CATH, and Dali), they identify domains for proteins available in the PDB and cluster them into “metafolds.”³ The team has simulated at least one member from each protein metafold. Although experimental dynamic information is not available for most of their targets, Daggett notes that the simulations do compare well where experimental data are available.

Daggett began running these simulations at NERSC in 2005, when she received a Department of Energy INCITE allocation of 2 million processor hours on the facility’s IBM SP “Seaborg” system. In that first year, she modeled the folding pathways of 151 proteins in approximately 906 simulations. Subsequently her team received additional allocations from DOE’s Office of Biological and Environmental Research, as well as NERSC discretionary time, to continue their successful efforts. As of 2010, the group has completed over 11,000 individual molecular dynamics simulations on more than 2,000 proteins from 218 different organisms. Their complete database of simulations contains over 340 microseconds of native and unfolding simulations, stored as more than 10⁸ individual structures, which is four orders of magnitude larger than the PDB.

“This is a data-mining endeavor to identify similarities and differences between native and unfolded states of proteins across all secondary and tertiary structure types and sequences,” says Daggett. “It is necessary to successfully predict native states of proteins, in order to translate the current deluge of genomic information into a form appropriate for better functional identification of proteins and for drug design.”

Figure 29. The plane of structures on the bottom shows sample fold-representatives that were simulated. One example of a simulated protein, protein-tyrosine phosphatase 1B (PDB code 2HNP), is highlighted above. Cover image: Marc W. van der Kamp, Steven Rysavy, and Dennis Bromley. ©2010, Elsevier Ltd.

² M. W. van der Kamp, R. D. Schaeffer, A. L. Jonsson, A. D. Scouras, A. M. Simms, A. D. Toofanny, N. C. Benson, P. C. Anderson, E. D. Merkley, S. Rysavy, D. Bromley, D. A. C. Beck, and V. Daggett, “Dynameomics: A comprehensive database of protein dynamics,” Structure 18, 423 (2010).
³ R. Dustin Schaeffer, Amanda L. Jonsson, Andrew M. Simms, and Valerie Daggett, “Generation of a consensus protein domain dictionary,” Bioinformatics 27, 46 (2011).

Traditional drug design strategies tend to focus on native-state structure alone. But the Dynameomics Data Warehouse allows researchers to systematically search for transient conformations that would allow the binding of drugs or chemical chaperones.
These molecules could then stabilize the protein structure without interfering with function. This kind of information is readily available through simulation but not through standard experimental techniques.

The proteins in the Dynameomics Data Warehouse include mutant human proteins that contain single-nucleotide polymorphisms and are implicated in disease. In some cases, the computer simulations provided valuable insights into why certain mutations caused structural disruptions at the active site. The database also includes proteins from thermophilic organisms, which thrive in scorching environments. By comparing the native state and unfolding dynamics of thermophiles to their mesophilic counterparts, which thrive in moderate temperatures, researchers hope to discover the mechanisms that make thermophiles so stable at high temperatures.

“There are approximately 1,695 known non-redundant folds, of which 807 are self-contained autonomous folds, and we have simulated representatives for all 807,” says Daggett. “We couldn’t have come this far without access to DOE resources at NERSC, and with their continued support we will simulate more targets for well-populated metafolds to explore the sequence determinants of dynamics and folding.”

Dynameomics Data Suggests New Treatment Strategy for Amyloid Diseases

One of the most exciting applications of Dynameomics involves amyloids, the insoluble, fibrous protein deposits that play a role in Alzheimer’s disease, type 2 diabetes, Parkinson’s disease, heart disease, rheumatoid arthritis, and many other illnesses. The Daggett group has mined the simulations of amyloid-producing proteins and found a common structural feature in the intermediate stages between normal and toxic forms. Compounds with physical properties complementary to these structural features were designed on computers and synthesized. Experiments showed that these custom-designed compounds inhibit amyloid formation in three different protein systems, including those responsible for (1) Alzheimer’s disease, (2) Creutzfeldt-Jakob disease and mad cow disease, and (3) systemic amyloid disease and heart disease. The experimental compounds preferentially bind the toxic forms of these proteins. These results suggest it may be possible to design drugs that can treat as many as 25 amyloid diseases, to design diagnostic tools for amyloid diseases in humans, and to design tests to screen our blood and food supplies. A patent application for this technology is in preparation, and further experiments are being planned.
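The simulation protocol described earlier in this article (one native-state run and five unfolding runs per protein) lends itself to a quick consistency check. The sketch below is only illustrative arithmetic, not part of the Dynameomics software: the function and variable names are hypothetical, and the run lengths are the minimum durations quoted in the text. It tallies the runs and minimum simulated time per protein, the run count for the 151 first-year proteins, and the Fahrenheit equivalents of the two simulation temperatures.

```python
# Minimum run lengths per protein, in nanoseconds, as quoted in the text.
NATIVE_RUNS_NS = [51]                    # one native-state run at 298 K
UNFOLDING_RUNS_NS = [51, 51, 2, 2, 2]    # five denaturing runs at 498 K


def kelvin_to_fahrenheit(kelvin):
    """Convert a temperature from kelvin to degrees Fahrenheit."""
    return (kelvin - 273.15) * 9 / 5 + 32


def protocol_summary(n_proteins):
    """Tally runs and minimum simulated time for n_proteins targets."""
    runs_per_protein = len(NATIVE_RUNS_NS) + len(UNFOLDING_RUNS_NS)
    ns_per_protein = sum(NATIVE_RUNS_NS) + sum(UNFOLDING_RUNS_NS)
    return {
        "runs_per_protein": runs_per_protein,
        "min_ns_per_protein": ns_per_protein,
        "total_runs": n_proteins * runs_per_protein,
        "min_total_microseconds": n_proteins * ns_per_protein / 1000,
    }


# First-year campaign: 151 proteins.
summary = protocol_summary(151)
print(summary["runs_per_protein"], summary["total_runs"])
print(summary["min_ns_per_protein"], "ns minimum per protein")
print(round(kelvin_to_fahrenheit(298)), round(kelvin_to_fahrenheit(498)))
```

Six runs for each of 151 proteins gives the 906 simulations reported for the first year, and the rounded temperature conversions match the 77° F and 437° F quoted above.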
NISE Program Encourages Innovative Research

The NERSC Initiative for Scientific Exploration (NISE) program provides special allocations of computer time for exploring new areas of research, such as:

• A new research area not covered by the existing ERCAP proposal: this could be a tangential research project or a tightly coupled supplemental research initiative.
• New programming techniques that take advantage of multicore compute nodes by using OpenMP, Threads, UPC or CAF: this could include modifying existing codes, creating new applications, or testing the performance and scalability of multicore programming techniques.
• Developing new algorithms that increase researchers’ scientific productivity, for example by running simulations at a higher scale or by incorporating new physics.

Twenty-seven projects received NISE awards in 2010 (see sidebar). Although many NISE projects are exploratory, some have already achieved notable scientific results. A few of those are described briefly on the following pages.
Modeling the Energy Flow in the Ozone Recombination Reaction

The ozone-oxygen cycle is the process by which ozone (O₃) and oxygen (O₂) are continuously converted back and forth from one to the other in Earth’s stratosphere. This chemical process converts solar ultraviolet radiation into heat and greatly reduces the amount of harmful UV radiation that reaches the lower atmosphere. The NISE project “Modeling the Energy Flow in the Ozone Recombination Reaction,” led by Dmitri Babikov of Marquette University, aims to resolve the remaining mysteries of what determines the isotopic composition of atmospheric ozone and to investigate unusual isotope effects in other atmospheric species. These effects are poorly understood, mainly due to computational difficulties associated with the quantum dynamics treatment of polyatomic systems.

Using their NISE allocation, Babikov and his collaborator Mikhail V. Ivanov have developed and applied a mixed quantum-classical approach to the description of collisional energy transfer, in which the vibrational motion of an energized molecule is treated quantum mechanically using wave packets, while the collisional motion of the molecule plus quencher and the rotational motion of the molecule are treated using classical trajectories. For typical calculations, the number of classical trajectories needed is between 10,000 and 100,000, which can be propagated on different processors,

[Figure 30 plot: three panels (a, b, c) labeled with initial states #51, #53, and #54; vertical axis σE (a₀²/cm⁻¹), logarithmic from 10⁻¹² to 10²; horizontal axis E (cm⁻¹), from −1500 to 1000.]

Figure 30. Energy transfer functions expressed as the differential (over energy) cross sections. Three frames correspond to three initial states. Transition and stabilization cross sections are shown as black circles and red dots, respectively. Presence of the van der Waals states in the range −150
at the high intensities required for these applications. The project “First-Principles Modeling of Charged-Defect-Assisted Loss Mechanisms in Nitride Light Emitters,” led by Emmanouil Kioupakis of the University of California, Santa Barbara, is investigating why nitride light-emitting devices lose their efficiency at high intensities and suggesting ways to fix this problem.

[Figure 31 plot: four panels (a to d); vertical axis: absorption cross section (10⁻¹⁸ cm²); horizontal axis: photon wavelength (nm), from 400 to 650; curves: electrons ⊥ c, electrons || c, holes ⊥ c, holes || c.]

Figure 31. The phonon- and charged-defect-assisted absorption cross-section spectra by free electrons and holes of gallium nitride at room temperature are shown in (a) and (b), respectively. The direct absorption spectrum is plotted in (c) for holes and (d) for electrons. The four lines correspond to absorption by electrons or holes, for light polarized either parallel or perpendicular to the c axis. Image: E. Kioupakis et al., ref. 19.

Several mechanisms have been blamed for this efficiency reduction, including carrier leakage, recombination at dislocations, and Auger recombination, but the underlying microscopic mechanism remains an issue of intense debate. Using atomistic first-principles calculations, Kioupakis and his collaborators have shown that charged-defect scattering is not the major mechanism that contributes to the efficiency degradation. Instead, the efficiency droop is caused by indirect Auger recombination, mediated by electron-phonon coupling and alloy scattering (Figure 31).¹⁹, ²⁰, ²¹ Due to the participation of phonons in the Auger process, increasing temperature contributes to the performance degradation. Moreover, the indirect Auger coefficients increase as a function of indium mole fraction and contribute to the efficiency reduction for devices operating at longer wavelengths (the “green gap” problem).
By identifying the origin of the droop, these results provide a guide to addressing the efficiency issues in nitride LEDs and to engineering highly efficient solid-state lighting.

¹⁹ E. Kioupakis, P. Rinke, A. Schleife, F. Bechstedt, and C. G. Van de Walle, “Free-carrier absorption in nitrides from first principles,” Phys. Rev. B 81, 241201(R) (2010).
²⁰ E. Kioupakis, P. Rinke, and C. G. Van de Walle, “Determination of internal loss in nitride lasers from first principles,” Appl. Phys. Express 3, 082101 (2010).
²¹ E. Kioupakis, P. Rinke, K. T. Delaney, and C. G. Van de Walle, “Indirect Auger recombination as a cause of efficiency droop in nitride LEDs,” submitted (2011).

Electron Transport for Electronic, Spintronic, and Photovoltaic Devices

With the increasing ability to lithographically create atomic-scale devices, a full-scale quantum transport simulation is becoming necessary for efficiently designing nanoscale electronic, spintronic, and photovoltaic devices. At the nanoscale, material characteristics depart significantly from conventional behavior due to pronounced effects of geometry and extreme quantum confinement in various dimensions. On the other hand, a full quantum mechanical simulation of the transport phenomena in nanostructured materials, with minimum approximations, is extremely difficult computationally.

The NISE project “Electron Transport in the Presence of Lattice Vibrations for Electronic, Spintronic, and Photovoltaic Devices,” led by Sayeef Salahuddin of the University of California, Berkeley, aims to massively parallelize the NEST code for solving quantum transport problems, including spin, using the nonequilibrium Green’s function formalism. NEST can be used to simulate transport in a wide variety of nanostructures where quantum effects are of significant importance, for example, nanoscale transistors and memory devices, solar cells, and molecular electronics.

Perfect scaling up to 8,192 processor cores has been achieved. This massive parallelization has enabled the researchers to solve problems that had been deemed intractable before. This resulted in multiple publications in leading journals, including a cover story in the July 19, 2010 issue of Applied Physics Letters (Figure 32).²²

Figure 32. The illustration shows the “local density of states” (the quantum mechanically viable states) in the carbon heterojunction transistor under a gate bias when the transistor is turning ON. The black regions denote the band gap. The yellow regions show where Klein tunneling is taking place, thereby creating the unique situation that increases energy efficiency in this transistor beyond classical limits. The inset shows the structure of the device and current-voltage characteristics. Image: ©2011, American Institute of Physics.

²² Youngki Yoon and Sayeef Salahuddin, “Barrier-free tunneling in a carbon heterojunction transistor,” Applied Physics Letters 97, 033102 (2010).

The starting point for that study was an experimental demonstration that a graphene nanoribbon (GNR) can be obtained by unzipping a carbon nanotube (CNT). This makes it possible to fabricate all-carbon heterostructures that have a unique interface between a CNT and a GNR. By performing a self-consistent nonequilibrium Green’s function-based calculation on an atomistically defined structure, Salahuddin and collaborator Youngki Yoon demonstrated that such a heterojunction may be utilized to obtain a unique transistor operation. They showed that such a transistor may reduce energy dissipation below the classical limit while not compromising speed, thus providing an alternative route toward ultralow-power, high-performance carbon-heterostructure electronics.
This means a significantly lower need for electrical power for plugged-in electronics (currently estimated to be 3% of total national electricity needs), as well as a significant increase in battery life for all portable devices.

NERSC Initiative for Scientific Exploration (NISE) 2010 Awards

Project: Modeling the Energy Flow in the Ozone Recombination Reaction
PI: Dmitri Babikov, Marquette University
NISE award: 260,000 hours
Research areas/applications: Atmospheric chemistry, chemical physics, climate science, geochemistry

Project: Bridging the Gaps Between Fluid and Kinetic Magnetic Reconnection Simulations in Large Systems
PI: Amitava Bhattacharjee, University of New Hampshire
NISE award: 1,000,000 hours
Research areas/applications: Plasma physics, astrophysics, fusion energy

Project: Decadal Predictability in CCSM4
PIs: Grant Branstator and Haiyan Teng, National Center for Atmospheric Research
NISE award: 1,600,000 hours
Research areas/applications: Climate science

Project: 3-D Radiation/Hydrodynamic Modeling on Parallel Architectures
PI: Adam Burrows, Princeton University
NISE award: 5,000,000 hours
Research areas/applications: Astrophysics

Project: Dependence of Secondary Islands in Magnetic Reconnection on Dissipation Model
PI: Paul Cassak, West Virginia University
NISE award: 200,000 hours
Research areas/applications: Plasma physics, astrophysics, fusion energy

Project: Simulation of Elastic Properties and Deformation Behavior of Lightweight Protection Materials under High Pressure
PI: Wai-Yim Ching, University of Missouri, Kansas City
NISE award: 1,725,000 hours
Research areas/applications: Materials science, protective materials

Project: Molecular Mechanisms of the Enzymatic Decomposition of Cellulose Microfibrils
PI: Jhih-Wei Chu, University of California, Berkeley
NISE award: 2,000,000 hours
Research areas/applications: Chemistry, biofuels

Project: Developing an Ocean-Atmosphere Reanalysis for Climate Applications (OARCA) for the Period 1850 to Present
PI: Gil Compo, University of Colorado, Boulder
NISE award: 2,000,000 hours
Research areas/applications: Climate science

Project: Surface Input Reanalysis for Climate Applications 1850–2011
PI: Gil Compo, University of Colorado at Boulder
NISE award: 1,000,000 hours
Research areas/applications: Climate science
Project: Pushing the Limits of the GW/BSE Method for Excited-State Properties of Molecules and Nanostructures
PI: Jack Deslippe, University of California, Berkeley
NISE award: 1,000,000 hours
Research areas/applications: Materials science, protective materials

Project: Thermodynamic, Transport, and Structural Analysis of Geosilicates using the SIESTA First Principles Molecular Dynamics Software
PI: Ben Martin, University of California, Santa Barbara
NISE award: 110,000 hours
Research areas/applications: Geosciences

Project: High Resolution Climate Simulations with CCSM
PIs: Warren Washington, Jerry Meehl, and Tom Bettge, National Center for Atmospheric Research
NISE award: 3,525,000 hours
Research areas/applications: Climate science

Project: Reaction Pathways in Methanol Steam Reformation
PI: Hua Guo, University of New Mexico
NISE award: 200,000 hours
Research areas/applications: Chemistry, hydrogen fuel

Project: Warm Dense Matter Simulations Using the ALE-AMR Code
PI: Enrique Henestroza, Lawrence Berkeley National Laboratory
NISE award: 450,000 hours
Research areas/applications: Plasma physics, fusion energy

Project: Modeling the Dynamics of Catalysts with the Adaptive Kinetic Monte Carlo Method
PI: Graeme Henkelman, University of Texas, Austin
NISE award: 1,700,000 hours
Research areas/applications: Chemistry, alternative energy

Project: First-Principles Study of the Role of Native Defects in the Kinetics of Hydrogen Storage Materials
PI: Khang Hoang, University of California, Santa Barbara
NISE award: 600,000 hours
Research areas/applications: Materials science, hydrogen fuel

Project: ITER Rapid Shutdown Simulation
PI: Valerie Izzo, General Atomics
NISE award: 1,200,000 hours
Research areas/applications: Fusion energy

Project: First-Principles Modeling of Charged-Defect-Assisted Loss Mechanisms in Nitride Light Emitters
PI: Emmanouil Kioupakis, University of California, Santa Barbara
NISE award: 900,000 hours
Research areas/applications: Materials science, energy-efficient lighting

Project: Computational Prediction of Protein-Protein Interactions
PI: Harley McAdams, Stanford University
NISE award: 1,880,000 hours
Research areas/applications: Biological systems science

Project: Light Propagation in Nanophotonics Structures Designed for High-Brightness Photocathodes
PIs: Karoly Nemeth and Katherine Harkay, Argonne National Laboratory
NISE award: 1,300,000 hours
Research areas/applications: Materials science, high-brightness electron sources

Project: International Linear Collider (ILC) Damping Ring Beam Instabilities
PI: Mauro Pivi, Stanford Linear Accelerator Center
NISE award: 250,000 hours
Research areas/applications: Accelerator physics

Project: Thermonuclear Explosions from White Dwarf Mergers
PI: Tomasz Plewa, Florida State University
NISE award: 500,000 hours
Research areas/applications: Astrophysics

Project: Beam Delivery System Optimization of Next-Generation X-Ray Light Sources
PIs: Ji Qiang, Robert Ryne, and Xiaoye Li, Lawrence Berkeley National Laboratory
NISE award: 1,000,000 hours
Research areas/applications: Accelerator physics

Project: Electron Transport in the Presence of Lattice Vibrations for Electronic, Spintronic, and Photovoltaic Devices
PI: Sayeef Salahuddin, University of California, Berkeley
NISE award: 600,000 hours
Research areas/applications: Materials science, ultra-low-power computing

Project: Abrupt Climate Change and the Atlantic Meridional Overturning Circulation
PI: Peter Winsor, University of Alaska, Fairbanks
NISE award: 2,000,000 hours
Research areas/applications: Climate science

Project: Modeling Plasma Surface Interactions for Materials under Extreme Environments
PI: Brian Wirth, University of California, Berkeley
NISE award: 500,000 hours
Research areas/applications: Fusion energy

Project: Models for the Explosion of Type Ia Supernovae
PI: Stan Woosley, University of California, Santa Cruz
NISE award: 2,500,000 hours
Research areas/applications: Astrophysics
NERSC Users’ Awards and Honors

Every year a significant number of NERSC users are honored for their scientific discoveries and achievements. Listed below are some of the most prominent awards given to NERSC users in 2010.

National Medal of Science
• Warren Washington, National Center for Atmospheric Research

Fellows of the American Academy of Arts and Sciences
• Andrea Louise Bertozzi, University of California, Los Angeles
• Adam Seth Burrows, Princeton University
• Gary A. Glatzmaier, University of California, Santa Cruz

Member of the National Academy of Sciences
• Gary A. Glatzmaier, University of California, Santa Cruz

Fellows of the American Association for the Advancement of Science
• Liem Dang, Pacific Northwest National Laboratory
• Alenka Luzar, Virginia Commonwealth University
• Gregory K. Schenter, Pacific Northwest National Laboratory
• Yousef Saad, University of Minnesota
• Chris G. Van de Walle, University of California, Santa Barbara

Fellows of the American Physical Society (APS)
• Mark Asta, University of California, Davis
• David Bruhwiler, Tech-X Corporation
• Liem Dang, Pacific Northwest National Laboratory
• Francois Gygi, University of California, Davis
• Julius Jellinek, Argonne National Laboratory
• En Ma, Johns Hopkins University
• Barrett Rogers, Dartmouth College
• Philip Snyder, General Atomics
• Ramona Vogt, Lawrence Livermore National Laboratory
• Fuqiang Wang, Purdue University
• Martin White, University of California, Berkeley and Lawrence Berkeley National Laboratory
• Zhangbu Xu, Brookhaven National Laboratory

APS James Clerk Maxwell Prize for Plasma Physics
• James Drake, University of Maryland

APS John Dawson Award for Excellence in Plasma Physics Research
• Eric Esarey, Lawrence Berkeley National Laboratory
• Cameron Geddes, Lawrence Berkeley National Laboratory
• Carl Schroeder, Lawrence Berkeley National Laboratory

American Chemical Society (ACS) Ahmed Zewail Prize in Molecular Sciences
• William H. Miller, University of California, Berkeley and Lawrence Berkeley National Laboratory

Peter Debye Award in Physical Chemistry
• George C. Schatz, Northwestern University

Fellow of the Materials Research Society
• Chris Van de Walle, University of California, Santa Barbara

IEEE Sidney Fernbach Award
• James Demmel, University of California, Berkeley and Lawrence Berkeley National Laboratory

Fellows of the Society for Industrial and Applied Mathematics (SIAM)
• Andrea Louise Bertozzi, University of California, Los Angeles
• Yousef Saad, University of Minnesota

SIAM Junior Scientist Prize
• Kamesh Madduri, Lawrence Berkeley National Laboratory

International Council for Industrial and Applied Mathematics (ICIAM) Pioneer Prize
• James Demmel, University of California, Berkeley and Lawrence Berkeley National Laboratory

Academician of Academia Sinica
• Inez Yau-Sheung Fung, University of California, Berkeley and Lawrence Berkeley National Laboratory

Presidential Early Career Award for Scientists and Engineers (PECASE)
• Cecilia R. Aragon, Lawrence Berkeley National Laboratory
• Joshua A. Breslau, Princeton Plasma Physics Laboratory

U.S. Air Force Young Investigator Research Program
• Per-Olof Persson, University of California, Berkeley and Lawrence Berkeley National Laboratory

Gordon Bell Prize
• Aparna Chandramowlishwaran, Lawrence Berkeley National Laboratory
• Logan Moon, Georgia Institute of Technology

Association for Computing Machinery (ACM) Distinguished Scientists
• Wu-chun Feng, Virginia Polytechnic Institute and State University

NASA Public Service Group Achievement Award
• Julian Borrill, Lawrence Berkeley National Laboratory
• Christopher Cantalupo, Lawrence Berkeley National Laboratory
• Theodore Kisner, Lawrence Berkeley National Laboratory
• Kesheng (John) Wu, Lawrence Berkeley National Laboratory
The NERSC Center

NERSC is the primary computing center for the DOE Office of Science, serving approximately 4,000 users, hosting 400 projects, and running 700 codes for a wide variety of scientific disciplines. NERSC has a tradition of providing systems and services that maximize the scientific productivity of its user community. NERSC takes pride in its reputation for the expertise of its staff and the high quality of services delivered to its users. To maintain its effectiveness, NERSC proactively addresses new challenges in partnership with the larger high performance computing (HPC) community.

The year 2010 was a transformational year for NERSC, with an unprecedented number of new systems and upgrades, including the acceptance of Hopper Phase 1; the installation of Hopper Phase 2, the Carver and Magellan clusters, the Euclid analytics server, and the Dirac GPU testbed; upgrades of the NERSC Global Filesystem and the tape archive; and a major upgrade to the Oakland Scientific Facility power supply. We also made significant improvements in the Center’s energy efficiency.

At the same time, NERSC staff were developing innovative ways to improve the usability of HPC systems, to increase user productivity, and to explore new computational models that can make the most effective use of emerging computer architectures.

The following pages describe some of the ways NERSC is meeting current challenges and preparing for the future.
Hopper Powers Petascale Science

The installation of NERSC’s newest supercomputer, Hopper (Figure 1), marks a milestone: it is NERSC’s first petascale computer, posting a performance of 1.05 petaflops (quadrillions of calculations per second) running the Linpack benchmark. That makes Hopper, a 153,216-processor-core Cray XE6 system, the fifth most powerful supercomputer in the world and the second most powerful in the United States, according to the November 2010 edition of the TOP500 list, the definitive ranking of the world’s top computers.

NERSC selected the Cray system in an open competition, in which vendors ran NERSC application benchmarks from several scientific domains that use a variety of algorithmic methods. The Cray system showed the best application performance per dollar and per megawatt of power. It features external login nodes and an external filesystem for increased functionality and availability.

Hopper was installed in two phases. The first phase, a 5,344-core Cray XT5, was installed in October 2009 and entered production use in March 2010. Phase 1 helped NERSC staff optimize the external node architecture. “Working out the kinks in Phase 1 will ensure a more risk-free deployment when Phase 2 arrives,” said Jonathan Carter, who led the Hopper system procurement as head of NERSC’s User Services Group. “Before accepting the Phase 1 Hopper system, we encouraged all 300 science projects computing at NERSC to use the system during the pre-production period to see whether it could withstand the gamut of scientific demands that we typically see.”

The new external login nodes on Hopper offer users a more responsive environment. Compared to NERSC’s Cray XT4 system, called Franklin, the external login nodes have more memory, and in aggregate, have more computing power. This allows users to compile applications faster and run small post-processing or visualization jobs directly on the login nodes without interference from other users.

Even with dedicated testing and maintenance times, utilization of Hopper-1 from December 15 to March 1 reached 90%, and significant scientific results were produced even during the testing period, including high-resolution simulations of magnetic reconnection, and simulations of peptide aggregation into amyloid fibrils.

Hopper Phase 2, the Cray XE6, was fully installed in September 2010 and opened for early use and testing late in the year. It has 6,384 nodes, with two 12-core AMD “Magny-Cours” chips per node, and features Cray’s high-bandwidth, low-latency Gemini interconnect. With 168 GB/sec routing capacity and internode latency on the order of 1 microsecond, Gemini supports millions of MPI messages per second, as well as OpenMP, Shmem, UPC, and Co-Array Fortran. The innovative ECOphlex liquid cooling technology increases Hopper’s energy efficiency.

Because Hopper has 2 PB of disk space and 70 GB/sec of bandwidth on the external filesystem, users with extreme data demands are seeing fewer bottlenecks when they move their data in and out of the machine. Additionally, the availability of dynamically loaded libraries enables even more applications to run on the system and adds support for popular frameworks like Python. This feature helps ensure that the system is optimized for scientific productivity.
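The Hopper Phase 2 figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the per-core rate is simply the Linpack result divided by the core count (not a vendor specification), and the filesystem estimate assumes decimal units (1 PB = 1,000,000 GB) and sustained peak bandwidth, which real workloads would rarely achieve.

```python
# Figures taken from the text.
HOPPER_NODES = 6_384
CHIPS_PER_NODE = 2
CORES_PER_CHIP = 12
LINPACK_PETAFLOPS = 1.05
DISK_PETABYTES = 2
FILESYSTEM_GB_PER_SEC = 70


def total_cores(nodes, chips_per_node, cores_per_chip):
    """Total processor cores in the system."""
    return nodes * chips_per_node * cores_per_chip


def linpack_gflops_per_core(petaflops, cores):
    """Average Linpack rate per core, in gigaflops (1 petaflop = 1e6 gigaflops)."""
    return petaflops * 1e6 / cores


def hours_to_stream(petabytes, gb_per_sec):
    """Hours to move a data volume at a sustained rate (assumes 1 PB = 1e6 GB)."""
    return petabytes * 1e6 / gb_per_sec / 3600


cores = total_cores(HOPPER_NODES, CHIPS_PER_NODE, CORES_PER_CHIP)
print(cores)  # 153216
print(round(linpack_gflops_per_core(LINPACK_PETAFLOPS, cores), 2))
print(round(hours_to_stream(DISK_PETABYTES, FILESYSTEM_GB_PER_SEC), 1))
```

The node, chip, and core counts multiply out to the 153,216 cores cited for the TOP500 entry. Under the stated assumptions, streaming the full 2 PB through the external filesystem would take roughly eight hours.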
Figure 1. NERSC’s newest supercomputer is named after American computer scientist Grace Murray Hopper, a pioneer in the field of software development and programming languages who created the first compiler. She was a champion for increasing the usability of computers, understanding that their power and reach would be limited unless they were made more user-friendly. Cabinet façade design by Caitlin Youngquist, Berkeley Lab Creative Services Office. Grace Hopper photo courtesy of the Hagley Museum and Library, Wilmington, Delaware.

Other New Systems and Upgrades

Carver and Magellan Clusters

The Carver and Magellan clusters (Figure 2), built on liquid-cooled IBM iDataPlex technology, were installed in March 2010. Carver, named in honor of American scientist and inventor George Washington Carver, entered production use in May, replacing NERSC’s Opteron Jacquard cluster and IBM Power5 Bassi system, which were both decommissioned at the end of April.

Figure 2. Top view of the Carver and Magellan clusters showing trays of InfiniBand cables. Photo: Roy Kaltschmidt, Berkeley Lab Public Affairs.

Carver contains 800 Intel Nehalem quad-core processors, or 3,200 cores. With a theoretical peak performance of 34 teraflops, Carver has 3.5 times the capacity of Jacquard and Bassi combined, yet uses less space and power. The system’s 400 compute nodes are interconnected by the latest 4X QDR InfiniBand technology, meaning that 32 Gb/s of point-to-point bandwidth is available for high performance message passing and I/O. Carver is one of the first platforms of this scale to employ four Voltaire switches, allowing information to quickly transfer between the machine and its external filesystem.

NERSC staff have configured Carver’s batch queue and scratch storage to handle a range of large and small jobs, making Carver NERSC’s most popular system. A queue that allows use of up to 32 processors now has the option to run for 168 hours, or seven straight days.

Early scientific successes on Carver included calculations of the atomic interactions of titanium, other transition metals, and their alloys; and computational modeling of data from x-ray crystallography and small-angle x-ray scattering that reveals how protein atoms interact inside of a cell.

The Magellan cluster, with hardware identical to Carver’s but different software, is a testbed system and is discussed below on page 68.

Euclid Analytics Server and Dirac GPU Testbed

The Euclid analytics server and Dirac GPU testbed were installed in May 2010. Euclid, named in honor of the ancient Greek mathematician, is a Sun Microsystems Sunfire x4640 SMP. Its single node contains eight six-core Opteron 2.6 GHz processors, with all 48 cores sharing the same 512 GB of memory. The system’s theoretical peak performance is 499.2 Gflop/s. Euclid supports a wide variety of data analysis and visualization applications.

Dirac, a 50-node hybrid CPU/GPU cluster with Intel Nehalem and NVIDIA Fermi chips, is an experimental system dedicated to exploring the performance of scientific applications on a GPU (graphics processing unit) architecture. Research conducted on Dirac is discussed on page 71.
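Theoretical peak figures like the ones quoted for Euclid and Carver are usually the product of core count, clock rate, and floating-point operations per cycle. The sketch below reproduces Euclid’s stated peak under one assumption that is not in the text: that each Opteron core completes 4 double-precision floating-point operations per clock cycle. The function name and constants are hypothetical, and the Carver lines only derive per-node and per-core figures from the numbers given above.

```python
def peak_gflops(sockets, cores_per_socket, clock_ghz, flops_per_cycle):
    """Theoretical peak in Gflop/s: cores x clock (GHz) x flops per cycle."""
    return sockets * cores_per_socket * clock_ghz * flops_per_cycle


# ASSUMPTION (not stated in the text): 4 double-precision flops per cycle per core.
ASSUMED_FLOPS_PER_CYCLE = 4

# Euclid: eight six-core 2.6 GHz Opteron processors (from the text).
euclid_peak = peak_gflops(8, 6, 2.6, ASSUMED_FLOPS_PER_CYCLE)
print(round(euclid_peak, 1))

# Carver: 800 quad-core processors, 400 compute nodes, 34 teraflops peak (from the text).
carver_cores = 800 * 4
carver_cores_per_node = carver_cores // 400
carver_gflops_per_core = 34_000 / carver_cores
print(carver_cores, carver_cores_per_node, carver_gflops_per_core)
```

With that assumption the Euclid result agrees with the 499.2 Gflop/s stated above. For Carver, the quoted numbers imply 8 cores per node and about 10.6 Gflop/s of peak per core.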
NERSC Global Filesystem and HPSS Tape Archive Upgrades

“Historically, each new computing platform came with its own distinct storage hardware and software, creating isolated storage ‘islands,’” wrote NERSC Storage Systems Group lead Jason Hick in HPC Source, a supplement to Scientific Computing magazine. “Users typically had different filesystem quotas on each machine and were responsible for keeping track of their data, from determining where the data was to transferring files between machines.” 1

In recent years, however, NERSC began moving from this “island” model toward global storage resources. The center was among the first to implement a shared filesystem, called the NERSC Global Filesystem (NGF), which can be accessed from any of NERSC’s major computing platforms. This service facilitates file sharing between platforms, as well as file sharing among NERSC users working on a common project. This year, to allow users to access nearly all their data regardless of the system they log into, NGF was expanded from project directories to include scratch and home directories as well. (Project directories provide a large-capacity file storage resource for groups of users; home directories are used for permanent storage of source code and other relatively small files; and scratch directories are used for temporary storage of large files.) With these tools, users spend less time moving data between systems and more time using it.

Another aspect of NERSC’s storage strategy is a multi-year effort to improve the efficiency of the High Performance Storage System (HPSS) tape archive by upgrading to higher-capacity media. In just the past year, the number of tapes has been reduced from 43,000 to 28,000 (Figure 3), while still providing capacity to grow by 60% per year to keep pace with the continuing exponential growth in data (Figure 4). Disk cache capacity and bandwidth have been doubled, and NERSC has invested in HPSS client software improvements to accommodate future increases.

Figure 3. Number of HPSS tapes, July 2009 to August 2010. [Plot: total number of tapes by month, broken down by media type (9840A, 9940B, T10KA, T10KB, 9840D).]

Figure 4. Cumulative storage by month and system, January 2000 to August 2010, in terabytes. [Plot: terabytes by month and year; series labeled Archive and Hpss/Regent.]

1 Jason Hick, “Big-picture storage approach,” HPC Source, Summer 2010, pp. 5–6.
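The tape and growth figures quoted above for the HPSS archive lend themselves to a quick worked calculation. The following is a minimal sketch using only the numbers in the text (43,000 and 28,000 tapes, 60% growth per year); the helper name `projected_volume` is hypothetical, and the projection assumes the growth rate stays constant, which the report does not claim.

```python
import math

tapes_before, tapes_after = 43_000, 28_000   # tape counts from the text
annual_growth = 0.60                         # 60% per year, from the text

# Fraction of tapes eliminated by moving to higher-capacity media.
tape_reduction = (tapes_before - tapes_after) / tapes_before

# Time for stored data to double at a constant 60% annual growth rate.
doubling_time_years = math.log(2) / math.log(1 + annual_growth)


def projected_volume(volume_now, years, growth=annual_growth):
    """Data volume after `years` of compound growth (same units as volume_now)."""
    return volume_now * (1 + growth) ** years


print(f"Tape count reduced by {tape_reduction:.1%}")
print(f"Data doubles roughly every {doubling_time_years:.2f} years")
```

At this rate the archive would hold about 2.56 times today's data after two years, which is why media density has to keep improving for the tape count to fall.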
Innovations to Increase NERSC’s Energy Efficiency

In July 2010, NERSC scheduled a four-day, center-wide outage for installation of a 3 megawatt power supply upgrade to the computer room. The total 9 megawatt capacity was needed to accommodate Hopper Phase 2 along with existing and future systems.

But perhaps more important than the increased power supply were NERSC’s innovative solutions to improve energy efficiency, which has become a major issue for high performance computing and data centers. The energy consumed by data center servers and related infrastructure equipment in the United States and worldwide doubled between 2000 and 2005 and is continuing to grow. A 2007 Environmental Protection Agency report to Congress estimated that in 2006, data centers accounted for about 1.5 percent of total U.S. electricity consumption (equivalent to 5.8 million average households) and that federal servers and data centers accounted for 10 percent of that figure.

Cost and carbon footprint are critical concerns, but performance is suffering, too. In an interview in HPC Source, NERSC Director Kathy Yelick said, “The heat, even at the chip level, is limiting processor performance. So, even from the innards of a computer system, we are worried about energy efficiency and getting the most computing with the least energy.” 2

Two innovative projects to increase energy efficiency at NERSC are attracting attention throughout the HPC community: the innovative configuration and cooling system for the Carver cluster, and the extensive energy monitoring system now operating in the NERSC machine room.

Carver Installation Saves Energy, Space, Water

NERSC staff configured the IBM iDataplex system Carver in a manner so efficient that, in some places, the cluster actually cools the air around it. By setting row pitch at 5 ft, rather than the standard 6 ft, and reusing the 65° F water that comes out of the Franklin Cray XT4 to cool the system, the team was able to reduce cooling costs by 50 percent and to use 30 percent less floor space.
NERSC collaborated with vendor Vette Corporation to custom-design a cooling distribution unit (CDU) for the installation.

By narrowing row pitch to 5 ft from cabinet-front to cabinet-front (a gap of only 30 inches between rows), the Carver team was able to use only one “cold row” to feed the entire installation. (A standard installation would require cooling vents along every other row.) At such a narrow pitch, the IBM iDataplex’s transverse cooling architecture recirculates cool air from the previous row while the air is still at a low temperature. When the air passes through Carver’s water-cooled back doors (the cool water coming from Franklin), it is further cooled and passed to the next row. When the air finally exits the last row of Carver, it can be several degrees cooler than when it entered.

Using a design concept from NERSC, Vette Corporation custom-designed a CDU that reroutes used water coming out of Franklin back into Carver. Water from the building’s chillers enters Franklin’s cooling system at 52° F. The water exits the system at about 65° F. Sending the water back to be chilled at such low temperatures is inefficient. Instead, the custom-designed CDU increases efficiency by rerouting the water through Carver to pick up more heat, sending water back to the chillers at 72° F and enabling one CDU to cool 250 kW of computing equipment.

A case study published by IBM, titled “NERSC creates an ultra-efficient supercomputer,” observed:

“The sheer size of the rear doors, which measure four feet by seven feet, means that the air can move more slowly through them and has more time to exchange heat with the water. This in turn enables slower fans, dramatically reducing the energy wasted as noise, and making the Carver cluster the quietest place in the machine room.”

NERSC has shared these innovations with the wider HPC community and with IBM, which now uses NERSC techniques in new installations; and Vette is working to turn this custom design into a commercial product.

Monitoring Energy Across the Center to Improve Efficiency

NERSC has instrumented its machine room with state-of-the-art wireless monitoring technology from SynapSense to study and optimize energy use. The center has installed 992 sensors, which gather information on variables important to machine room operation, including air temperature, pressure, and humidity. This data is collected by NERSC’s SynapSense system software, which generates heat, pressure, and humidity maps (Figure 5).

2 Mike May, “Sensing the future of greener data centers,” HPC Source, Autumn 2010, pp. 6-8.

“Just a glance at one of these maps tells you a lot,” says Jim Craw, NERSC’s newly appointed Risks and Energy Manager. “If you start developing hot or cold spots, you’re getting an unambiguous visual cue that you need to investigate and/or take corrective action.”

The system can also calculate the center’s power usage effectiveness, an important metric that compares the total energy delivered to a data center with that consumed by its computational equipment.

Responding quickly to accurate machine room data not only enhances cooling efficiency, it helps assure the reliability of center systems. For example, after cabinets of the decommissioned Bassi system were shut down and removed, cold air pockets developed near Franklin’s air handling units. The SynapSense system alerted administrators, who were able to quickly adjust fans and chillers in response to these changes in airflow and room temperatures before Franklin was negatively affected.
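The power usage effectiveness (PUE) metric mentioned above is a ratio: total energy delivered to the facility divided by the energy consumed by the computing equipment itself, so a value of 1.0 would mean no overhead for cooling or power distribution. The sketch below shows the calculation; the function name and the sample readings are hypothetical and are not NERSC measurements.

```python
def power_usage_effectiveness(total_facility_kwh, it_equipment_kwh):
    """PUE = total facility energy / energy used by computing equipment."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    if total_facility_kwh < it_equipment_kwh:
        raise ValueError("total facility energy cannot be less than IT energy")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical readings for one interval (illustrative values only).
sample_pue = power_usage_effectiveness(total_facility_kwh=1300.0,
                                       it_equipment_kwh=1000.0)
print(f"PUE = {sample_pue:.2f}")
```

In this made-up example, 30 percent of the energy delivered beyond what the computers use goes to cooling and other infrastructure; lowering that overhead moves the ratio toward 1.0.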
The monitoring system also revealed changes in machine room airflow following the power upgrade. Repositioning some floor tiles and partitions corrected the airflow and rebalanced air temperatures.

In addition to the SynapSense sensors, NERSC’s two Cray systems have their own internal sensors; and sensor readings from two different chiller plants and air handlers come in on their own separate systems as well. NERSC staff created several applications to bring the Cray and SynapSense data together, and work continues on integrating the others.

Future energy efficiency projects include controlling water flow and fan speeds within the air handling units based on heat load, and placing curtains or “top hats” around systems to optimize the air flows in their vicinity.

Figure 5. NERSC’s energy monitoring system generates heat maps that can be used to look for areas where efficiency can be improved.

Improving HPC System Usability with External Services

Feedback from NERSC’s Franklin users has led NERSC and Cray to jointly develop configuration improvements for both Franklin and Hopper that improve their usability and availability. These improvements, called external services, are now available on all Cray XE6 systems, and include external login nodes, reconfigurable head nodes (also called machine-oriented miniserver, or MOM nodes), and an external global filesystem.

While NERSC users have made great use of the Franklin Cray XT4 system, the large number and variety of users, projects, applications, and workflows running at NERSC exposed several issues tied to the fact that almost all user-level serial processing (e.g., compiling, running analysis, pre- or post-processing, file transfer to or from other systems) is confined to a small number of shared service nodes. These nodes have limited memory (8 GB), no swap disk, and older processor technology. NERSC user applications had often run the Franklin service nodes out of memory, primarily due to long and involved application compilation, analytics applications, and file transfers.

In the Franklin configuration, it was hard for NERSC to increase the number of login nodes as demand for this service increased. During the contract negotiation for the Hopper system, NERSC staff worked with Cray to design, configure, and support eight external login nodes, located outside of the Hopper Gemini interconnect; they have more memory (128 GB), more cores (16), and more disk than the original Franklin login nodes. One external login node was also added to Franklin itself.

Head nodes are the nodes that launch parallel jobs and run the serial parts of these jobs. Initially on Franklin this function ran on the login nodes, and large interactive jobs could cause congestion and batch launch failures. On Franklin the solution was to divide the servers into a login pool and a head node pool.
On Hopper, the hardware for the head nodes is the same as the hardware for the compute nodes, allowing compute nodes to be reconfigured into head nodes and vice versa at boot time. NERSC has initially deployed a higher ratio of head nodes to compute nodes on Hopper than on Franklin. This configuration provides better support for complex workflows. As a result, NERSC’s “power login” users now have the capabilities they need, and they are no longer disruptive to other users.

In addition to external login nodes, login node balancers have improved interactive access to NERSC systems. Assigning login nodes to incoming interactive sessions in the traditional round-robin fashion can lead to congestion, or worse, the perception that an entire system is down when the user lands on a bad login node or one that has been taken offline. Load balancers now intercept incoming sessions and distribute them to the least-loaded active node. Also, staff can now take down nodes at will for maintenance, and the load balancer automatically removes out-of-service nodes from the rotation and redirects users attempting to connect to a cached (but offline) node to an active one. As a result, users no longer confuse a downed node with a downed system.

While load balancers are commonly used for web and database applications (where sessions tend to be extremely short and the amount of data transferred quite small), these devices have not typically been used in a scientific computing context (where connections are long-lived and data transfers large). To take advantage of this technology, NERSC staff worked closely with vendor Brocade. Together they conceived, adapted, and implemented three improvements to support HPC environments: (1) support for 10 GB/s connections, needed for HPC I/O; (2) support for jumbo frames, needed for large data sets; and (3) support for long-lasting login sessions. Brocade has integrated these innovations into their product line, making them available to other HPC centers.

Availability of files has also been improved.
When Franklin was first acquired, it could not be integrated into the NERSC Global Filesystem (NGF), which uses GPFS software, because Franklin’s I/O nodes only ran the Lustre filesystem software. As a result, files on Franklin were not accessible from other NERSC systems. So NERSC worked with Cray to test, scale, and tune the Data Virtualization Services (DVS) software that Cray had purchased and developed for the solution, and in 2010 the DVS software went into production. Now NGF hosts a global project filesystem that allows users to share files between systems and among diverse collaborators. In addition, NERSC introduced a GPFS-based global home filesystem that provides a convenient way for users to access source files, input files, configuration files, etc., regardless of the platform the user is logged onto; and a global scratch filesystem that can be used to temporarily store large amounts of data and access it from any system. When user applications load dynamic shared libraries or data stored on these filesystems, the compute nodes use DVS to access and cache the libraries or data.
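The dispatch policy described earlier in this section for the login load balancers (send each new session to the least-loaded node that is in service, and drop out-of-service nodes from the rotation) can be sketched in a few lines. This is a minimal illustration of the idea, not the Brocade or NERSC implementation; the `LoginNode` class, the node names, and the session counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LoginNode:
    name: str                 # hypothetical node name
    active_sessions: int = 0  # current load, measured here as open sessions
    in_service: bool = True   # False while the node is down for maintenance


def pick_login_node(nodes):
    """Return the in-service node with the fewest active sessions."""
    candidates = [node for node in nodes if node.in_service]
    if not candidates:
        raise RuntimeError("no login nodes are in service")
    return min(candidates, key=lambda node: node.active_sessions)


def connect(nodes):
    """Assign a new interactive session and return the chosen node's name."""
    node = pick_login_node(nodes)
    node.active_sessions += 1
    return node.name


nodes = [LoginNode("login1", 12), LoginNode("login2", 3), LoginNode("login3", 7)]
nodes[1].in_service = False      # login2 is taken down for maintenance
first_choice = connect(nodes)    # the session lands on an active node instead
```

Round-robin assignment would have sent some users to the offline node; here the lightly loaded but offline `login2` is skipped and the session goes to the least-loaded node still in service.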
Taken together, the benefit of these external services (Figure 6) is improved usability and availability: users do not have to worry about whether Hopper is up or down, because filesystem and login nodes and software are outside the main system. So, users can log in, access files, compile jobs, submit jobs, and analyze data even if the main system is down.

Figure 6. Schematic of external nodes on Hopper. [Diagram labels: Gemini interconnect; Lustre router on I/O nodes routes traffic to external systems; 8 login nodes; InfiniBand; GPFS server; Lustre server; /home, /project, and /scratch filesystems.]

Providing New Computational Models

Magellan Cloud Computing Testbed and Evaluation

Cloud computing can be attractive to scientists because it can give them on-demand access to compute resources and lower wait times than the batch-scheduled, high-utilization systems at HPC centers. Clouds can also offer custom-built environments such as the software stack used with their scientific code. And some clouds have implemented data programming models for data intensive computing such as MapReduce.

With American Recovery and Reinvestment Act (ARRA) funding, NERSC has deployed the Magellan science cloud testbed for the DOE Office of Science. It leverages NERSC experience in provisioning a specialized cluster (PDSF) for the high energy and nuclear physics community. NERSC is focused on understanding what applications map well to existing and emerging commercial solutions, and what aspects of cloud computing are best provided at HPC centers like NERSC.

To date we have demonstrated on-demand access to cycles for the DOE Joint Genome Institute (JGI) with a configuration using a 9 GB network provided by ESnet and 120 nodes from the Magellan cluster. This arrangement allows biological data to remain at JGI even though the compute servers are at NERSC. One early success on this cluster comes from a JGI team using Hadoop to remove errors from 5 billion reads of next-generation DNA sequence data.

We have also deployed a MapReduce cluster running Hadoop, Apache’s open-source implementation of MapReduce, which includes both a framework for running MapReduce jobs and the Hadoop Distributed File System (HDFS), which provides a locality aware, fault tolerant file system to support MapReduce jobs. NERSC is working with targeted communities such as bioinformatics to investigate how applications like BLAST perform inside the MapReduce framework. Other users are using the Hadoop cluster to perform their own evaluations, including implementing algorithms from scratch in the programming model. These experiences will provide insight into how to best take advantage of this new approach to distributed computing.

The first published performance analysis research from Magellan collaborators shows that cloud computing will not work for science unless the cloud is optimized for it. The paper “Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud,” 3 written by a team of researchers from Berkeley Lab’s Computational Research Division, Information Technology Division, and NERSC, was honored with the Best Paper Award at the IEEE’s International Conference on Cloud Computing Technology and Science (CloudCom 2010), held November 30–December 1 in Bloomington, Indiana.

After running a series of benchmarks designed to represent a typical midrange scientific workload (applications that use fewer than 1,000 cores) on Amazon’s EC2 system, the researchers found that the EC2’s interconnect severely limits performance and causes significant variability. Overall, the cloud ran six times slower than a typical midrange Linux cluster, and 20 times slower than a modern high performance computing system.

“We saw that the communication pattern of the application can impact performance,” said Keith Jackson, a computer scientist in Berkeley Lab’s Computational Research Division (CRD) and lead author of the paper. “Applications like Paratec with significant global communication perform relatively worse than those with less global communication.” He noted that the EC2 cloud performance varied significantly for scientific applications because of the shared nature of the virtualized environment, the network, and differences in the underlying non-virtualized hardware.

A similar comparison of performance on the standard EC2 environment versus an optimized environment on the Magellan testbed showed a slowdown of up to 52x on EC2 for most applications, although the BLAST bioinformatics code ran well (Figure 7).

Figure 7. On traditional science workloads, standard cloud configurations see significant slowdown (up to 52x), but independent BLAST jobs run well. [Bar chart: slowdown on commercial cloud, EC2 time / Magellan time, for GAMESS, GTC, IMPACT, fvCAM, MAESTRO, MILC, PARATEC, and BLAST.]
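The contrast in Figure 7 between communication-heavy codes and independent BLAST jobs mirrors the structure of the MapReduce model described earlier in this section: map tasks run independently on separate pieces of input, and only the reduce step combines results. The sketch below shows that pattern in plain Python, counting short subsequences (k-mers) in DNA reads. It is an illustration of the programming model only; it does not use Hadoop, it is not the JGI error-removal method, and the reads and function names are made up.

```python
from collections import Counter
from functools import reduce


def map_kmers(read, k=3):
    """Map step: count the k-mers in one read, independently of all other reads."""
    return Counter(read[i:i + k] for i in range(len(read) - k + 1))


def reduce_counts(left, right):
    """Reduce step: merge two partial k-mer counts into one."""
    return left + right


# Hypothetical input; in a real framework each read could go to a different worker.
reads = ["ACGTAC", "CGTACG", "TTACGT"]

partial_counts = [map_kmers(read) for read in reads]          # independent map tasks
total_counts = reduce(reduce_counts, partial_counts, Counter())  # combine results

print(total_counts.most_common(3))
```

Because each map task touches only its own read, the work needs no communication until the final merge, which is the property that lets loosely coupled jobs tolerate a slower interconnect.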
Another key issue is cost: EC2 costs 20 cents per CPU hour, while Magellan can run workloads at less than 2 cents per CPU hour, since it is run by a nonprofit government lab.

The Magellan Project was honored with the 2010 HPCwire Readers’ Choice Award for “Best Use of HPC in the Cloud.”

Developing Best Practices in Multicore Programming

The motivation for studying new multicore programming methods stems from concerns expressed by the DOE Office of Science users about how to rewrite their applications to keep ahead of the exponentially increasing HPC system parallelism. In particular, the increase in intra-node parallelism of the Hopper Phase 2 system presents challenges to existing application implementations, and this is just the first step in a long-term technology trend. Applications and algorithms will need to rely increasingly on fine-grained parallelism, strong scaling, and improved support for fault resilience.

NERSC has worked with Cray to establish a Cray Center of Excellence (COE) to investigate these issues, to identify the correct programming model to express fine-grained parallelism, and to guide the transition of the user community by preparing training materials. The long-term goals are to evaluate advanced programming models and identify a durable approach for programming on the path to exascale.

MPI (Message Passing Interface), which has been the dominant HPC programming model for nearly two decades, does not scale well beyond eight cores. The hybrid OpenMP/MPI model is the most mature approach to multicore programming, and it works well on Hopper; but getting the best performance depends on customizing the implementation for each application.

3 K. R. Jackson, L. Ramakrishnan, K. Muriki, S. Canon, S. Cholia, J. Shalf, H. J. Wasserman, and N. J. Wright, “Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud,” 2010 IEEE Second International Conference on Cloud Computing Technology and Science (CloudCom), pp. 159–168, Nov. 30, 2010–Dec. 3, 2010, doi: 10.1109/CloudCom.2010.69.

While an MPI program is a collection of processes with no shared data, an OpenMP program is a collection of threads with both private and shared data; so OpenMP can save a significant amount of memory compared with MPI. The key question is how to divide each node between MPI processes and OpenMP threads: One MPI process on each core? Or one MPI process on the whole node, and within the node use OpenMP thread parallelism? The running time for each option depends on the application: some run slower, some faster, some run faster for a while and then get slower. So the COE is developing guidelines for finding the “sweet spot” of how much OpenMP parallelism to add to various applications (Figure 8).

Computational Science and Engineering Petascale Initiative

To ensure that science effectively adapts to multicore computing, NERSC is receiving $3 million from the American Recovery and Reinvestment Act to develop the Computational Science and Engineering Petascale Initiative. This initiative identifies key application areas with specific needs for advanced programming models, algorithms, and other support. The applications are chosen to be consistent with the current mission of the Department of Energy, with a particular focus on applications that benefit energy research, those supported by other Recovery Act funding, and Energy Frontier Research Centers (EFRCs). The initiative pairs postdoctoral researchers (Figure 9) with these high-impact projects at NERSC.
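Returning briefly to the MPI/OpenMP balance discussed above: for a fixed number of cores, the candidate configurations are the ways of splitting those cores into MPI tasks times OpenMP threads per task. The sketch below enumerates them so that each can be timed for a given application. It is illustrative only: the function name is hypothetical, the 768-core total is taken from the horizontal axis of Figure 8, and the limit of 12 threads per task is an assumption based on the largest thread count shown there.

```python
def hybrid_layouts(total_cores, max_threads_per_task):
    """List (OpenMP threads per task, MPI tasks) pairs that use every core exactly once."""
    layouts = []
    for threads in range(1, max_threads_per_task + 1):
        if total_cores % threads == 0:          # threads must divide the core count
            layouts.append((threads, total_cores // threads))
    return layouts


# 768 cores as in Figure 8; the 12-thread cap is an assumption for illustration.
layouts_768 = hybrid_layouts(total_cores=768, max_threads_per_task=12)
for threads, tasks in layouts_768:
    print(f"{threads:2d} OpenMP threads x {tasks:3d} MPI tasks")
```

The list runs from pure MPI (1 thread, 768 tasks) to 12 threads on 64 tasks; finding the “sweet spot” then amounts to measuring run time and memory use at each point for the application in question.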
Here are the projects they are working on:

• Brian Austin is working with high-energy physicists to create computational models that will refine the design of Berkeley Lab’s proposed soft x-ray, free electron laser user facility called the Next Generation Light Source.
• Kirsten Fagnan is working with researchers in Berkeley Lab’s Center for Computational Sciences and Engineering (CCSE) to develop an adaptive mesh refinement capability for numerical simulation of carbon sequestration and porous media flow.
• Jihan Kim is helping chemists from the University of California, Berkeley accelerate their chemistry codes on NERSC supercomputers.
• Wangyi (Bobby) Liu (starting in 2011) is working with the Neutralized Drift Compression Experiment (NDCX) group at Berkeley Lab to develop a surface tension model within the current warm dense matter simulation framework, providing new tools for understanding inertial confinement fusion.
• Filipe Maia is working with researchers at Berkeley Lab’s Advanced Light Source to accelerate biological imaging codes and with the Earth Sciences Division to accelerate geophysical imaging codes with graphics processing units (GPUs).
• Praveen Narayanan is working with CCSE researchers to run numerical experiments using high performance combustion solvers, and also working with tools to analyze performance of HPC codes slated for codesign at extremely high concurrency.
• Xuefei (Rebecca) Yuan (starting in 2011) is working with researchers in Berkeley Lab’s Computational Research Division (CRD) to improve a hybrid linear software package and with physicists at Princeton Plasma Physics Laboratory (PPPL) to accelerate magnetohydrodynamic codes with GPUs.
• Robert Preissl works on advanced programming models for multicore systems and is applying this to physics codes from PPPL to optimize and analyze parallel magnetic fusion simulations.

Figure 8. MPI + OpenMP performance on two of NERSC’s benchmark applications: (a) Paratec and (b) fvCAM. [Plots: time in seconds versus OpenMP threads / MPI tasks, from 1/768 to 12/64 for Paratec (series: FFT, “DGEMM”, MPI) and from 1/240 to 12/20 for fvCAM (series: Dynamics, Physics, OpenMP, MPI).]

Two of the “petascale postdocs,” Robert Preissl and Jihan Kim, along with project leader Alice Koniges and collaborators from other institutions, were co-authors of the Best Paper winner at the Cray User Group meeting in Edinburgh, Scotland, in May 2010. 4 Titled “Application Acceleration on Current and Future
Cray Platforms,” the paper described current bottlenecks and performance improvement areas for applications including plasma physics, chemistry related to carbon capture and sequestration, and material science. The paper discussed a variety of methods including advanced hybrid parallelization using multi-threaded MPI, GPU acceleration, and autoparallelization compilers.

Figure 9. Alice Koniges (third from left) of NERSC’s Advanced Technologies Group leads the Computational Science and Engineering Petascale Initiative, which pairs postdoctoral researchers with high-impact projects at NERSC. The postdocs are (from left) Jihan Kim, Filipe Maia, Robert Preissl, Brian Austin (back), Wangyi (Bobby) Liu (front), Kirsten Fagnan, and Praveen Narayanan. Not shown: Xuefei (Rebecca) Yuan. Photo: Roy Kaltschmidt, Berkeley Lab Public Affairs.

GPU Testbed and Evaluation

Graphics processing units (GPUs) are becoming more widely used for computational science applications, and GPU testbeds have been requested at the requirements workshops NERSC has been conducting with its users. GPUs can offer energy-efficient performance boosts to traditional processors, since they contain massive numbers of simple processors, which are more energy-efficient than a smaller number of larger processors. They also are available at reasonable cost, since they are already being mass-produced for video gaming.

The question is whether GPUs offer an effective solution for a broad scientific workload or for a more limited class of computations.

In April 2010 NERSC fielded a 50-node GPU cluster called Dirac in collaboration with the Computational Research Division at Berkeley Lab, with funding from the DOE ASCR Computer Science Research Testbeds program. Each node contains two InfiniBand-connected Intel Nehalem quad-core chips and one NVIDIA Fermi GPU (except for two cores that have four GPUs each). The system includes a complete set of development tools for hybrid computing, including CUDA, OpenCL, and PGI GPGPU-targeted compilers.

The Dirac testbed provides the opportunity to engage with the NERSC user community to answer the following questions:

• What parts of the NERSC workload will benefit from GPU acceleration?
• What portions of the workload see no benefit, or insufficient benefit to justify the investment?
• Do GPUs represent the future of HPC platforms, or a feature that will be beneficial to a subset of the community?

Porting an application to a GPU requires reworking the code, such as deciding which parts to run on a GPU and then figuring out how to best modify the code. For example, running an application efficiently on a GPU often requires keeping the data near the device to reduce the computing time taken up with moving data from the CPU or memory to the GPU. Also, a programmer must decide how to thread results from the GPU back into the CPU program. This is not a trivial process, so the Dirac web site provides users with some tips for porting applications to this system.

Over 100 NERSC users have requested and been given access to the Dirac cluster. With their help, we have collected a broad range of preliminary performance data that support the idea that GPUs in their current configuration will be an important way to augment the effectiveness of NERSC systems for a subset of the user base.

4 Alice Koniges, Robert Preissl, Jihan Kim, David Eder, Aaron Fisher, Nathan Masters, Velimir Mlaker, Stephane Ethier, Weixing Wang, Martin Head-Gordon, and Nathan Wichmann, “Application Acceleration on Current and Future Cray Platforms,” CUG 2010, the Cray User Group meeting, Edinburgh, Scotland, May 2010.

Perhaps the largest collaboration using Dirac is the International Center for Computational Science (ICCS), a collaboration between Berkeley Lab, the University of California–Berkeley, NERSC, the University of Heidelberg, the National Astronomical Observatories of the Chinese Academy of Science, and Nagasaki University. ICCS explores emerging hardware devices, programming models, and algorithm techniques to develop effective and optimal tools for enabling scientific discovery and growth.

ICCS researchers have used Dirac and other GPU clusters for rigorous testing of an adaptive mesh refinement code called GAMER used for astrophysics and other science domains, and the N-body code phi-GPU, which is used to simulate dense star clusters with many binaries and galactic nuclei with supermassive black holes, in which correlations between distant particles cannot be neglected. The results played a significant role in developing metrics for workloads and respective speedups; and the phi-GPU team won the 2011 PRACE (Partnership for Advanced Computing in Europe) Award for the paper “Astrophysical Particle Simulations with Large Custom GPU Clusters on Three Continents.” 5 In this paper, they present direct astrophysical N-body simulations with up to 6 million bodies using the parallel MPI-CUDA code phi-GPU on large GPU clusters in Beijing, Berkeley, and Heidelberg, with different kinds of GPU hardware. They reached about one-third of the peak performance for this code in a real application scenario with hierarchically blocked timesteps and a core-halo density structure of a stellar system.

Figure 10. Q-Chem CPU/GPU performance comparisons for two proteins. CPU multithread performance is similar to GPU performance for smaller glycine molecules. [Bar charts: wall time in hours for Step 4 and for total RI-MP2 time. GLYCINE-8 is annotated x 12.3 for Step 4 and x 1.7 for total RI-MP2 time; GLYCINE-20 is annotated x 18.8 for Step 4 and x 7.4 for total RI-MP2 time. The legend includes CPU 1 thread (blue) and CPU 8 threads (red).]

Other research results from Dirac include:

• NERSC postdoc Jihan Kim has found impressive GPU versus single-node CPU speedups for the Q-Chem application, but performance varies significantly with the input structure (Figure 10).
• A study by researchers from Berkeley Lab’s Computational Research Division, NERSC, UC Berkeley, and Princeton Plasma Physics Laboratory demonstrated the challenges of achieving high GPU performance for key kernels from a gyrokinetic fusion application due to irregular data access, fine-grained data hazards, and high memory requirements. 6
• Researchers from Argonne National Laboratory, testing coupled cluster equations for electronic correlation in small- to medium-sized molecules, found that the GPU-accelerated algorithm readily achieves a factor of 4 to 5 speedup relative to the multithreaded CPU algorithm on same-generation hardware. 7

Improving User Productivity

With over 400 projects and 700 applications, NERSC is constantly looking for ways to improve the productivity of all 4000 scientific users. NERSC closely collaborates with scientific users to improve their productivity and the performance of their applications. Additionally, because of NERSC’s large number

5 R. Spurzem, P. Berczik, T. Hamada, K. Nitadori, G. Marcus, A. Kugel, R. Manner, I. Berentzen, J. Fiestas, R. Banerjee, and R. Klessen, “Astrophysical Particle Simulations with Large Custom GPU Clusters on Three Continents,” in press.
6 Kamesh Madduri, Eun-Jin Im, Khaled Ibrahim, Samuel Williams, Stéphane Ethier, and Leonid Oliker, “Gyrokinetic particle-in-cell optimization on emerging multi- and manycore platforms,” Parallel Computing Journal, in press.
7 A. E. DePrince and J. R. Hammond, “Coupled Cluster Theory on Graphics Processing Units I. The Coupled Cluster Doubles Method,” Journal of Chemical Theory and Computation, in press.
Figure 11. Four example detection triplets from PTF, similar to those uploaded to Galaxy Zoo Supernovae. Each image is 100 arcseconds on a side. In each triplet, the panels show (from left to right) the most recent image containing the candidate supernova light (the science image), the reference image generated from data from an earlier epoch with no supernova candidate light, and the subtraction or difference image (the science image minus the reference image) with the supernova candidate at the center of the crosshairs. The two triplets on the left are real supernovae, and were highly scored by the Zoo; the triplets on the right are not supernovae and were given the lowest possible score. Image: A. M. Smith et al., 2011.

The Palomar Transient Factory (PTF), which started operating in 2008, is a consortium of universities and telescope sites that image the sky every night. The resulting files are sent to NERSC, where software running on the Carver system compares each night’s images to composites gathered over the last decade, a process called subtraction. Images in which stars appear to have moved or changed brightness are flagged for human follow-up. Some follow-up is done by scientists, but recently the PTF also enlisted volunteers through Galaxyzoo.org to sort stray airplane lights from supernovae. Once a good candidate is identified, one telescope in the consortium is assigned to follow up.

For example, one group of astronomers has embarked on an ambitious Type Ia supernova search using the PTF to search for supernovae in nearby galaxies along with the Hubble Space Telescope. In order to get the Hubble to observe a “target of opportunity,” in this case a nearby supernova burning at peak brightness, astronomers must find the exploding star, do follow-up observations to confirm that they are indeed looking at a supernova, then alert the Hubble, all within the first seven or eight days of onset. The PTF sent 30 such target of opportunity triggers to Hubble for that program, and it inspired a new program in which Hubble observes PTF supernova targets starting more than two weeks before peak brightness and follows them for a month.

In a proof of concept for the Galaxy Zoo Supernovae project, from April to July 2010, nearly 14,000 supernova candidates from PTF were classified by more than 2,500 volunteers within a few hours of data collection (Figure 11). The citizen scientists successfully identified as transients 93% of the ~130 spectroscopically confirmed supernovae that PTF located during the trial period, with no false positive identifications. This trial run shows that the Galaxy Zoo Supernovae model can in principle be applied to any future imaging survey, and the resulting data can also be used to improve the accuracy of automated machine-learning transient classifiers. 8

Speeding the PTF workflow with a high-speed link and running the database on the NERSC Science Gateways in parallel with the Carver subtraction pipeline has enabled the PTF to become the first survey to target astrophysical transients in real time on a massive scale. In its first two and a half years, PTF has discovered over 1,100 supernovae.

NERSC and JGI Join Forces for Genomics Computing

A torrent of data has been flowing from the advanced sequencing platforms at the Department of Energy Joint Genome Institute (DOE JGI), among the world’s leading generators of DNA sequence information for bioenergy and environmental applications. In 2010, JGI generated more than 5 terabases (5 trillion nucleotide letters of genetic code) from its plant, fungal, microbial, and metagenome (microbial community) user programs, five times more information than JGI generated in 2009 (Figure 12). On December 14, 2010, the Energy Sciences Network (ESnet) tweeted that JGI had just sent more than 50 terabits of genomics data in the past 10 hours at the rate of nearly 10 gigabits per second. JGI is now the largest single user of ESnet, surpassing even the Large Hadron Collider.

8 A. M. Smith et al., “Galaxy Zoo Supernovae,” Monthly Notices of the Royal Astronomical Society 412, 1309 (2011).

Figure 12. DOE JGI sequence output in billions of nucleotide bases (gigabases, Gb), 2006–2010. [Stacked bar chart by DNA sequencing platform (Illumina, 454, Sanger) for FY 06 through FY 10.]

To ensure that there is a robust computational infrastructure for managing, storing, and gleaning scientific insights from this ever-growing flood of data, JGI has joined forces with NERSC. Genomics computing systems are now split between JGI’s campus in Walnut Creek, California, and NERSC’s Oakland Scientific Facility, which are 20 miles apart. NERSC also manages JGI’s six-person systems staff to integrate, operate, and support these systems.
NERSC performs this work on a cost-recovery basis similar to the way it manages and supports the Parallel Distributed Systems Facility (PDSF) for high energy and nuclear physics research.

“We evaluated a wide variety of options, and after a thorough review, it made perfect sense to partner with NERSC,” said Vito Mangiardi, JGI’s Deputy Director for Business Operations and Production. “With a successful track record in providing these kinds of services, they don’t have the steep learning curve to climb.”

“This is a great partnership, because data-centric computing is an important future direction for NERSC, and genomics is seeing exponential increases in computation and storage,” said NERSC Division Director Kathy Yelick. “The computing requirements for the genomics community are quite different from NERSC’s more traditional workload. The science is heavy in data analytics, and JGI runs web portals providing access to genomic information.”

“These are critically important assets that we can now bring to bear on some of the most complex questions in biology,” said JGI Director Eddy Rubin. “We really need the massive computation ‘horsepower’ and supporting infrastructure that NERSC offers to help us advance our understanding of the carbon cycle and many of the other biogeochemical processes in which microbes play a starring role and that the JGI is characterizing.”

Data is currently flowing between JGI and NERSC over ESnet’s Science Data Network (SDN). The SDN provides circuit-oriented services to enable the rapid movement of massive scientific data sets, and to support innovative computing architectures such as those being deployed at NERSC in support of JGI science.

In March 2010, JGI became an early user of NERSC’s Magellan Cloud Computing System when the Institute had a sudden need for increased computing resources. In less than three days, NERSC and JGI staff provisioned and configured hundreds of processor cores on Magellan to match the computing environment available on JGI’s local compute clusters.
At the same time, staff at both centers collaborated with ESnet network engineers to deploy a dedicated 9 gigabit per second virtual circuit between Magellan and JGI over SDN within 24 hours.

This strategy gives JGI researchers around the world increased computational capacity without any change to their software or workflow. JGI users still log on to the Institute’s network and submit scientific computing jobs to its batch queues, managed by hardware located in Walnut Creek. Once the jobs reach the front of the queue, the information travels 20 miles on reserved SDN bandwidth, directly to NERSC’s Magellan system in Oakland. After a job has finished, the results are sent back to Walnut Creek on the SDN within milliseconds to be saved on filesystems at JGI.
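To put the 20-mile distance and the 9 gigabit per second circuit in perspective, the following is a minimal back-of-the-envelope sketch in Python. It is not part of the JGI or NERSC tooling; the function names, the assumed signal speed in fiber, and the 1 GB example payload are all illustrative assumptions, and protocol overhead is ignored.

```python
# Rough latency and transfer-time estimates for a dedicated circuit.
# All constants here are illustrative assumptions, not measured values.

KM_PER_MILE = 1.609344
FIBER_KM_PER_SECOND = 200_000.0  # assumed: light in fiber travels at about 2/3 of c


def propagation_delay_ms(distance_miles: float) -> float:
    """One-way propagation delay in milliseconds over a fiber path of this length."""
    return distance_miles * KM_PER_MILE / FIBER_KM_PER_SECOND * 1000.0


def transfer_time_s(size_bytes: float, link_gbps: float) -> float:
    """Seconds needed to push size_bytes through a link of link_gbps
    (decimal gigabits per second), ignoring protocol overhead and congestion."""
    return size_bytes * 8.0 / (link_gbps * 1e9)


if __name__ == "__main__":
    print(f"one-way delay over 20 miles: {propagation_delay_ms(20):.3f} ms")
    print(f"1 GB over a 9 Gb/s circuit:  {transfer_time_s(1e9, 9):.2f} s")
```

Under these assumptions the 20-mile hop adds well under a millisecond of propagation delay each way, so the time to return results depends mainly on their size rather than on the distance between Walnut Creek and Oakland.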
“What makes this use of cloud computing so attractive is that JGI users do not notice a difference between computing on Magellan, which is 20 miles away at NERSC, or on JGI’s computing environment in Walnut Creek,” says Jeff Broughton, who heads NERSC’s Systems Department.

Improving VisIt Performance for AMR Applications

Applications using adaptive mesh refinement (AMR) are an important component of the DOE scientific workload. VisIt, a 3D visualization tool, is widely used in data analysis and feature rendering of AMR data.

When Tomasz Plewa of Florida State University, a user of the FLASH AMR code, reported to Hank Childs of the NERSC Analytics Group that VisIt was performing very slowly, Childs found that VisIt’s representations of AMR data structures were not well designed for very large numbers of patches. Childs spent several weeks adding new infrastructure to VisIt to represent AMR data more efficiently. He then modified VisIt’s FLASH reader (in consultation with the VisIt-FLASH team at Argonne National Laboratory) to use this new infrastructure. As a result, VisIt brought up a plot of Plewa’s data in only 13 seconds, about 40 times faster than the 8 minutes, 38 seconds previously required. The new performance time is reasonable considering the size of the data with respect to I/O performance.

The AMR tuning proved beneficial to other NERSC users as well. Haitao Ma and Stan Woosley of the University of California, Santa Cruz, and the SciDAC Computational Astrophysics Consortium called on Childs to create an animation for the annual SciDAC conference’s Visualization Night (“Vis Night”). Their AMR data also had a large number of patches and thus was running slowly in VisIt. To address this, Childs modified the BoxLib reader to use the new AMR infrastructure, which allowed the data to be processed much more quickly. The movie was finished in time for the Vis Night contest, where it ultimately received one of the People’s Choice awards (Figure 13).

Figure 13. Flame front image from the video “Type 1a Supernova: Turbulent Combustion at the Grandest Scale” by Haitao Ma and Stan Woosley of UC Santa Cruz with John Bell, Ann Almgren, Andy Nonaka, and Hank Childs of Berkeley Lab.

All of the changes (AMR infrastructure, FLASH, and BoxLib readers) have been committed to the VisIt source code repository and are now available to the entire VisIt user community.

Optimizing HPSS Transfers

Randall Hamper from Indiana University contacted NERSC because he needed to access data from the HPSS storage system and transfer it to his home university. Hamper collaborates with the Nearby Supernova Factory project sponsored by the Office of High Energy Physics, and he needed to access 1.6 million small files. A number of the files were stored as one group on the storage system, but many others were scattered around on various storage tapes. Creating a simple list of files and attempting to access the files in the listed order resulted in the same tape media having to be mounted more than once. Because mounting the tape is the slowest part of retrieving a small file, possibly taking a minute or two, file access times crawled to a near halt. These numerous small requests also created contention on the system for other NERSC users attempting to access their own datasets.
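The cost of reading files in list order can be illustrated with a small model. The sketch below is not the NERSC script and does not query HPSS; it assumes a hypothetical, already-known mapping from file name to tape volume (`tape_of`) and a single drive that must remount whenever the next file lives on a different tape. It compares the number of mounts for a naive ordering with an ordering that keeps each tape's files together.

```python
from collections import defaultdict


def count_mounts(requests, tape_of):
    """Count tape mounts when files are read in the given order on one drive.
    A mount is charged every time the next file is on a different tape."""
    mounts = 0
    current_tape = None
    for name in requests:
        tape = tape_of[name]
        if tape != current_tape:
            mounts += 1
            current_tape = tape
    return mounts


def group_by_tape(requests, tape_of):
    """Reorder the request list so that all files on one tape are read together."""
    groups = defaultdict(list)
    for name in requests:
        groups[tape_of[name]].append(name)
    ordered = []
    for tape in sorted(groups):
        ordered.extend(groups[tape])
    return ordered


if __name__ == "__main__":
    # Hypothetical file-to-tape mapping, for illustration only.
    tape_of = {"a": "T1", "b": "T2", "c": "T1", "d": "T3", "e": "T2", "f": "T1"}
    requests = ["a", "b", "c", "d", "e", "f"]
    print("naive order:  ", count_mounts(requests, tape_of), "mounts")
    print("grouped order:", count_mounts(group_by_tape(requests, tape_of), tape_of), "mounts")
```

In this toy case the naive order needs six mounts and the grouped order needs three, one per tape. With mounts taking a minute or two each, the same effect across 1.6 million files is what makes the ordering matter.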
Members of the HPSS storage team created a script that queries the HPSS metadata to determine which tape volume each file belongs to. Then the files were grouped according to tape volume. This change improved data retrievals from HPSS by orders of magnitude. Additionally, this collaboration created a reusable technology, the script for grouping files. It has been distributed to a number of different users to help them retrieve data more optimally.

Uncovering Job Failures at Scale

Debugging applications at scale is essentially detective work. The issues only occur over a certain concurrency and are often tightly coupled to a system’s architecture. Weixing Wang from the Princeton Plasma Physics Laboratory contacted NERSC consultants because his application was failing more than half the time when he ran on more than 8,000 processor cores. The cause of his problem was not obvious. The simulation did not fail in the same part of the code consistently. Sometimes the simulation would run for two hours before failing; other times it ran for 18 hours, though it rarely ran to completion successfully. Furthermore, the application code, GTS, had been known to scale to very large concurrencies on other systems.

NERSC put together a team of consultants, systems staff, and Cray staff to troubleshoot the issue. After much detective work (searching for unhealthy nodes in the system, examining the code’s communication pattern and memory use), the I/O was analyzed, and it was discovered that the application was not doing I/O optimally. It was using a one-file-per-processor model to write out 20–30 MB of data per core on over 8,000 processors, roughly 240 GB per dump. A Lustre striping parameter was set to 32 disks, and with the user writing output frequently, it meant 7.6 million file segments were being created every 4 minutes.
This large number of I/O transactions was flooding the system and causing the job to fail.

Once the problem was identified, the solution was straightforward. We recommended decreasing the striping parameter and reducing the frequency of restart dumps to once every four hours. Smaller analysis output files continued to be produced every four minutes. Tests showed that making these two changes allowed the code to run successfully to completion.

Improving Network Transfer Rates

A user working with the Planck project, an international consortium studying the cosmic microwave background radiation, contacted the consulting group because he was experiencing slow data transfer rates between NERSC and the Leibniz Supercomputing Centre (LRZ) in Germany. Using the gridFTP tool, he only saw rates of about 5 MB/second, far too slow to transfer large observational astronomy images in a reasonable amount of time.

NERSC made recommendations to increase his TCP and gridFTP buffer settings, which increased performance from 5 to 30 MB/sec. Next, a NERSC networking engineer traced a packet from NERSC to LRZ and noticed that a router along the way was not enabled for jumbo frames, creating a bottleneck along the path between the two sites. Finding the administrator responsible for a specific router in Germany is not easy. However, with the help of staff from the LRZ, the administrator was found, and jumbo frames were enabled on the target router. This increased the transfer rate to 65 MB/sec for the Planck project user. Catching this misconfigured router will not only improve transfer rates between NERSC and LRZ for all users, but will also improve performance for other DOE collaborators transferring data along this link.

Finally, NERSC gave advice to LRZ staff on reconfiguring their firewall. This last improvement increased transfer rate performance to 110 MB/sec, for an overall improvement of over 18 times the original speed, as shown in Figure 14.

Figure 14. Transfer rates between NERSC and LRZ were boosted 18-fold. [Bar chart, MB/sec: Original Rate, 6; TCP/gridFTP Buffer Settings, 30; Enable Jumbo Frames on Router, 65; Reconfigure Firewall, 110.]
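The gain from each tuning step can be worked out directly from the rates that accompany Figure 14 (6, 30, 65, and 110 MB/sec). The short Python sketch below is only an illustration of that arithmetic; the stage labels and function name are chosen here for readability. Note that the running text quotes the starting rate as "about 5 MB/second", which would give a slightly larger overall factor of 22.

```python
# Transfer rates in MB/sec after each tuning step, taken from the Figure 14 values.
STAGES = [
    ("original rate", 6.0),
    ("TCP/gridFTP buffer settings", 30.0),
    ("jumbo frames enabled on router", 65.0),
    ("firewall reconfigured", 110.0),
]


def speedups(stages):
    """Return (label, rate, gain over previous step, gain over baseline) per stage."""
    baseline = stages[0][1]
    previous = baseline
    rows = []
    for label, rate in stages:
        rows.append((label, rate, rate / previous, rate / baseline))
        previous = rate
    return rows


if __name__ == "__main__":
    for label, rate, step_gain, total_gain in speedups(STAGES):
        print(f"{label:32s} {rate:6.1f} MB/sec  step x{step_gain:4.2f}  total x{total_gain:5.2f}")
```

With a 6 MB/sec baseline the cumulative factor after the last step is 110 / 6, or about 18.3, which matches the 18-fold figure in the caption. The buffer change alone accounts for a factor of 5.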
Software Support

NERSC pre-compiles and supports dozens of software packages for our users, comprising over 13.5 million lines of source code. These software packages include debugging tools, analysis and post-processing software, and third-party applications that scientists use to run their primary simulations. Instead of each user installing his or her own version of a third-party software package, NERSC installs and supports optimized versions of many applications, allowing users to focus on their science research rather than on installing and supporting software.

In the past few years, NERSC has seen an increase in the number of scientists (currently 400) using these third-party applications. This is primarily due to the relative increase in the proportion of materials science and chemistry time allocated to NERSC projects and the popularity of third-party applications in these science areas (Tables 1 and 2).

Table 1. Pre-Compiled Materials Science Codes at NERSC (availability columns: Hopper, Franklin, Carver). Codes: ABINIT, CP2K, CPMD, Quantum ESPRESSO, LAMMPS, Qbox, SIESTA, VASP, WIEN2k.

Table 2. Pre-Compiled Chemistry Codes at NERSC (availability columns: Hopper, Franklin, Carver). Codes: AMBER, G09, GAMESS, GROMACS, MOLPRO, NAMD, NWChem, Q-Chem.

Maintaining User Satisfaction

The annual NERSC user survey responses provide feedback about every aspect of NERSC’s operation, help us judge the quality of our services, give DOE information on how well NERSC is doing, and point us to areas we can improve. Three hundred ninety-five users responded to the 2009/2010 survey, representing 11 percent of authorized users and 67 percent of total MPP usage. The respondents represent all six DOE science offices and a variety of home institutions. On a long list of questions, users rated us on a 7-point satisfaction scale (Table 3).

Table 3. 2009/2010 User Survey Satisfaction Scale

Satisfaction Score   Meaning                 Number of Times Selected
7                    Very Satisfied          8,053
6                    Mostly Satisfied        6,219
5                    Somewhat Satisfied      1,488
4                    Neutral                 1,032
3                    Somewhat Dissatisfied   366
2                    Mostly Dissatisfied     100
1                    Very Dissatisfied       88

On the open-ended question “What does NERSC do well?” 132 users responded. Some representative comments are:

    User support is fantastic: timely and knowledgeable, including follow-up service. New machines are installed often, and they are state-of-the-art. The queues are crowded but fairly managed.

    Website is first class, especially the clear instructions for compiling and running jobs.

    The account allocation process is very fast and efficient.

    NERSC has proven extremely effective for running high resolution models of the earth’s climate that require a large number of processors. Without NERSC I never would have been able to run these simulations at such a high resolution to predict future climate. Many thanks.

On the question “What can NERSC do to make you more productive?” 105 users responded. The top areas of concern were long queue turnaround times, the need for more computing resources, queue policies, and software support. Some of the comments from this section are:

    There are a lot of users (it is good to be useful and popular), and the price of that success is long queues that lead to slower turn-around. A long-standing problem with no easy answer.

    Allow longer jobs (such as one month or half year) on Carver and Hopper. Let science run to the course.

    Provide an interface which can help the user determine which of the NERSC machines is more appropriate at a given time for running a job based on the number of processors and runtime that are requested.

    I could possibly use some more web-based tutorials on various topics: MPI programming, data analysis with NERSC tools, a tutorial on getting VisIt (visualization tool) to work on my Linux machine.

Every year we institute changes based on the previous year’s survey. In 2010 NERSC took a number of actions in response to suggestions from the 2008/2009 user survey, including improvements to the Franklin system and the NERSC web site.

On the 2008/2009 survey, Franklin uptime received the second lowest average score (4.91). In the first half of 2009, Franklin underwent an intensive stabilization period. Tiger teams were formed with close collaborations with Cray to address system instability. These efforts were continued in the second half of 2009 and throughout 2010, when NERSC engaged in a project to understand system-initiated causes of hung jobs, and to implement corrective actions to reduce their number. These investigations revealed bugs in the SeaStar interconnect as well as in the Lustre file system. These bugs were reported to Cray and were fixed in March 2010, when Franklin was upgraded to Cray Linux Environment 2.2.i. As a result, Franklin’s mean time between failures improved from a low of about 3 days in 2008 to 9 days in 2010.

On the 2010 survey, Franklin uptime received an average score of 5.99, a significant increase of 1.08 points over the previous year. Two other Franklin scores (overall satisfaction, and disk configuration and I/O performance) were significantly improved as well. Another indication of increased satisfaction with Franklin is that on the 2009 survey 40 users requested improvements in Franklin uptime or performance, whereas only 10 made such requests on the 2010 survey.

On the 2008/2009 survey, ten users requested improvements to the NERSC web site. In response, User Services staff removed older documentation and made sure that the remaining documentation was up to date.
On the 2010 survey, the score for “ease of finding information on the NERSC web site” was significantly improved. Also, for the medium-scale MPP users, the scores for the web site overall and for the accuracy of information on the web showed significant improvement.

The complete survey results are available at https://www.nersc.gov/news-publications/publications-reports/user-surveys/2009-2010-user-survey-results/.

Leadership Changes

In September 2010, Associate Laboratory Director for Computing Sciences (and former NERSC Division Director) Horst Simon was named Deputy Director of Berkeley Lab. “Horst is a strong leader who has helped to lead a tremendously productive program in high performance computing that is world class,” said Berkeley Lab Director Paul Alivisatos. “As Deputy Director he’ll help me lead major scientific initiatives, oversee strategic research investments, and maintain the intellectual vitality of Berkeley Lab.”

A few days later, Berkeley Lab announced that Kathy Yelick was named Associate Lab Director for Computing Sciences. Yelick has been the director of the NERSC Division since 2008, a position she will continue to hold. “I am very pleased to see Kathy Yelick assume this critical role in senior leadership as Associate Lab Director for Computing Sciences,” said Berkeley Lab Director Paul Alivisatos. “We will benefit immensely from her knowledge and vision. She will have tremendous opportunity to use Berkeley Lab’s strengths in scientific computing and advanced networking to accelerate research that addresses some of our nation’s most urgent scientific challenges.”

In August 2010 Jonathan Carter, who had led NERSC’s User Services Group since 2005, was named the new Computing Sciences Deputy, succeeding Michael Banda, who joined Berkeley Lab’s Advanced Light Source. Carter was one of the first new employees hired when NERSC moved to Berkeley Lab in 1996, and had most recently led the NERSC-6 procurement team that selected the Cray XE6 Hopper system.
Succeeding Carter as leader of the User Services Group was Katie Antypas, whom Kathy Yelick describes as “a passionate advocate for the NERSC users.” Antypas was a member of the NERSC-6 procurement team, co-lead of the NERSC-6 implementation team, and co-lead on a number of Franklin stabilization “tiger teams,” as well as participating in various planning activities.

Energy efficiency has become a major issue for NERSC, with costs growing with each new system and the need for facility upgrades likely every few years. This requires increased monitoring, reporting, and optimization of energy use, as well as management of the associated risks. Rather than distributing the management of these activities across multiple individuals, NERSC Director Kathy Yelick created a new position of Energy Efficiency and Risk Manager and named Jim Craw to this new role. “Jim’s extensive knowledge of NERSC’s systems, operations and vendor relations will be invaluable to support the expanded emphasis on these matters,” Yelick said. Craw had been leader of NERSC’s Computational Systems Group; Systems Department Head Jeff Broughton is acting leader of the group until a permanent replacement is hired.

Outreach and Training

As science tackles increasingly complex problems, interdisciplinary research becomes more essential, because interdisciplinary teams bring a variety of knowledge, approaches, and skills to solve the problem. Computational science is no exception.
That’s why outreach and training, which promote diversity in the computational science community, are essential to NERSC’s mission.

Collaborations between institutions are one fruitful avenue of outreach. Examples include NERSC’s technical exchanges with KISTI (Korea), CSCS (Switzerland), and KAUST (Saudi Arabia); the data transfer working group activities with Oak Ridge and Argonne national labs; procurement and technical exchanges with Sandia; Cray systems administrator interactions with Oak Ridge; and center management interchanges between NERSC, Argonne, and Oak Ridge.

What began as an informal discussion two years ago between a physicist and a computer scientist has led to a new international collaboration aimed at creating computational tools to help scientists make more effective use of new computing technologies, including multicore processors. The International Center for Computational Science (ICCS) is located at Berkeley Lab/NERSC and the University of California at Berkeley, with partners at the University of Heidelberg in Germany, the National Astronomical Observatories of the Chinese Academy of Sciences, and Nagasaki University in Japan. ICCS explores the vast and rapidly changing space of emerging hardware devices, programming models, and algorithm techniques to develop effective and optimal tools for enabling scientific discovery and growth.

Top row, left to right: Horst Simon, Kathy Yelick, Jonathan Carter. Bottom row, left to right: Katie Antypas, Jim Craw, Jeff Broughton.

ICCS hosts several collaborative research programs, including ISAAC, a three-year NSF-funded project to focus on research and development of infrastructure to accelerate physics and astronomy applications using multicore architectures; and the GRACE Project and the Silk Road Project, which are developing programmable hardware for astrophysical simulations.
Figure 15. Researchers with scientific backgrounds ranging from astrophysics to biochemistry attended the first computational science summer school at NERSC. Photo: Hemant Shukla.

ICCS also offers training in programming multicore devices such as GPUs for advanced scientific computation. In August 2010, ICCS offered a five-day course at NERSC, “Proven Algorithmic Techniques for Many-Core Processors,” in collaboration with the Virtual School of Computational Science and Engineering, a project of the University of Illinois and NVIDIA. Participants applied the techniques in hands-on lab sessions using NERSC computers (Figure 15).

Seating was limited and many people had to be turned away, so ICCS offered a second five-day workshop at UC Berkeley in January 2011 on “Manycore and Accelerator-Based High-Performance Scientific Computing,” which also included hands-on tutorials.

NERSC staff have been actively involved in developing and delivering tutorials at SciDAC meetings, at the SC conference, and for the TeraGrid. One hundred participants attended tutorials at the July 2010 SciDAC conference; the tutorials were organized by the SciDAC Outreach Center, which is hosted at NERSC. Working with the Council on Competitiveness, the SciDAC Outreach Center has provided HPC outreach to industry, identifying computing needs in industrial areas that reinforce the DOE mission. An example is working with United Technologies Corporation (UTC) on code profiling, HPC scalability, and code tuning, with the result of a 3.6x faster time to solution for one class of problems.

NERSC has also extended its outreach to local high schools. In June 2010, NERSC hosted 12 students from Oakland Technical High School’s Computer Science and Technology Academy, a small academy within the larger high school, for an introduction to computational science and supercomputer architecture, and a tour of NERSC (Figure 16).
In July, Berkeley Lab Computing Sciences hosted 14 local high school students for a four-day Careers in Computing Camp, with sessions at both the main Lab and NERSC’s Oakland Scientific Facility (Figure 17). The program was developed with input from computer science teachers at Berkeley High, Albany High, Kennedy High in Richmond, and Oakland Tech, and included presentations, hands-on activities, and tours of various facilities. Staff from NERSC, CRD, ESnet, and the IT Division presented a wide range of topics including assembling a desktop computer, cyber security war stories, algorithms for combustion and astrophysics, the role of applied math, networking, science portals, and hardware/software co-design.

Figure 16. Dave Paul, a systems engineer, brought out computer nodes and parts from NERSC’s older systems to show Oakland Tech students how the components have become both more dense and more power efficient as the technology has evolved over time. Photo: Roy Kaltschmidt, Berkeley Lab Public Affairs.

Figure 17. Students from the Berkeley Lab Careers In Computing Camp pose in front of NERSC’s Cray XT5 supercomputer, Franklin. Photo: Roy Kaltschmidt, Berkeley Lab Public Affairs.

To reach an even broader audience, NERSC, ESnet, and Computational Research Division staff are contributing to Berkeley Lab’s Video Glossary, a popular website where scientific terms are defined in lay language (http://videoglossary.lbl.gov/). Computing-related videos to date include:

• bits and bytes (Inder Monga, ESnet)
• computational science (Associate Lab Director Horst Simon)
• green computing (John Shalf, NERSC)
• IPv6 (Michael Sinatra, ESnet)
• petabytes (Inder Monga, ESnet)
• petaflop computing (David Bailey, CRD)
• quantum computing (Thomas Schenkel, Accelerator and Fusion Research Division)
• science network (Eli Dart, ESnet)
• supercomputing (NERSC Director Kathy Yelick)

Research and Development by NERSC Staff

Staying ahead of the technological curve, anticipating problems, and developing proactive solutions are part of NERSC’s culture. Many staff members collaborate on computer science research projects, scientific code development, and domain-specific research, as well as participating in professional organizations and conferences and contributing to journals and proceedings. The NERSC user community benefits from the results of these activities as they are applied to systems, software, and services at NERSC and throughout the HPC community.

In 2010, four conference papers co-authored by NERSC staff won Best Paper awards:

• MPI-Hybrid Parallelism for Volume Rendering on Large, Multi-Core Systems. Mark Howison, E. Wes Bethel, and Hank Childs. Proceedings of the Eurographics Symposium on Parallel Graphics and Visualization (EGPGV’10), Norrköping, Sweden, May 2–3, 2010.
• Application Acceleration on Current and Future Cray Platforms. Alice Koniges, Robert Preissl, Jihan Kim, David Eder, Aaron Fisher, Nathan Masters, Velimir Mlaker, Stephane Ethier, Weixing Wang, Martin Head-Gordon, and Nathan Wichmann. Proceedings of the 2010 Cray User Group Meeting, Edinburgh, Scotland, May 24–27, 2010.
• Seeking Supernovae in the Clouds: A Performance Study. Keith Jackson, Lavanya Ramakrishnan, Karl Runge, and Rollin Thomas. ScienceCloud 2010, the 1st Workshop on Scientific Cloud Computing, Chicago, Ill., June 2010.
• Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud. Keith R. Jackson, Lavanya Ramakrishnan, Krishna Muriki, Shane Canon, Shreyas Cholia, John Shalf, Harvey J. Wasserman, and Nicholas J. Wright. Proceedings of the 2nd IEEE International Conference on Cloud Computing Technology and Science (CloudCom 2010), Bloomington, Ind., Nov. 30–Dec. 1, 2010.

Alice Koniges of NERSC’s Advanced Technologies Group and Gabriele Jost of the Texas Advanced Computing Center were guest editors of a special issue of the journal Scientific Programming titled “Exploring Languages for Expressing Medium to Massive On-Chip Parallelism” (Vol. 18, No. 3–4, 2010). Several NERSC researchers co-authored articles in this issue:

• Guest Editorial: Special Issue: Exploring Languages for Expressing Medium to Massive On-Chip Parallelism. Gabriele Jost and Alice Koniges.
• Overlapping Communication with Computation using OpenMP Tasks on the GTS Magnetic Fusion Code. Robert Preissl, Alice Koniges, Stephane Ethier, Weixing Wang, and Nathan Wichmann.
• A Programming Model Performance Study using the NAS Parallel Benchmarks. Hongzhang Shan, Filip Blagojević, Seung-Jai Min, Paul Hargrove, Haoqiang Jin, Karl Fuerlinger, Alice Koniges, and Nicholas J. Wright.
A paper co-authored by Daniela Ushizima of NERSC’s Analytics Group was on two successive lists of the “Top 25 Hottest Articles” based on number of downloads from the journal Digital Signal Processing, from April to September 2010:

• SAR Imagery Segmentation by Statistical Region Growing and Hierarchical Merging. E. A. Carvalho, D. M. Ushizima, F. N. S. Medeiros, C. I. O. Martins, R. C. P. Marques, and I. N. S. Oliveira. Digital Signal Processing, Volume 20, Issue 5, pp. 1365–1378, Sept 2010.

Other publications by NERSC staff in 2010 are listed below. (Not all co-authors are from NERSC.) Many of these papers are available online at https://www.nersc.gov/news-publications/publications-reports/nersc-staff-publications/.

ALE-AMR: A New 3D Multi-Physics Code for Modeling Laser/Target Effects. A. E. Koniges, N. D. Masters, A. C. Fisher, R. W. Anderson, D. C. Eder, T. B. Kaiser, D. S. Bailey, B. Gunney, P. Wang, B. Brown, K. Fisher, F. Hansen, B. R. Maddox, D. J. Benson, M. Meyers, and A. Geille. Proceedings of the Sixth International Conference on Inertial Fusion Sciences and Applications, Journal of Physics: Conference Series, Vol. 244 (3), 032019.

An Auto-Tuning Framework for Parallel Multicore Stencil Computations. S. Kamil, C. Chan, L. Oliker, J. Shalf, and S. Williams. 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), pp. 1–12, April 19–23, 2010.

An Unusually Fast-Evolving Supernova. Poznanski, Dovi; Chornock, Ryan; Nugent, Peter E.; Bloom, Joshua S.; Filippenko, Alexei V.; Ganeshalingam, Mohan; Leonard, Douglas C.; Li, Weidong; Thomas, Rollin C. Science, Volume 327, Issue 5961, pp. 58– (2010).

Analyzing and Tracking Burning Structures in Lean Premixed Hydrogen Flames. P.-T. Bremer, G. H. Weber, V. Pascucci, M. S. Day, and John B. Bell. IEEE Transactions on Visualization and Computer Graphics, 16(2): 248–260, 2010.

Analyzing the Effect of Different Programming Models upon Performance and Memory Usage on Cray XT5 Platforms. H. Shan, H. Jin, K. Fuerlinger, A. Koniges, and N. Wright. Proceedings of the 2010 Cray User Group, May 24–27, 2010, Edinburgh, Scotland.

Assessment and Mitigation of Radiation, EMP, Debris and Shrapnel Impacts at Mega-Joule Class Laser Facilities. D. C. Eder, et al. J. Physics, Conf. Ser. 244: 0320018, 2010.

Auto-Tuning Stencil Computations on Diverse Multicore Architectures. K. Datta, S. Williams, V. Volkov, J. Carter, L. Oliker, J. Shalf, and K. Yelick. In Scientific Computing with Multicore and Accelerators, edited by Jakub Kurzak, David A. Bader, and Jack Dongarra. CRC Press, 2010.

Automated Detection and Analysis of Particle Beams in Laser-Plasma Accelerator Simulations. Daniela Ushizima, Cameron Geddes, Estelle Cormier-Michel, E. Wes Bethel, Janet Jacobsen, Prabhat, Oliver Rübel, Gunther Weber, Bernd Hamann, Peter Messmer, and Hans Hagen. In-Tech, 367–389, 2010.

Building a Physical Cell Simulation and Comparing with Confocal Microscopy. C. Rycroft, D. M. Ushizima, R. Saye, C. M. Ghajar, J. A. Sethian. Bay Area Physical Sciences–Oncology Center (NCI) Meeting 2010, UCSF Medical Sciences, Sept. 2010.

Communication Requirements and Interconnect Optimization for High-End Scientific Applications. Shoaib Kamil, Leonid Oliker, Ali Pinar, and John Shalf. IEEE Trans. Parallel Distrib. Syst. 21(2): 188–202 (2010).

Compressive Auto-Indexing in Femtosecond Nanocrystallography. Filipe Maia, Chao Yang, and Stefano Marchesini. Lawrence Berkeley National Laboratory technical report LBNL-4008E, 2010.

Compressive Phase Contrast Tomography. Filipe Maia, Alastair MacDowell, Stefano Marchesini, Howard A. Padmore, Dula Y. Parkinson, Jack Pien, Andre Schirotzek, and Chao Yang. SPIE Optics and Photonics 2010, San Diego, California.

Core-Collapse Supernovae from the Palomar Transient Factory: Indications for a Different Population in Dwarf Galaxies. Arcavi, Iair; Gal-Yam, Avishay; Kasliwal, Mansi M.; Quimby, Robert M.; Ofek, Eran O.; Kulkarni, Shrinivas R.; Nugent, Peter E.; Cenko, S. Bradley; Bloom, Joshua S.; Sullivan, Mark; Howell, D. Andrew; Poznanski, Dovi; Filippenko, Alexei V.; Law, Nicholas; Hook, Isobel; Jönsson, Jakob; Blake, Sarah; Cooke, Jeff; Dekany, Richard; Rahmer, Gustavo; Hale, David; Smith, Roger; Zolkower, Jeff; Velur, Viswa; Walters, Richard; Henning, John; Bui, Kahnh; McKenna, Dan; Jacobsen, Janet. Astrophysical Journal, Volume 721, Issue 1, pp. 777–784 (2010).
Coupling Visualization and Data Analysis for Knowledge Discovery from Multi-Dimensional Scientific Data. Oliver Rübel, Sean Ahern, E. Wes Bethel, Mark D. Biggin, Hank Childs, Estelle Cormier-Michel, Angela DePace, Michael B. Eisen, Charless C. Fowlkes, Cameron G. R. Geddes, Hans Hagen, Bernd Hamann, Min-Yu Huang, Soile V. E. Keränen, David W. Knowles, Cris L. Luengo Hendriks, Jitendra Malik, Jeremy Meredith, Peter Messmer, Prabhat, Daniela Ushizima, Gunther H. Weber, and Kesheng Wu. Proceedings of International Conference on Computational Science, ICCS 2010, May 2010.

Defining Future Platform Requirements for E-Science Clouds. Lavanya Ramakrishnan, Keith R. Jackson, Shane Canon, Shreyas Cholia, and John Shalf. Proceedings of the 1st ACM Symposium on Cloud Computing (SoCC ‘10). ACM, New York, NY, USA, 101–106 (2010).

Discovery of Precursor Luminous Blue Variable Outbursts in Two Recent Optical Transients: The Fitfully Variable Missing Links UGC 2773-OT and SN 2009ip. Smith, Nathan; Miller, Adam; Li, Weidong; Filippenko, Alexei V.; Silverman, Jeffrey M.; Howard, Andrew W.; Nugent, Peter; Marcy, Geoffrey W.; Bloom, Joshua S.; Ghez, Andrea M.; Lu, Jessica; Yelda, Sylvana; Bernstein, Rebecca A.; Colucci, Janet E. Astronomical Journal, Volume 139, Issue 4, pp. 1451–1467 (2010).

Electromagnetic Counterparts of Compact Object Mergers Powered by the Radioactive Decay of r-Process Nuclei. Metzger, B. D.; Martínez-Pinedo, G.; Darbha, S.; Quataert, E.; Arcones, A.; Kasen, D.; Thomas, R.; Nugent, P.; Panov, I. V.; Zinner, N. T. Monthly Notices of the Royal Astronomical Society, Volume 406, Issue 4, pp. 2650–2662 (2010).

Exploitation of Dynamic Communication Patterns through Static Analysis. Robert Preissl, Bronis R. de Supinski, Martin Schulz, Daniel J. Quinlan, Dieter Kranzlmüller, and Thomas Panas. Proc. International Conference on Parallel Processing (ICPP), Sept. 13–16, 2010.

Extreme Scaling of Production Visualization Software on Diverse Architectures. H. Childs, D. Pugmire, S. Ahern, B. Whitlock, M. Howison, Prabhat, G. Weber, and E. W. Bethel. Computer Graphics and Applications, Vol. 30 (3), pp. 22–31, May/June 2010.

File System Monitoring as a Window into User I/O Requirements. A. Uselton, K. Antypas, D. M. Ushizima, and J. Sukharev. Proceedings of the 2010 Cray User Group Meeting, Edinburgh, Scotland, May 24–27, 2010.

H5hut: A High-Performance I/O Library for Particle-Based Simulations. Mark Howison, Andreas Adelmann, E. Wes Bethel, Achim Gsell, Benedikt Oswald, and Prabhat. Proceedings of IASDS10: Workshop on Interfaces and Abstractions for Scientific Data Storage, Heraklion, Crete, Greece, September 2010.

HPSS in the Extreme Scale Era: Report to DOE Office of Science on HPSS in 2018–2022. D. Cook, J. Hick, J. Minton, H. Newman, T. Preston, G. Rich, C. Scott, J. Shoopman, J. Noe, J. O’Connell, G. Shipman, D. Watson, and V. White. Lawrence Berkeley National Laboratory technical report LBNL-3877E, 2010.

Hybrid Parallelism for Volume Rendering on Large, Multi-Core Systems. M. Howison, E. W. Bethel, and H. Childs. Journal of Physics: Conference Series, Proceedings of SciDAC 2010, July 2010, Chattanooga TN.

Integrating Data Clustering and Visualization for the Analysis of 3D Gene Expression Data. O. Rübel, G. H. Weber, M-Y Huang, E. W. Bethel, M. D. Biggin, C. C. Fowlkes, C. Luengo Hendriks, S. V. E. Keränen, M. Eisen, D. Knowles, J. Malik, H. Hagen and B. Hamann. IEEE Transactions on Computational Biology and Bioinformatics, Vol. 7, No. 1, pp. 64–79, Jan/Mar 2010.

International Exascale Software Project (IESP) Roadmap, version 1.1. Jack Dongarra et al. [incl. John Shalf, David Skinner, and Kathy Yelick]. October 18, 2010, www.exascale.org.

Large Data Visualization on Distributed Memory Multi-GPU Clusters. T. Fogal, H. Childs, S. Shankar, J. Kruger, R. D. Bergeron, and P. Hatcher. High Performance Graphics 2010, Saarbrücken, Germany, June 2010.

Efficient Bulk Data Replication for the Earth System Grid. A. Sim, D. Gunter, V. Natarajan, A. Shoshani, D. Williams, J. Long, J. Hick, J. Lee, and E. Dart. International Symposium on Grid Computing, 2010.

High Performance Computing and Storage Requirements for High Energy Physics. Richard Gerber and Harvey Wasserman, eds. Lawrence Berkeley National Laboratory technical report LBNL-4128E, 2010.

Laser Ray Tracing in a Parallel Arbitrary Lagrangian Eulerian Adaptive Mesh Refinement Hydrocode. N. Masters, D. C. Eder, T. B. Kaiser, A. E. Koniges, and A. Fisher. J. Physics, Conf. Ser. 244: 032022, 2010.
Lessons Learned from Moving Earth System Grid Data Sets over a 20 Gbps Wide-Area Network. R. Kettimuthu, A. Sim, D. Gunter, W. Allcock, P.T. Bremer, J. Bresnahan, A. Cherry, L. Childers, E. Dart, I. Foster, K. Harms, J. Hick, J. Lee, M. Link, J. Long, K. Miller, V. Natarajan, V. Pascucci, K. Raffenetti, D. Ressman, D. Williams, L. Wilson, and L. Winkler. Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing (HPDC ‘10). ACM, New York, NY, USA, 316–319.

Magellan: Experiences from a Science Cloud. Lavanya Ramakrishnan, Piotr T. Zbiegel, Scott Campbell, Rick Bradshaw, Richard Shane Canon, Susan Coghlan, Iwona Sakrejda, Narayan Desai, Tina Declerck, and Anping Liu. Proceedings of the 2nd International Workshop on Scientific Cloud Computing (ScienceCloud ‘11). ACM, New York, NY, USA, 49–58.

Modeling Heat Conduction and Radiation Transport with the Diffusion Equation in NIF ALE-AMR. A. Fisher, D. S. Bailey, T. B. Kaiser, B. T. N. Gunney, N. D. Masters, A. E. Koniges, D. C. Eder, and R. W. Anderson. J. Physics, Conf. Ser. 244: 022075, 2010.

Multi-Level Bitmap Indexes for Flash Memory Storage. Kesheng Wu, Kamesh Madduri, and Shane Canon. IDEAS ‘10: Proceedings of the Fourteenth International Database Engineering and Applications Symposium, Montreal, QC, Canada, 2010.

Nearby Supernova Factory Observations of SN 2007if: First Total Mass Measurement of a Super-Chandrasekhar-Mass Progenitor. Scalzo, R. A.; Aldering, G.; Antilogus, P.; Aragon, C.; Bailey, S.; Baltay, C.; Bongard, S.; Buton, C.; Childress, M.; Chotard, N.; Copin, Y.; Fakhouri, H. K.; Gal-Yam, A.; Gangler, E.; Hoyer, S.; Kasliwal, M.; Loken, S.; Nugent, P.; Pain, R.; Pécontal, E.; Pereira, R.; Perlmutter, S.; Rabinowitz, D.; Rau, A.; Rigaudier, G.; Runge, K.; Smadja, G.; Tao, C.; Thomas, R. C.; Weaver, B.; Wu, C. Astrophysical Journal, Volume 713, Issue 2, pp. 1073–1094 (2010).

Nickel-Rich Outflows Produced by the Accretion-Induced Collapse of White Dwarfs: Light Curves and Spectra. Darbha, S.; Metzger, B. D.; Quataert, E.; Kasen, D.; Nugent, P.; Thomas, R. Monthly Notices of the Royal Astronomical Society, Volume 409, Issue 2, pp. 846–854 (2010).

Ocular Fundus for Retinopathy Characterization. D. M. Ushizima and J. Cuadros. Bay Area Vision Meeting, Feb. 5, Berkeley, CA, 2010.

Optimizing and Tuning the Fast Multipole Method for State-of-the-Art Multicore Architectures. A. Chandramowlishwaran, S. Williams, L. Oliker, I. Lashuk, G. Biros, and R. Vuduc. Proceedings of 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), April 2010.

Parallel Algorithms for Moving Lagrangian Data on Block Structured Eulerian Meshes. A. Dubey, K. B. Antypas, C. Daley. Parallel Computing, Vol. 37 (2): 101–113 (2011).

Parallel and Streaming Generation of Ghost Data for Structured Grids. M. Isenberg, P. Lindstrom, and H. Childs. Computer Graphics and Applications, Vol. 30 (3), pp. 50–62, May/June 2010.

Parallel I/O Performance: From Events to Ensembles. Andrew Uselton, Mark Howison, Nicholas J. Wright, David Skinner, Noel Keen, John Shalf, Karen L. Karavanic, and Leonid Oliker. Proceedings of 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), 19–23 April 2010; Atlanta, Georgia; pp. 1–11.

Performance Analysis of Commodity and Enterprise Class Flash Devices. N. Master, M. Andrews, J. Hick, S. Canon, and N. Wright. Petascale Data Storage Workshop (PDSW) 2010, Nov. 2010.

Petascale Parallelization of the Gyrokinetic Toroidal Code. S. Ethier, M. Adams, J. Carter, and L. Oliker. VECPAR: High Performance Computing for Computational Science, June 2010.

Quantitative Visualization of ChIP-chip Data by Using Linked Views. Min-Yu Huang, Gunther H. Weber, Xiao-Yong Li, Mark D. Biggin, and Bernd Hamann. Proceedings IEEE International Conference on Bioinformatics and Biomedicine 2010 (IEEE BIBM 2010) Workshops, Workshop on Integrative Data Analysis in Systems Biology (IDASB), pp. 195–200.

Rapidly Decaying Supernova 2010X: A Candidate “.Ia” Explosion. Kasliwal, Mansi M.; Kulkarni, S. R.; Gal-Yam, Avishay; Yaron, Ofer; Quimby, Robert M.; Ofek, Eran O.; Nugent, Peter; Poznanski, Dovi; Jacobsen, Janet; Sternberg, Assaf; Arcavi, Iair; Howell, D. Andrew; Sullivan, Mark; Rich, Douglas J.; Burke, Paul F.; Brimacombe, Joseph; Milisavljevic, Dan; Fesen, Robert; Bildsten, Lars; Shen, Ken; Cenko, S. Bradley; Bloom, Joshua S.; Hsiao, Eric; Law, Nicholas M.; Gehrels, Neil; Immler, Stefan; Dekany, Richard; Rahmer,
Gustavo; Hale, David; Smith, Roger; Zolkower, Jeff; Velur, Viswa; Walters, Richard; Henning, John; Bui, Kahnh; McKenna, Dan. Astrophysical Journal Letters, Volume 723, Issue 1, pp. L98–L102 (2010).

Recent Advances in VisIt: AMR Streamlines and Query-Driven Visualization. G. H. Weber, S. Ahern, E. W. Bethel, S. Borovikov, H. R. Childs, E. Deines, C. Garth, H. Hagen, B. Hamann, K. I. Joy, D. Martin, J. Meredith, Prabhat, D. Pugmire, O. Rübel, B. Van Straalen, and K. Wu. Numerical Modeling of Space Plasma Flows: Astronum-2009 (Astronomical Society of the Pacific Conference Series), Vol. 429, pp. 329–334, 2010.

Resource Management in the Tessellation Manycore OS. J. A. Colmenares, S. Bird, H. Cook, P. Pearce, D. Zhu, J. Shalf, K. Asanovic, and J. Kubiatowicz. Proceedings of the Second USENIX Workshop on Hot Topics in Parallelism (HotPar’10), Berkeley, California, June 2010.

Retinopathy Diagnosis from Ocular Fundus Image Analysis. D. M. Ushizima and F. Medeiros. Modeling and Analysis of Biomedical Image, SIAM Conference on Imaging Science (IS10), Chicago, IL, April 12–14, 2010.

SDWFS-MT-1: A Self-Obscured Luminous Supernova at z ≃ 0.2. Kozłowski, Szymon; Kochanek, C. S.; Stern, D.; Prieto, J. L.; Stanek, K. Z.; Thompson, T. A.; Assef, R. J.; Drake, A. J.; Szczygieł, D. M.; Woźniak, P. R.; Nugent, P.; Ashby, M. L. N.; Beshore, E.; Brown, M. J. I.; Dey, Arjun; Griffith, R.; Harrison, F.; Jannuzi, B. T.; Larson, S.; Madsen, K.; Pilecki, B.; Pojmański, G.; Skowron, J.; Vestrand, W. T.; Wren, J. A. Astrophysical Journal, Volume 722, Issue 2, pp. 1624–1632 (2010).

Silicon Nanophotonic Network-On-Chip Using TDM Arbitration. Gilbert Hendry, Johnnie Chan, Shoaib Kamil, Lenny Oliker, John Shalf, Luca P. Carloni, and Keren Bergman. IEEE Symposium on High Performance Interconnects (HOTI) 5.1 (Aug. 2010).

SN 2008iy: An Unusual Type IIn Supernova with an Enduring 400-d Rise Time. Miller, A. A.; Silverman, J. M.; Butler, N. R.; Bloom, J. S.; Chornock, R.; Filippenko, A. V.; Ganeshalingam, M.; Klein, C. R.; Li, W.; Nugent, P. E.; Smith, N.; Steele, T. N. Monthly Notices of the Royal Astronomical Society, Volume 404, Issue 1, pp. 305–317 (2010).

Spectra and Hubble Space Telescope Light Curves of Six Type Ia Supernovae at 0.511 < z < 1.12 and the Union2 Compilation. Amanullah, R.; Lidman, C.; Rubin, D.; Aldering, G.; Astier, P.; Barbary, K.; Burns, M. S.; Conley, A.; Dawson, K. S.; Deustua, S. E.; Doi, M.; Fabbro, S.; Faccioli, L.; Fakhouri, H. K.; Folatelli, G.; Fruchter, A. S.; Furusawa, H.; Garavini, G.; Goldhaber, G.; Goobar, A.; Groom, D. E.; Hook, I.; Howell, D. A.; Kashikawa, N.; Kim, A. G.; Knop, R. A.; Kowalski, M.; Linder, E.; Meyers, J.; Morokuma, T.; Nobili, S.; Nordin, J.; Nugent, P. E.; Östman, L.; Pain, R.; Panagia, N.; Perlmutter, S.; Raux, J.; Ruiz-Lapuente, P.; Spadafora, A. L.; Strovink, M.; Suzuki, N.; Wang, L.; Wood-Vasey, W. M.; Yasuda, N.; Supernova Cosmology Project, The. Astrophysical Journal, Volume 716, Issue 1, pp. 712–738 (2010).

Streamline Integration using MPI-Hybrid Parallelism on Large Multi-Core Architecture. David Camp, Christoph Garth, Hank Childs, David Pugmire, and Kenneth Joy. IEEE Transactions on Visualization and Computer Graphics, 07 Dec. 2010.

Supernova PTF 09UJ: A Possible Shock Breakout from a Dense Circumstellar Wind. Ofek, E. O.; Rabinak, I.; Neill, J. D.; Arcavi, I.; Cenko, S. B.; Waxman, E.; Kulkarni, S. R.; Gal-Yam, A.; Nugent, P. E.; Bildsten, L.; Bloom, J. S.; Filippenko, A. V.; Forster, K.; Howell, D. A.; Jacobsen, J.; Kasliwal, M. M.; Law, N.; Martin, C.; Poznanski, D.; Quimby, R. M.; Shen, K. J.; Sullivan, M.; Dekany, R.; Rahmer, G.; Hale, D.; Smith, R.; Zolkower, J.; Velur, V.; Walters, R.; Henning, J.; Bui, K.; McKenna, D. Astrophysical Journal, Volume 724, Issue 2, pp. 1396–1401 (2010).

Transitioning Users from the Franklin XT4 System to the Hopper XE6 System. K. Antypas, Y. He. CUG Proceedings 2011, Fairbanks, Alaska.

Two-Stage Framework for a Topology-Based Projection and Visualization of Classified Document Collections. Patrick Oesterling, Gerik Scheuermann, Sven Teresniak, Gerhard Heyer, Steffen Koch, Thomas Ertl, and Gunther H. Weber. Proceedings IEEE Symposium on Visual Analytics Science and Technology (IEEE VAST), Salt Lake City, Utah, October 2010.

Type II-P Supernovae as Standard Candles: The SDSS-II Sample Revisited. Poznanski, Dovi; Nugent, Peter E.; Filippenko, Alexei V. Astrophysical Journal, Volume 721, Issue 2, pp. 956–959 (2010).

Unveiling the Origin of GRB 090709A: Lack of Periodicity in a Reddened Cosmological Long-Duration Gamma-Ray Burst. Cenko, S. B.; Butler, N. R.; Ofek, E. O.; Perley, D. A.; Morgan, A. N.; Frail, D. A.; Gorosabel, J.; Bloom, J. S.; Castro-Tirado, A. J.; Cepa, J.; Chandra, P.; de Ugarte Postigo, A.; Filippenko, A. V.; Klein, C. R.; Kulkarni, S. R.; Miller, A. A.; Nugent, P. E.; Starr, D. L. Astronomical Journal, Volume 140, Issue 1, pp. 224–234 (2010).

Vessel Network Detection Using Contour Evolution and Color Components. D. M. Ushizima, F. N. S. Medeiros, J. Cuadros, C. I. O. Martins. Proceedings of the 32nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Buenos Aires, Argentina. Sept 2010.

Visualization and Analysis-Oriented Reconstruction of Material Interfaces. J. Meredith and H. Childs. Eurographics/IEEE Symposium on Visualization (EuroVis), Bordeaux, France, June 2010.

What’s Ahead for Fusion Computing? Alice Koniges, Robert Preissl, Stephane Ethier, and John Shalf. International Sherwood Fusion Theory Conference, April 2010, Seattle, Washington.
Appendix A
NERSC Policy Board

The NERSC Policy Board meets at least annually and provides scientific and executive-level advice to the Berkeley Lab Director regarding the overall NERSC program and, specifically, on such issues as resource utilization to maximize the present and future scientific impact of NERSC, and long-range planning for the program, including the research and development necessary for future capabilities. Policy Board members are widely respected leaders in science, computing technology, or the management of scientific research and/or facilities.

Daniel A. Reed, Policy Board Chair
Corporate Vice President, Extreme Computing Group
Microsoft Corporation

Achim Bachem
Chairman of the Board of Directors
Forschungszentrum Jülich GmbH

Jack Dongarra
University Distinguished Professor
Director, Innovative Computing Laboratory
University of Tennessee

Stephane Ethier
Ex officio, NERSC Users Group Chair
Computational Plasma Physics Group
Princeton Plasma Physics Laboratory

Sidney Karin
Professor of Computer Science and Engineering
University of California, San Diego

Pier Oddone
Director
Fermi National Accelerator Laboratory

Edward Seidel
Assistant Director for Mathematics and Physical Sciences
National Science Foundation
Appendix B
NERSC Client Statistics

In support of the DOE Office of Science’s mission, NERSC served 4,294 scientists throughout the United States in 2010. These researchers work in DOE laboratories, universities, industry, and other Federal agencies. Figure 1 shows the proportion of NERSC usage by each type of institution, while Figures 2 and 3 show laboratory, university, and other organizations that used large allocations of computer time. Computational science conducted at NERSC covers the entire range of scientific disciplines, but is focused on research that supports the DOE’s mission and scientific goals, as shown in Figure 4.

On their Allocation Year 2011 proposal forms, principal investigators reported 1,528 refereed publications (published or in press), as well as 262 publications submitted to refereed journals, for the preceding 12 months, based, at least in part, on using NERSC resources. Lists of publications resulting from use of NERSC resources are available at https://www.nersc.gov/news-publications/publications-reports/nersc-user-publications/.

The MPP hours reported here are Cray XT4 equivalent hours.

Figure 1. NERSC usage by institution type, 2010 (MPP hours).
Universities: 58.3%
DOE Labs: 32.9%
Other Government Labs: 7.4%
Industry: 1.4%

Figure 2. DOE and other laboratory usage at NERSC, 2010 (MPP hours).
Institutions shown: Berkeley Lab, NCAR, Oak Ridge, PPPL, PNNL, Argonne, NREL, General Atomics, SLAC, Livermore Lab, Brookhaven Lab, Sandia Lab CA, Fermilab, Los Alamos Lab, Ames Lab, Idaho National Lab, NOAA, Tech-X Corp., Naval Research Lab, Jefferson Lab, NASA GISS, JGI, Army Engineer Center, Others (8).
Values shown: 19,566,332; 12,098,447; 11,218,833; 8,240,839; 6,860,146; 3,421,450; 3,229,656; 3,214,480; 3,014,448; 2,994,457; 2,948,031; 2,762,998; 1,508,455; 1,239,465; 1,059,134; 722,827; 667,089; 408,951; 401,467; 347,325; 266,382; 112,398; 296,867; 32,687,481.
Figure 3. Academic usage at NERSC, 2010 (MPP hours).
U. Washington: 14,915,012
U. Arizona: 14,464,867
UC Santa Cruz: 14,356,525
UC Berkeley: 12,725,264
UCLA: 7,385,969
U. Missouri KC: 5,715,209
Auburn Univ: 4,939,672
MIT: 4,795,298
U. Colorado Boulder: 4,722,811
Stanford Univ: 4,711,737
U. Wisc. Madison: 4,436,563
U. Texas Austin: 4,103,468
Courant Inst: 4,008,502
UC Irvine: 3,815,734
Cal Tech: 3,482,915
Northwestern: 2,922,222
U. Illinois U-C: 2,429,845
U. Alaska: 2,271,199
Harvard Univ: 2,192,261
U. Central Florida: 2,139,024
William & Mary: 2,047,854
UC Santa Barbara: 1,983,743
Indiana St. U.: 1,952,762
UC San Diego: 1,935,143
U. Maryland: 1,931,734
Princeton Univ: 1,824,899
U. New Hampshire: 1,676,698
Florida State: 1,658,498
Iowa State: 1,591,337
Vanderbilt Univ: 1,456,660
U. Florida: 1,362,126
Colorado State: 1,321,559
U. Kentucky: 1,313,287
Yale Univ: 1,173,411
UC Davis: 1,168,201
U. Utah: 1,154,251
North Carolina State: 1,123,068
Louisiana State: 1,083,770
George Wash Univ: 1,059,775
Texas A&M Univ: 1,020,161
Others (67): 16,395,335

Figure 4. NERSC usage by scientific discipline, 2010 (MPP hours).
Disciplines shown: Materials Science, Fusion Energy, Climate Research, Lattice Gauge Theory, Chemistry, Astrophysics, Accelerator Physics, Life Sciences, Combustion, Environmental Sciences, Geosciences, Nuclear Physics, Applied Math, Computer Sciences, Engineering, High Energy Physics, Other.
Shares shown: 18.0%, 13.6%, 16.5%, 6.6%, 12.6%, 12.4%, 5.5%, 2.1%, 4.9%, 1.8%, 1.7%, 1.6%, 1.4%, 0.7%, 0.6%, 0.016%, 0.001%.
Appendix C
NERSC Users Group Executive Committee

Office of Advanced Scientific Computing Research
Mark Adams, Columbia University
Sergei Chumakov, Stanford University
Mike Lijewski, Lawrence Berkeley National Laboratory

Office of Basic Energy Sciences
Eric Bylaska, Pacific Northwest National Laboratory
Paul Kent, Oak Ridge National Laboratory
Brian Moritz, SLAC National Accelerator Laboratory

Office of Biological and Environmental Research
David Beck, University of Washington
Barbara Collignon, Oak Ridge National Laboratory**
Brian Hingerty, Oak Ridge National Laboratory*
Adrianne Middleton, National Center for Atmospheric Research

Office of Fusion Energy Sciences
Stephane Ethier (Chair), Princeton Plasma Physics Laboratory
Linda Sugiyama, Massachusetts Institute of Technology
Jean-Luc Vay, Lawrence Berkeley National Laboratory

Office of High Energy Physics
Julian Borrill, Lawrence Berkeley National Laboratory
Cameron Geddes, Lawrence Berkeley National Laboratory*
Steven Gottlieb, Indiana University**
Frank Tsung (Vice Chair), University of California, Los Angeles

Office of Nuclear Physics
David Bruhwiler, Tech-X Corporation*
Daniel Kasen, Lawrence Berkeley National Laboratory**
Peter Messmer, Tech-X Corporation*
Tomasz Plewa, Florida State University
Martin Savage, University of Washington**

Members at Large
Angus Macnab, DyNuSim, LLC*
Gregory Newman, Lawrence Berkeley National Laboratory**
Jason Nordhaus, Princeton University
Ned Patton, National Center for Atmospheric Research*
Maxim Umansky, Lawrence Livermore National Laboratory

* Outgoing members
** Incoming members
Appendix D
Office of Advanced Scientific Computing Research

The mission of the Advanced Scientific Computing Research (ASCR) program is to discover, develop, and deploy computational and networking capabilities to analyze, model, simulate, and predict complex phenomena important to the Department of Energy (DOE). A particular challenge of this program is fulfilling the science potential of emerging computing systems and other novel computing architectures, which will require numerous significant modifications to today’s tools and techniques to deliver on the promise of exascale science.

To accomplish its mission and address those challenges, the ASCR program is organized into two subprograms: Mathematical, Computational, and Computer Sciences Research; and High Performance Computing and Network Facilities.

• The Mathematical, Computational, and Computer Sciences Research subprogram develops mathematical descriptions, models, methods, and algorithms to describe and understand complex systems, often involving processes that span a wide range of time and/or length scales. The subprogram also develops the software to make effective use of advanced networks and computers, many of which contain thousands of multi-core processors with complicated interconnections, and to transform enormous data sets from experiments and simulations into scientific insight.

• The High Performance Computing and Network Facilities subprogram delivers forefront computational and networking capabilities and contributes to the development of next-generation capabilities through support of prototypes and testbeds.

Berkeley Lab thanks the program managers with direct responsibility for the NERSC program and the research projects described in this report:

Daniel Hitchcock, Acting Associate Director, ASCR
Barbara Helland, Senior Advisor
Walter M. Polansky, Senior Advisor
Julie Scott, Financial Management Specialist
Lori Jernigan, Program Support Specialist
Melea Baker, Administrative Specialist
Robert Lindsay, Special Projects
Facilities Division
Daniel Hitchcock, Division Director
Vincent Dattoria, Computer Scientist
Dave Goodwin, Physical Scientist
Barbara Helland, Computer Scientist
Sally McPherson, Program Assistant
Betsy Riley, Computer Scientist
Yukiko Sekine, Computer Scientist

Computational Science Research and Partnerships (SciDAC) Division
William Harrod, Division Director
Teresa Beachley, Program Assistant
Rich Carlson, Computer Scientist
Benjy Grover, Computer Scientist (Detailee)
Barbara Helland, Computer Scientist
Sandy Landsberg, Mathematician
Randall Laviolette, Physical Scientist
Steven Lee, Physical Scientist
Lenore Mullin, Computer Scientist
Thomas Ndousse-Fetter, Computer Scientist
Lucy Nowell, Computer Scientist
Karen Pao, Mathematician
Sonia Sachs, Computer Scientist
Ceren Susut, Physical Scientist
Angie Thevenot, Program Assistant
Christine Chalk, Physical Scientist
Appendix E
Advanced Scientific Computing Advisory Committee

The Advanced Scientific Computing Advisory Committee (ASCAC) provides valuable, independent advice to the Department of Energy on a variety of complex scientific and technical issues related to its Advanced Scientific Computing Research program. ASCAC’s recommendations include advice on long-range plans, priorities, and strategies to address more effectively the scientific aspects of advanced scientific computing including the relationship of advanced scientific computing to other scientific disciplines, and maintaining appropriate balance among elements of the program. The Committee formally reports to the Director, Office of Science. The Committee primarily includes representatives of universities, national laboratories, and industries involved in advanced computing research. Particular attention is paid to obtaining a diverse membership with a balance among scientific disciplines, institutions, and geographic regions.

Roscoe C. Giles, Chair, Boston University
Marsha Berger, Courant Institute of Mathematical Sciences
Jackie Chen, Sandia National Laboratories
Jack J. Dongarra, University of Tennessee
Susan L. Graham, University of California, Berkeley
James J. Hack, Oak Ridge National Laboratory
Anthony Hey, Microsoft Research
Thomas A. Manteuffel, University of Colorado at Boulder
John Negele, Massachusetts Institute of Technology
Linda R. Petzold, University of California, Santa Barbara
Vivek Sarkar, Rice University
Larry Smarr, University of California, San Diego
William M. Tang, Princeton Plasma Physics Laboratory
Victoria White, Fermi National Accelerator Laboratory
Appendix F
Acronyms and Abbreviations

ACM: Association for Computing Machinery
ACS: American Chemical Society
AIP: American Institute of Physics
ALCF: Argonne Leadership Computing Facility
AMR: Adaptive mesh refinement
API: Application programming interface
APS: American Physical Society
ARPA-E: Advanced Research Projects Agency-Energy (DOE)
ARRA: American Recovery and Reinvestment Act
ASCR: Office of Advanced Scientific Computing Research (DOE)
BER: Office of Biological and Environmental Research (DOE)
BES: Office of Basic Energy Sciences (DOE)
BNL: Brookhaven National Laboratory
C₂H₅OH: Ethanol
CAF: Co-Array Fortran
CAS: Chinese Academy of Sciences
CATH: Class, Architecture, Topology and Homology
CCNI: Computational Center for Nanotechnology Innovations
CCSE: Center for Computational Sciences and Engineering (LBNL)
CDU: Cooling distribution unit
CH₃OH: Methanol
CH₄: Methane
CNRS: Centre National de la Recherche Scientifique, France
CNT: Carbon nanotube
CO₂: Carbon dioxide
COE: Center of Excellence
CPU: Central processing unit
CRD: Computational Research Division, Lawrence Berkeley National Laboratory
CSCS: Swiss National Supercomputing Centre
CSIR: Council of Scientific and Industrial Research, India
CUDA: Compute Unified Device Architecture
CXI: Coherent X-ray imaging
CXIDB: Coherent X-Ray Imaging Data Bank
DAE: Department of Atomic Energy, India
DFT: Density functional theory
DNA: Deoxyribonucleic acid
DOE: U.S. Department of Energy
DOS: Density of states
DST: Department of Science and Technology, India
DVS: Data Virtualization Services
EFRC: Energy Frontier Research Center
ELM: Edge localized mode
EPSRC: Engineering and Physical Sciences Research Council, UK
ESG: Earth System Grid
ESnet: Energy Sciences Network
FAPESP–CNPq: Fundação de Amparo à Pesquisa do Estado de São Paulo–Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brazil
FES: Office of Fusion Energy Sciences (DOE)
FOM: Fundamenteel Onderzoek der Materie, Netherlands
GA: Grant Agency of the Czech Republic
GB: Gigabyte
Gflop/s: Gigaflop/s
GHz: Gigahertz
GNR: Graphene nanoribbon
GPFS: General Parallel File System
GPGPU: General-purpose graphics processing unit
GPU: Graphics processing unit
H₂: Hydrogen
HCO: Hydrogen carbonate (transition state)
HEP: Office of High Energy Physics (DOE)
HMA: Highly mismatched alloy
HPC: High performance computing
HPSS: High Performance Storage System
HTML: Hypertext Markup Language
HTTP: Hypertext Transfer Protocol
I/O: Input/output
ICCS: International Center for Computational Science
ICIAM: International Council for Industrial and Applied Mathematics
IEEE: Institute of Electrical and Electronics Engineers
IN2P3: Institut National de Physique Nucléaire et Physique des Particules, France
INCITE: Innovative and Novel Computational Impact on Theory and Experiment (DOE)
IPCC: Intergovernmental Panel on Climate Change
IT: Information technology
ITER: Latin for “the way”; an international fusion energy experiment in southern France
JGI: Joint Genome Institute (DOE)
KAUST: King Abdullah University of Science and Technology
KISTI: Korea Institute of Science and Technology Information
KRF: Korea Research Foundation
LBNL: Lawrence Berkeley National Laboratory
LED: Light-emitting diode
LLNL: Lawrence Livermore National Laboratory
LRZ: Leibniz Supercomputing Centre
MB: Megabyte
MESRF: Ministry of Education and Science of the Russian Federation
MHD: Magnetohydrodynamic
MIT: Massachusetts Institute of Technology
Mo₆S₈: Molybdenum sulfide nanocluster
MoE: Ministry of Education of the People’s Republic of China
MOM: Machine-oriented miniserver
MoS₂: Molybdenum disulfide
MoST: Ministry of Science and Technology of the People’s Republic of China
MPI: Message Passing Interface
MPP: Massively parallel processing
MSES: Ministry of Science, Education and Sports of the Republic of Croatia
MSMT: Ministerstvo Školství, Mládeže a Tělovýchovy, Czech Republic
NASA: National Aeronautics and Space Administration
NCAR: National Center for Atmospheric Research
NCSA: National Center for Supercomputing Applications
NERSC: National Energy Research Scientific Computing Center
NEWT: NERSC Web Toolkit
NGF: NERSC Global Filesystem
NICS: National Institute for Computational Science, University of Tennessee
NIH: National Institutes of Health
NISE: NERSC Initiative for Scientific Exploration
NMR: Nuclear magnetic resonance
NNSFC: National Natural Science Foundation of China
NO₂: Nitrogen dioxide
NWO: Nederlandse Organisatie voor Wetenschappelijk Onderzoek, Netherlands
NP: Office of Nuclear Physics (DOE)
NSF: National Science Foundation
NUG: NERSC Users Group
O₂: Oxygen
O₃: Ozone
OSC: Ohio Supercomputer Center
OSG: Open Science Grid
OSU: Ohio State University
PB: Petabyte
PDB: Protein Data Bank
PDSF: Parallel Distributed Systems Facility (NERSC)
PECASE: Presidential Early Career Award for Scientists and Engineers
Petaflops: Quadrillions of floating point operations per second
Pflops: Petaflops
PGI: Portland Group, Inc.
PMAMR: Porous Media AMR (computer code)
PMSHE: Polish Ministry of Science and Higher Education
PNAS: Proceedings of the National Academy of Sciences
PPPL: Princeton Plasma Physics Laboratory
PRACE: Partnership for Advanced Computing in Europe
PTF: Palomar Transient Factory
QMC: Quantum Monte Carlo
RCF: RHIC Computing Facility
REST: Representational state transfer
RHIC: Relativistic Heavy Ion Collider
RMST: Russian Ministry of Science and Technology
ROSATOM: State Atomic Energy Corporation, Russia
SC: Office of Science (DOE)
SciDAC: Scientific Discovery through Advanced Computing (DOE)
SCOP: Structural Classification of Proteins
SDN: Science Data Network (ESnet)
SIAM: Society for Industrial and Applied Mathematics
SLIRP: Structural Library of Intrinsic Residue Propensities
SMP: Symmetric multiprocessor
STFC: Science and Technology Facilities Council, UK
SUNY: State University of New York
TACC: Texas Advanced Computing Center
TB: Terabyte
TCP: Transmission Control Protocol
Teraflops: Trillions of floating point operations per second
TIGRESS: Terascale Infrastructure for Groundbreaking Research in Engineering and Science High Performance Computing Center, Princeton University
TPC: Time projection chamber
UC: University of California
UCB: University of California, Berkeley
UPC: Unified Parallel C (programming language)
URL: Uniform Resource Locator
UV: Ultraviolet
For more information about NERSC, contact:

Jon Bashor
NERSC Communications
Berkeley Lab, MS 50B4230
1 Cyclotron Road
Berkeley, CA 94720-8148
email email@example.com
phone (510) 486-5849
fax (510) 486-4300

NERSC’s web site
http://www.nersc.gov/

DISCLAIMER
This document was prepared as an account of work sponsored by the United States Government. While this document is believed to contain correct information, neither the United States Government nor any agency thereof, nor The Regents of the University of California, nor any of their employees, makes any warranty, express or implied, or assumes any legal responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by its trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof, or The Regents of the University of California. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof, or The Regents of the University of California.

Ernest Orlando Lawrence Berkeley National Laboratory is an equal opportunity employer.

NERSC Annual Report Editor
John Hules

Contributing writers
John Hules, Jon Bashor, Linda Vu, and Margie Wylie, Berkeley Lab Computing Sciences Communications Group; Paul Preuss and Lynn Yarris, Berkeley Lab Public Affairs Department; David L. Chandler, MIT News Office; Pam Frost Gorder, Ohio State University; Kitta MacPherson, Princeton University Communications Office; Rachel Shafer, UC Berkeley Engineering Marketing and Communications Office; Karen McNulty Walsh, Brookhaven National Laboratory.

Design, layout, illustrations, photography, and printing coordination provided by the Creative Services Office of Berkeley Lab’s Public Affairs Department.

CSO 20748
Matthew Andrews ∙ Katie Antypas ∙ Brian Austin ∙ Clayton Bagwell ∙ William Baird ∙ Nicholas Balthaser ∙ Jon Bashor ∙ Elizabeth Bautista ∙ Del Black ∙ Jeremy Brand ∙ Jeffrey Broughton ∙ Gregory Butler ∙ Tina Butler ∙ David Camp ∙ Scott Campbell ∙ Shane Canon ∙ Nicholas Cardo ∙ Jonathan Carter ∙ Steve Chan ∙ Ravi Cheema ∙ Hank Childs ∙ Shreyas Cholia ∙ Roxanne Clark ∙ James Craw ∙ Thomas Davis ∙ Tina Declerck ∙ David Donofrio ∙ Brent Draney ∙ Matt Dunford ∙ Norma Early ∙ Kirsten Fagnan ∙ Aaron Garrett ∙ Richard Gerber ∙ Annette Greiner ∙ Jeff Grounds ∙ Leah Gutierrez ∙ Patrick Hajek ∙ Matthew Harvey ∙ Damian Hazen ∙ Helen He ∙ Mark Heer ∙ Jason Hick ∙ Eric Hjort ∙ John Hules ∙ Wayne Hurlbert ∙ John Hutchings ∙ Christos Kavouklis ∙ Randy Kersnick ∙ Jihan Kim ∙ Alice Koniges ∙ Yu Lok Lam ∙ Thomas Langley ∙ Craig Lant ∙ Stefan Lasiewski ∙ Jason Lee ∙ Rei Chi Lee ∙ Bobby Liu ∙ Fred Loebl ∙ Burlen Loring ∙ Steven Lowe ∙ Filipe Maia ∙ Ilya Malinov ∙ Jim Mellander ∙ Joerg Meyer ∙ Praveen Narayanan ∙ Robert Neylan ∙ Trever Nightingale ∙ Peter Nugent ∙ Isaac Ovadia ∙ R. K. Owen ∙ Viraj Paropkari ∙ David Paul ∙ Lawrence Pezzaglia ∙ Jeff Porter ∙ Prabhat ∙ Robert Preissl ∙ Tony Quan ∙ Lavanya Ramakrishnan ∙ Lynn Rippe ∙ Cody Rotermund ∙ Oliver Ruebel ∙ Iwona Sakrejda ∙ John Shalf ∙ Hongzhang Shan ∙ Hemant Shukla ∙ David Skinner ∙ Jay Srinivasan ∙ Suzanne Stevenson ∙ David Stewart ∙ Michael Stewart ∙ David Turner ∙ Alex Ubungen ∙ Andrew Uselton ∙ Daniela Ushizima ∙ Francesca Verdier ∙ Linda Vu ∙ Howard Walter ∙ Harvey Wasserman ∙ Gunther Weber ∙ Mike Welcome ∙ Nancy Wenning ∙ Cary Whitney ∙ Nicholas Wright ∙ Margie Wylie ∙ Woo-Sun Yang ∙ Yushu Yao ∙ Katherine Yelick ∙ Rebecca Yuan ∙ Brian Yumae ∙ Zhengji Zhao