Dell Power Solutions

HIGH-PERFORMANCE COMPUTING

Platform Load Sharing Facility resource manager

Platform LSF is a popular resource manager for clusters. Its focus is to maximize resource utilization within the constraints of local administration policies. Platform Computing offers two products: Platform LSF and Platform LSF HPC. LSF is designed to handle a broad range of job types such as batch, parallel, distributed, and interactive. LSF HPC is optimized for HPC parallel applications by providing additional facilities for intelligent scheduling, which enables different QoS in different queues. LSF also implements a hierarchical fair-share scheduling algorithm to balance resources among users under all load conditions.

Platform LSF has built-in schedulers that implement advanced scheduling algorithms to provide easy configurability and high reliability for users. In addition to basic scheduling algorithms, Platform LSF uses advanced techniques like advance reservation and backfill.

Platform LSF and Platform LSF HPC both have a dynamic scheduling decision mechanism. The scheduling decisions under this mechanism are based on processing load. Based on these decisions, jobs can be migrated among compute nodes or rescheduled. Loads can also be balanced among compute nodes in heterogeneous environments. These features make Platform LSF suitable for a broad range of HPC applications. In addition, Platform LSF can dynamically migrate jobs among compute nodes. Platform LSF can also have multiple scheduling algorithms applied to different queues simultaneously. Platform LSF HPC can make intelligent scheduling decisions based on the features of advanced interconnect networks, thus enhancing process mapping for parallel applications.

The term resource has a broad definition in Platform LSF and Platform LSF HPC. Resources can be CPUs, memory, storage space, or software licenses. (In some sectors, software licenses are expensive and are considered a valuable resource.)

Platform LSF and Platform LSF HPC each have an extensive advance reservation system that can reserve different kinds of resources. In some distributed applications, many instances of the same application are required to perform parametric studies. Platform LSF and Platform LSF HPC allow users to submit a job group that can contain a large number of jobs, making parametric studies much easier to manage.

Platform LSF and Platform LSF HPC can interface with external schedulers such as Maui. External schedulers can complement features of the resource manager and enable sophisticated scheduling. For example, using Platform LSF HPC, the hierarchical fair-share algorithm can dynamically adjust priorities and feed these priorities to Maui for use in scheduling decisions.

Maui job scheduler

Maui is an advanced open source job scheduler that is specifically designed to optimize system utilization in policy-driven, heterogeneous HPC environments. Its focus is on fast turnaround of large parallel jobs, making the Maui scheduler highly suitable for HPC. Maui can work with several common resource managers including Platform LSF and Platform LSF HPC, and potentially improve scheduling performance compared to built-in schedulers.

Maui has a two-phase scheduling algorithm. During the first phase, the high-priority jobs are scheduled using advance reservation. In the second phase, a backfill algorithm is used to schedule low-priority jobs between previously scheduled jobs. Maui uses the fair-share technique when making scheduling decisions based on job history. Note: Maui’s internal behavior is based on a single, unified queue.
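The two-phase idea described above (reservations for high-priority jobs first, then backfill of low-priority jobs into the remaining gaps) can be illustrated with a small sketch. This is not Maui's actual implementation: the names `Job`, `Reservation`, `earliest_start`, `two_phase_schedule`, and `high_priority_threshold` are hypothetical, the cluster is modeled as a simple count of identical nodes, run-time estimates are assumed to be exact, and fair-share and preemption are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    nodes: int     # number of nodes the job needs
    runtime: int   # estimated run time, in arbitrary time units
    priority: int  # larger value means more important


@dataclass(frozen=True)
class Reservation:
    job: Job
    start: int

    @property
    def end(self):
        return self.start + self.job.runtime


def nodes_in_use(reservations, t):
    """Nodes occupied at time t by the reservations made so far."""
    return sum(r.job.nodes for r in reservations if r.start <= t < r.end)


def earliest_start(job, reservations, total_nodes):
    """Earliest time the job fits without disturbing existing reservations."""
    if job.nodes > total_nodes:
        raise ValueError(f"{job.name} needs more nodes than the cluster has")
    # Free capacity only grows when a reservation ends, so time 0 and the
    # reservation end times are the only start times worth trying.
    for t in sorted({0} | {r.end for r in reservations}):
        window_end = t + job.runtime
        # Usage only rises when a reservation starts, so checking t and
        # every reservation start inside the window covers the peak.
        checkpoints = [t] + [r.start for r in reservations
                             if t < r.start < window_end]
        if all(nodes_in_use(reservations, c) + job.nodes <= total_nodes
               for c in checkpoints):
            return t
    raise RuntimeError("unreachable: the cluster is idle after the last end")


def two_phase_schedule(jobs, total_nodes, high_priority_threshold):
    """Return {job name: start time} for a simplified two-phase schedule."""
    # Phase 1: high-priority jobs receive reservations in priority order.
    high = sorted((j for j in jobs if j.priority >= high_priority_threshold),
                  key=lambda j: -j.priority)
    # Phase 2: low-priority jobs are backfilled, in submission order, into
    # whatever gaps the phase 1 reservations leave open.
    low = [j for j in jobs if j.priority < high_priority_threshold]
    reservations = []
    for job in high + low:
        start = earliest_start(job, reservations, total_nodes)
        reservations.append(Reservation(job, start))
    return {r.job.name: r.start for r in reservations}


if __name__ == "__main__":
    demo_jobs = [
        Job("A", nodes=3, runtime=4, priority=10),
        Job("B", nodes=4, runtime=2, priority=9),
        Job("C", nodes=1, runtime=3, priority=1),
        Job("D", nodes=2, runtime=2, priority=1),
    ]
    print(two_phase_schedule(demo_jobs, total_nodes=4,
                             high_priority_threshold=5))
```

In the demo, job C needs only one node, so it is backfilled at time 0 next to job A on the otherwise idle fourth node, while job D has to wait until both high-priority reservations have finished. Because a backfilled job is placed only where it fits around existing reservations, it never delays a high-priority job in this model.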
Scheduling from one unified queue maximizes the opportunity to utilize resources.

Typically, users are guaranteed certain QoS, but Maui gives a significant amount of control to administrators, allowing local policies to control access to resources, especially for scheduling. For example, administrators can enable different QoS and access levels to users and jobs, which can be preemptively identified. Maui uses a tool called QBank for allocation management. QBank allows multisite control over the use of resources. Another Maui feature allows charge rates (the amount users pay for compute resources) to be based on QoS, resources, and time of day. Maui is scalable to thousands of jobs, despite its nondistributed scheduler daemon, which is centralized and runs on a single node.

Maui supports job preemption, which can occur under several conditions. High-priority jobs can preempt lower-priority or backfill jobs if resources to run the high-priority jobs are not available. In some cases, resources reserved for high-priority jobs can be used to run low-priority jobs when no high-priority jobs are in the queue. However, when high-priority jobs are submitted, these low-priority jobs can be preempted to reclaim resources for high-priority jobs.

Maui has a simulation mode that can be used to evaluate the effect of queuing parameters on the scheduler performance. Because each HPC environment has a unique job profile, the parameters of the queues and scheduler can be tuned based on historical logs to maximize scheduler performance.

Satisfying ever-increasing computing demands

As cluster sizes scale to satisfy growing computing needs in various industries as well as in academia, advanced schedulers can help maximize resource utilization and QoS. The profile of jobs, the nature of computation performed by the jobs, and the number of jobs submitted can help determine the benefits of using advanced schedulers.

Saeed Iqbal, Ph.D., is a systems engineer and advisor in the Scalable Systems Group at Dell. He has a Ph.D. in Computer Engineering from The University of Texas at Austin.

Rinku Gupta is a systems engineer and advisor in the Scalable Systems Group at Dell. She has a B.E. in Computer Engineering from Mumbai University in India and an M.S. in Computer Information Science from The Ohio State University.

Yung-Chin Fang is a senior consultant in the Scalable Systems Group at Dell. He specializes in cyberinfrastructure management and high-performance computing.

Reprinted from Dell Power Solutions, February 2005. Copyright © 2005 Dell Inc. All rights reserved.
Understanding the Scalability of NWChem in HPC Environments

Dell PowerEdge servers can provide a suitable platform for deployment of applications that have been designed for efficient scaling on parallel systems. NWChem, a compute-intensive computational chemistry package, is one such application that can benefit from the performance and computational power provided by high-performance computing (HPC) clusters. This article introduces NWChem and explains the advantages of running NWChem on HPC clusters versus a single node.

BY MUNIRA HUSSAIN; RAMESH RADHAKRISHNAN, PH.D.; AND KALYANA CHADALAVADA

Clusters of standards-based computer systems have become a popular choice for building cost-effective, high-performance parallel computing platforms. High-performance computing (HPC) clusters typically consist of a set of symmetric multiprocessing (SMP) systems connected with a high-speed network interconnect into a single computational unit. The rapid advancement of microprocessor technologies and high-speed interconnects has facilitated many successful deployments of HPC clusters. HPC technology is now employed in many domains, including scientific computing applications such as weather modeling and fluid dynamics as well as commercial applications such as financial modeling and 3-D imaging.

HPC is traditionally associated with RISC-based systems. However, expensive, proprietary RISC-based systems can be difficult to afford for small-scale research establishments and academic institutions on tight budgets. Building an HPC cluster with standards-based Intel® processors such as 32-bit Intel Xeon processors or 64-bit Intel Itanium® processors and standards-based, off-the-shelf components can offer several advantages, including optimal performance, minimal costs, and freedom to mix and match technologies for an excellent price/performance ratio. In addition, HPC clusters built from industry-standard components are tolerant to component failures because they have no single point of failure, thus enhancing system availability and reliability.

NWChem,¹ an open source application developed and supported by the Pacific Northwest National Laboratory, is designed to use the resources of high-performance parallel supercomputers as well as HPC clusters built from standards-based systems. The application is built on the concepts of object-oriented programming and non-uniform memory access (NUMA). This architecture allows considerable flexibility in the manipulation and distribution of data on shared memory, distributed memory, and massively parallel hardware architectures, and hence maps well to the architecture of HPC cluster systems.

¹ For more information about NWChem, visit www.emsl.pnl.gov/docs/nwchem/nwchem.html.