Dell Power Solutions

HIGH-PERFORMANCE COMPUTING

Feature: Comments

Broad scope: The nature of jobs submitted to a cluster can vary, so the scheduler must support batch, parallel, sequential, distributed, interactive, and noninteractive jobs with similar efficiency.

Support for algorithms: The scheduler should support numerous job-processing algorithms, including FCFS, FIFO, SJF, LJF, advance reservation, and backfill. In addition, the scheduler should be able to switch between algorithms and apply different algorithms at different times, or apply different algorithms to different queues, or both.

Capability to integrate with standard resource managers: The scheduler should be able to interface with the resource manager in use, including common resource managers such as Platform LSF, Sun Grid Engine, and OpenPBS (the original, open source version of Portable Batch System).

Sensitivity to compute node and interconnect architecture: The scheduler should match the appropriate compute node architecture to the job profile, for example, by using compute nodes that have more than one processor to provide optimal performance for applications that can use the second processor effectively.

Scalability: The scheduler should be capable of scaling to thousands of nodes and processing thousands of jobs simultaneously.

Fair-share capability: The scheduler should distribute resources fairly under heavy conditions and at different times.

Efficiency: The overhead associated with scheduling should be minimal and within acceptable limits. Advanced scheduling algorithms can take time to run. To be efficient, the scheduling algorithm itself must spend less time running than the expected saving in application execution time from improved scheduling.

Dynamic capability: The scheduler should be able to add or remove compute resources to a job on the fly, assuming that the job can adjust and utilize the extra compute capacity.

Support for preemption: Preemption can occur at various levels; for example, jobs may be suspended while running. Checkpointing (that is, the capability to stop a running job, save the intermediate results, and restart the job later) can help ensure that results are not lost for very long jobs.

Figure 2. Features of job schedulers

Figure 3. Job scheduling algorithms. Panels: (a) Jobs waiting in the queue; (b) The queue after each priority group is sorted according to execution time (sorted high-priority jobs, sorted low-priority jobs); (c) Longest job first schedule; (d) Longest job first and backfill schedule; (e) Shortest job first schedule; (f) Shortest job first and backfill schedule. Panels c through f plot compute nodes against time.

... the queue in a cyclical, round-robin manner. SJF periodically sorts the incoming jobs and executes the shortest job first, allowing short jobs to get a good turnaround time. However, this strategy may cause delays for the execution of long (large) jobs. In contrast, LJF commits resources to longest jobs first. The LJF approach tends to maximize system utilization at the cost of turnaround time.

Basic scheduling algorithms such as these can be enhanced by combining them with the use of advance reservation and backfill techniques. Advance reservation uses execution time predictions provided by the users to reserve resources (such as CPUs and memory) and to generate a schedule. The backfill technique improves space-sharing scheduling. Given a schedule with advance-reserved, high-priority jobs and a list of low-priority jobs, a backfill algorithm tries to fit the small jobs into scheduling gaps. This allocation does not alter the sequence of jobs previously scheduled, but improves system utilization by running low-priority jobs in between high-priority jobs. To use backfill, the scheduler requires a runtime estimate of the small jobs, which is supplied by the user when jobs are submitted.

Figure 3 illustrates the use of the basic algorithms and the enhancements discussed in this article.
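The SJF, LJF, and backfill strategies described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm of any particular scheduler: the `Job` class, the `schedule` function, the integer time units, and the four example jobs are all hypothetical, and priority groups are ignored for brevity. Without backfill, jobs start strictly in sorted order; with backfill, a job may start in any earlier gap where it fits for its whole estimated runtime, so jobs placed before it are never delayed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    procs: int    # processors requested
    runtime: int  # user-supplied runtime estimate, in arbitrary time units


def schedule(jobs, total_procs=8, policy="SJF", backfill=False):
    """Return ({job name: start time}, makespan) for a space-shared cluster.

    policy   -- "SJF" (shortest job first) or "LJF" (longest job first)
    backfill -- if True, a job may start in any earlier gap where it fits
                without moving jobs that were placed before it
    """
    if policy not in ("SJF", "LJF"):
        raise ValueError("policy must be 'SJF' or 'LJF'")
    ordered = sorted(jobs, key=lambda j: j.runtime, reverse=(policy == "LJF"))

    # Worst case: every job runs alone, one after another.
    horizon = sum(j.runtime for j in jobs)
    free = [total_procs] * horizon  # free processors in each time unit
    starts = {}
    earliest = 0  # without backfill, start times never move backwards

    for job in ordered:
        if job.procs > total_procs:
            raise ValueError(f"{job.name} needs more processors than exist")
        t = 0 if backfill else earliest
        # Slide forward until the job fits for its entire estimated runtime.
        while any(free[u] < job.procs for u in range(t, t + job.runtime)):
            t += 1
        for u in range(t, t + job.runtime):
            free[u] -= job.procs
        starts[job.name] = t
        earliest = t

    makespan = max(starts[j.name] + j.runtime for j in jobs)
    return starts, makespan


# Hypothetical workload on an eight-processor cluster.
EXAMPLE_JOBS = [
    Job("A", procs=6, runtime=5),
    Job("B", procs=8, runtime=4),
    Job("C", procs=2, runtime=3),
    Job("D", procs=2, runtime=2),
]

if __name__ == "__main__":
    for policy in ("LJF", "SJF"):
        for use_backfill in (False, True):
            starts, makespan = schedule(
                EXAMPLE_JOBS, policy=policy, backfill=use_backfill
            )
            print(policy, "backfill" if use_backfill else "plain", starts, makespan)
```

For this made-up workload, LJF without backfill finishes at time 12, while LJF with backfill finishes at time 9 because jobs C and D run in the two processors left idle beside job A. The SJF schedule is unchanged by backfill here, which is consistent with the point that backfill can, but need not, improve a given schedule.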
Figure 3a shows a queue with 11 jobs waiting; the queue has both high-priority and low-priority jobs. Figure 3b shows these jobs sorted according to their estimated execution time.

The example in Figure 3 assumes an eight-processor cluster and considers only two parameters: the number of processors and the estimated execution time. This figure shows the effects of generating schedules using the LJF and SJF algorithms with and without backfill techniques. Sections c through f of Figure 3 indicate that backfill can improve schedules generated by LJF and SJF, either by increasing utilization, decreasing response time, or both. To generate the schedules shown, the low- and high-priority jobs are sorted separately.

Examining a commercial resource manager and an external job scheduler

This section introduces scheduling features of a commercial resource manager, Load Sharing Facility (LSF) from Platform Computing, and an open source job scheduler, Maui.

www.dell.com/powersolutions Reprinted from Dell Power Solutions, February 2005. Copyright © 2005 Dell Inc. All rights reserved.
