14.10.2013 Views

Scheduling Algorithms for Dynamic Workload


Dalibor Klusáček (MU)
Hana Rudová (MU)
Ranieri Baraglia (CNR - ISTI)
Gabriele Capannini (CNR - ISTI)
Marco Pasquali (CNR - ISTI)


Outline

• Motivation & Problem description
• Applied techniques
    - Queue-based (Backfilling)
    - Schedule-based (Dispatching rules & Local Search)
• Simulation toolkit
• Preliminary results
• Future work

European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies


Job Scheduling on Computational Grids

• The general job scheduling problem includes:
    - Selection of a processing resource for every job
    - Selection of a job processing order/time for every resource
• Driven by different constraints:
    - Job QoS requirements (e.g. deadlines and software licenses)
    - Data/time dependencies between jobs
    - Processing limitations of resources (e.g. software licenses), etc.
• Objectives:
    - Optimize system throughput by maximizing overall resource utilization
    - Guarantee the level of performance that applications require



Job Scheduling on Computational Grids

• In the past, a lot of research effort was devoted to understanding and developing job scheduling algorithms (e.g. FCFS, Backfilling, Gang scheduling, etc.)
• Nowadays many of these algorithms are used in commercial and open-source job schedulers
• However, none of these schedulers handles the full range of constraints and requirements (e.g. a job's deadline) presented by users



Examples: Commercial and Open-source Cluster Management Software

• Maui scheduler: FCFS, backfilling, EASY backfilling
• LoadLeveler: FCFS, backfilling, gang scheduling, external schedulers
• Load Sharing Facility: FCFS, fair-share, preemptive, backfilling, Service Level Agreements
• Portable Batch System: FCFS, Shortest Job First, user/group priorities, fair-share
• Sun Grid Engine: FCFS, job priorities and fair-share, migration support. Future version: backfilling



Current Problem

• Machines
    - Parallel machines with different numbers of CPUs
    - Different machines with different CPU speeds
• Jobs
    - Dynamically arriving jobs
    - With/without deadline
    - Each job requires >= 1 CPU
    - Known job execution time
• Objective function
    - Maximize the number of jobs that meet their deadline
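The problem statement above can be captured in a small data model. The following Python sketch is illustrative only: the names `Machine`, `Job`, `completion_time` and `jobs_meeting_deadline` are hypothetical and are not taken from the authors' simulator. It assumes that the known execution time is given for a reference CPU speed of 1.0 and scales inversely with machine speed, and that jobs without a deadline are simply not counted by the objective.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Machine:
    name: str
    cpus: int      # number of CPUs on this parallel machine
    speed: float   # relative CPU speed (assumption: 1.0 is the reference)


@dataclass(frozen=True)
class Job:
    name: str
    arrival: float                    # jobs arrive dynamically over time
    cpus: int                         # every job requires >= 1 CPU
    exec_time: float                  # known execution time at speed 1.0
    deadline: Optional[float] = None  # None means "no deadline"


def completion_time(job: Job, machine: Machine, start: float) -> float:
    """Finish time of `job` when started at `start` on `machine`."""
    if job.cpus > machine.cpus:
        raise ValueError(f"{job.name} does not fit on {machine.name}")
    if start < job.arrival:
        raise ValueError(f"{job.name} cannot start before it arrives")
    return start + job.exec_time / machine.speed


def jobs_meeting_deadline(placements: List[Tuple[Job, Machine, float]]) -> int:
    """Objective value: how many deadline jobs finish on time."""
    return sum(
        1
        for job, machine, start in placements
        if job.deadline is not None
        and completion_time(job, machine, start) <= job.deadline
    )
```

Usage: build a list of `(job, machine, start_time)` triples from any scheduler and pass it to `jobs_meeting_deadline` to compare schedules under the objective stated on the slide.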



Queue-based Algorithms

• Queues
• Basic methods
    - Trivial queue-based techniques used as a comparison with advanced techniques
    - FCFS: First Come First Serve
    - EDF: Earliest Deadline First
• Backfilling
    - Easy Backfilling
    - Flexible Backfilling (future work)
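The two basic methods differ only in how the queue is ordered. The sketch below shows that difference in Python; the `Job` tuple and both function names are hypothetical. Placing jobs without a deadline at the end of the EDF order and breaking ties by arrival time are assumptions made here, since the slides do not say how such jobs are handled.

```python
from typing import List, NamedTuple, Optional


class Job(NamedTuple):
    name: str
    arrival: float
    deadline: Optional[float]  # None means the job has no deadline


def fcfs_order(queue: List[Job]) -> List[Job]:
    """First Come First Serve: order jobs by arrival time."""
    return sorted(queue, key=lambda job: job.arrival)


def edf_order(queue: List[Job]) -> List[Job]:
    """Earliest Deadline First: order jobs by deadline.

    Assumption: jobs without a deadline are placed last, and ties
    are broken by arrival time.
    """
    return sorted(
        queue,
        key=lambda job: (
            job.deadline is None,
            job.deadline if job.deadline is not None else 0.0,
            job.arrival,
        ),
    )
```

In both cases the scheduler then starts the job at the front of the ordered queue as soon as enough CPUs are free.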



Backfilling

• An optimization of the FCFS algorithm
• Tries to balance the goals of high utilization and maintaining FCFS order
• Requires that each job also specifies its maximum execution time. While the job at the head of the queue is waiting, other, smaller jobs can be scheduled, especially if they would not delay the start of the job at the head of the queue.
• Several variants of the backfilling algorithm have been proposed.
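One scheduling decision of the EASY variant, in which only the job at the head of the queue receives a reservation, can be sketched as follows. This is a simplified illustration rather than the implementation evaluated in the slides: it assumes a single machine with `total_cpus` identical CPUs, and the names `Job` and `easy_backfill` are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Job(NamedTuple):
    name: str
    cpus: int
    max_runtime: float  # user-supplied maximum execution time


def easy_backfill(now: float, total_cpus: int,
                  running: List[Tuple[float, int]],
                  queue: List[Job]) -> List[str]:
    """Return the names of queued jobs to start at time `now`.

    running: (expected_end_time, cpus) for every running job.
    queue:   waiting jobs in FCFS order.
    """
    running = list(running)
    queue = list(queue)
    free = total_cpus - sum(cpus for _, cpus in running)
    started = []

    # Phase 1: plain FCFS, start jobs from the head while they fit.
    while queue and queue[0].cpus <= free:
        job = queue.pop(0)
        free -= job.cpus
        running.append((now + job.max_runtime, job.cpus))
        started.append(job.name)
    if not queue:
        return started

    # Phase 2: reservation for the blocked head job. The "shadow time"
    # is the earliest moment at which enough CPUs will be free for it.
    head = queue[0]
    avail, shadow = free, None
    for end, cpus in sorted(running):
        avail += cpus
        if avail >= head.cpus:
            shadow = end
            break
    if shadow is None:
        raise ValueError(f"{head.name} needs more CPUs than the machine has")
    extra = avail - head.cpus  # CPUs the head will not need at shadow time

    # Phase 3: backfill jobs that cannot delay the head job.
    for job in queue[1:]:
        if job.cpus > free:
            continue
        if now + job.max_runtime <= shadow:
            pass                   # finishes before the reservation starts
        elif job.cpus <= extra:
            extra -= job.cpus      # runs past it, but only on spare CPUs
        else:
            continue
        free -= job.cpus
        started.append(job.name)
    return started
```

A job is backfilled only if it either finishes before the head job's reserved start or uses CPUs the head job will not need, which is why the maximum execution time of every job must be known.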



Algorithms with Global Schedule

• Global schedule
    - Both resource and time assignment considered
• Dispatching rules
    - Used for initial schedule generation
    - MTEDF: Minimum Tardiness Earliest Deadline First
• Local Search
    - Local changes in the schedule by movement of jobs
    - Tabu Search: recent moves are prohibited to avoid cycling
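The interplay of a dispatching rule and Tabu Search can be illustrated with a deliberately simplified Python sketch. It is not the authors' algorithm: it assumes a static problem (all jobs available at time 0), identical single-CPU machines, and a plain earliest-deadline dispatching rule standing in for MTEDF, whose exact definition is not given on the slide. All function names and the `iterations` and `tenure` parameters are hypothetical.

```python
from collections import deque
from typing import List, Tuple

Job = Tuple[str, float, float]   # (name, run_time, deadline)
Schedule = List[List[Job]]       # one ordered job list per machine


def late_jobs(schedule: Schedule) -> int:
    """Number of jobs that finish after their deadline."""
    late = 0
    for machine_jobs in schedule:
        t = 0.0
        for _, run_time, deadline in machine_jobs:
            t += run_time
            if t > deadline:
                late += 1
    return late


def initial_schedule(jobs: List[Job], n_machines: int) -> Schedule:
    """Dispatching rule: jobs in deadline order, each placed on the
    machine that becomes free earliest."""
    schedule: Schedule = [[] for _ in range(n_machines)]
    finish = [0.0] * n_machines
    for job in sorted(jobs, key=lambda j: j[2]):
        m = min(range(n_machines), key=lambda i: finish[i])
        schedule[m].append(job)
        finish[m] += job[1]
    return schedule


def tabu_search(schedule: Schedule, iterations: int = 50,
                tenure: int = 5) -> Tuple[Schedule, int]:
    """Move one job per iteration to another machine. Jobs moved in the
    last `tenure` iterations are tabu, which prevents cycling."""
    current = [list(m) for m in schedule]
    best, best_cost = [list(m) for m in current], late_jobs(current)
    tabu = deque(maxlen=tenure)  # names of recently moved jobs
    for _ in range(iterations):
        candidates = []
        for src, src_jobs in enumerate(current):
            for job in src_jobs:
                if job[0] in tabu:
                    continue
                for dst in range(len(current)):
                    if dst == src:
                        continue
                    neighbour = [list(m) for m in current]
                    neighbour[src].remove(job)
                    neighbour[dst].append(job)
                    neighbour[dst].sort(key=lambda j: j[2])
                    candidates.append((late_jobs(neighbour), job[0], neighbour))
        if not candidates:
            break
        # Take the best non-tabu move even if it worsens the schedule.
        cost, moved, current = min(candidates, key=lambda c: c[0])
        tabu.append(moved)
        if cost < best_cost:
            best, best_cost = [list(m) for m in current], cost
    return best, best_cost
```

Accepting the best available move even when it is worse than the current schedule is what lets the search leave local optima; the tabu list then keeps it from immediately undoing that move.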



Simulation Toolkit

• GridSim-based simulator
    - Implemented at MU
    - http://www.fi.muni.cz/~xklusac/gridsim
• Centralized scheduler
• Our extensions
    - Simulated scenarios
        - Static vs. dynamic problems
        - Sequential/parallel jobs, chains of jobs
        - Total tardiness, makespan minimization
    - Different scheduling techniques
        - ERD, EDD, MTERD
        - Tabu search, Hill climbing, Simulated Annealing
• Easy integration
    - New scenarios, new scheduling techniques



Current Data Sets and Algorithms

• 1500 jobs
    - 1-16 CPUs per job
    - 30% no-deadline jobs
• 150 machines (approx. 1500 CPUs)
    - 2-16 CPUs per machine, different CPU rates
• 7 different types of problem with 20 data sets
    - Arrival time distribution (high/low frequency)
• FCFS, EDF
• Easy Backfilling
• MTEDF
• Tabu search



Preliminary Results

[Figure: jobs not meeting the deadline (%), scale 0-100, by arrival time distribution (Ta-04, Ta-06, Ta-08) for FCFS, Easy-Backfilling, EDF, MTEDF and Tabu Search]

[Figure: total scheduling time (ms), scale 0-50000, by arrival time distribution (Ta-04, Ta-06, Ta-08) for FCFS, Easy-Backfilling, EDF, MTEDF and Tabu Search]


Future Work

• Problem extensions
    - Estimated running time, SW licenses, RAM size
• New algorithms
    - Flexible backfilling
• Extension of current algorithms
    - Improvements in the data representation for the global schedule
    - More complex reasoning for parallel jobs
    - Study of local search algorithms for dynamic problems



Thank you!
