
ILP-Based Scheduling with Time and Resource Constraints in High ...



Step 1 (initialization):
    $p^k_{s,t}$ = number of operators of type $k$ with [ASAP, ALAP] = $[s, t]$
    $P^k_{s,t} = 0$
Step 2:
    for $s = |S|, \ldots, 1$ do
        $tmp_t = 0$
        for $t = s, \ldots, |S|$ do
            $tmp_t = p_{s,t} + tmp_t$
            $P^k_{s,t} = P^k_{s,t-1} + tmp_t$
            $n^\star_k = \max\{ n^\star_k, \lceil P^k_{s,t} / (t - s + 1) \rceil \}$
        end
    end

Figure 2: Algorithm LBND1

operator to exactly one control step within its schedule interval. Our problem is to find the FA that requires the minimum FU area, $\sum_{k \in K} a_k n_k$. It can be easily seen that each $n_k$ can be minimized independently, because the operators can be executed on only one type of FU and the precedence relations between operators have been relaxed.

The algorithm LBND1, presented above, can be used to compute the minimum values of $n_k$, $k \in K$. Let $p^k_{s,t}$ be the number of operators with ASAP time $s$ and ALAP time $t$. We define another quantity $P^k_{s,t}$ to denote the number of operators whose ASAP and ALAP times are within the closed interval $[s, t]$. The values of $p^k_{s,t}$ can be computed while finding the ASAP and ALAP schedules. The concentration of operators in interval $[s, t]$ is indicated by $P^k_{s,t} / (t - s + 1)$. It will be shown that the minimum number of FUs, $n^\star_k$, is given by the maximum operator concentration of all the intervals. The algorithm to compute the values of $P^k_{s,t}$ and $n^\star_k$ is presented in Figure 2.

To see how the algorithm works, consider the data-flow graph in Figure 3(a). The schedule interval of each operator for a total schedule length of 4 control steps is shown in Figure 3(b).
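The recurrence of Figure 2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it reads `tmp` as an array indexed by $t$ that keeps accumulating while $s$ decreases (so `tmp[t]` counts operators with ALAP $= t$ and ASAP $\ge s$), and it treats the empty interval $P_{s,s-1}$ as zero. The function name `lbnd1` and the dictionary layout are hypothetical. The input counts are the $p_{s,t}$ values as they can be read from the extracted Figure 3(c), so they should be treated as an assumption as well.

```python
def lbnd1(p, num_steps):
    """Lower-bound the number of FUs of one type from ASAP/ALAP counts.

    p maps (s, t) to the number of operators whose [ASAP, ALAP] interval
    is exactly [s, t]; control steps are numbered 1..num_steps.
    Returns (n_star, P), where P[(s, t)] counts the operators whose whole
    interval lies inside [s, t].
    """
    # Assumption: tmp is indexed by t and persists across the outer loop,
    # so tmp[t] counts operators with ALAP == t and ASAP >= s.
    tmp = [0] * (num_steps + 1)
    P = {}
    n_star = 0
    for s in range(num_steps, 0, -1):
        for t in range(s, num_steps + 1):
            tmp[t] += p.get((s, t), 0)
            # P[(s, s - 1)] is an empty interval and counts as zero.
            P[(s, t)] = P.get((s, t - 1), 0) + tmp[t]
            length = t - s + 1
            # Integer ceiling of P[(s, t)] / length.
            concentration = (P[(s, t)] + length - 1) // length
            n_star = max(n_star, concentration)
    return n_star, P


# Counts as read from the extracted Figure 3(c); an assumption if misread.
p_fig3 = {(1, 1): 1, (1, 2): 1, (2, 2): 1, (2, 3): 4,
          (2, 4): 2, (3, 4): 2, (4, 4): 1}
n_star, P = lbnd1(p_fig3, 4)
print(n_star, P[(2, 4)])  # prints: 4 10
```

With these counts the densest interval is [2,4], which holds 10 operators over 3 control steps, so at least 4 FUs are needed. The two nested loops over control steps make the $O(|S|^2)$ cost visible: the running time does not depend on the number of operators once the counts are available.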
The values of $p^k_{s,t}$ for the data-flow graph are given in Figure 3(c), and the corresponding values of $P^k_{s,t}$ are given in Figure 3(d). Maximum operator concentration occurs in the shaded box, and the corresponding value of $n^\star_k$ is found as $\lceil 10/3 \rceil = 4$.

[Figure 3: Execution of algorithm LBND1. (a) Data-flow graph. (b) Schedule intervals of operators. (c) Values of $p_{s,t}$. (d) Values of $P_{s,t}$. Maximum operator concentration occurs in the interval [2,4], as indicated by the shaded box.]

Although the algorithm is intuitively plausible, the correctness proof is somewhat long, and will be omitted in the interest of space. From Figure 2 it can be easily seen that the complexity of LBND1 is independent of the number of operators, and is given by $O(|S|^2)$, where $|S|$ is the number of control steps.

4 Analysis of the Structure of the TRCS Problem

The previous section has shown how we convert an instance of the TCS problem to a TRCS (time- and resource-constrained scheduling) problem by using the algorithm LBND1. In the rest of the paper we will consider the ILP formulation of TRCS, for which both $n_k$ and $S$ have been specified. The efficiency of an ILP algorithm depends on how tightly we can define $P_I(Q)$ without using the integrality constraints.
In Section 2, we first defined $P_F(Q)$ in terms of the assignment, precedence, and resource constraints, and then obtained $P_I(Q)$ by adding the integrality constraints. The purpose of this section is to examine how close $P_F(Q)$ is to $P_I(Q)$. Although a thorough examination is as hard as solving the scheduling problem itself, we can get some useful information by selectively dropping some of the constraints.

First we drop the precedence constraints, and consider the polytope defined by the remaining subset of the constraints of $P_F(Q)$. This resource-assignment polytope $P_F(R)$ satisfies the resource and the assignment constraints, and is described as:

$$P_F(R) = \{\, x \in \mathbb{R}^{|V|}_{+} \mid M_a x = 1,\; M_r x \le n \,\}$$

Next we drop the resource constraints and consider the precedence-assignment polytope $P_F(N)$, which satisfies the assignment and precedence constraints, and is described as:

$$P_F(N) = \{\, x \in \mathbb{R}^{|V|}_{+} \mid M_a x = 1,\; M_p x \le 1 \,\}$$

We can show that the polytopes $P_F(R)$ and $P_F(N)$ are integral polytopes. The proofs of these properties involve extensive use of polyhedral theory and graph theory, and are given in [1].
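To make the three constraint families concrete, the sketch below tests a point $x$ against each of them separately, so one can see whether it lies in $P_F(R)$ (assignment and resource), in $P_F(N)$ (assignment and precedence), or in both. The formulation of Section 2 is not reproduced here, so the constraint forms are assumptions: $x$ is indexed by (operator, control step), every operator takes one control step, and each precedence row has the form $\sum_{s' \ge s} x_{u,s'} + \sum_{s' \le s} x_{w,s'} \le 1$ for an edge $u \to w$. The function name `check_schedule` and its arguments are hypothetical.

```python
def check_schedule(x, ops, steps, edges, fu_type, n):
    """Report which constraint families the point x satisfies.

    x       -- dict mapping (operator, step) to a value in [0, 1]
    ops     -- list of operator names
    steps   -- list of control steps in increasing order
    edges   -- list of (u, w) pairs: u must be scheduled before w
    fu_type -- dict mapping each operator to its FU type k
    n       -- dict mapping each FU type k to the number of units n_k
    """
    eps = 1e-9

    def val(op, step):
        return x.get((op, step), 0)

    # Assignment (M_a x = 1): each operator is placed exactly once.
    assignment = all(
        abs(sum(val(o, s) for s in steps) - 1) <= eps for o in ops
    )
    # Resource (M_r x <= n): at most n_k operators of type k per step.
    resource = all(
        sum(val(o, s) for o in ops if fu_type[o] == k) <= n[k] + eps
        for k in n for s in steps
    )
    # Precedence (assumed form of M_p x <= 1): u at step s or later
    # excludes w at step s or earlier.
    precedence = all(
        sum(val(u, s2) for s2 in steps if s2 >= s)
        + sum(val(w, s2) for s2 in steps if s2 <= s) <= 1 + eps
        for (u, w) in edges for s in steps
    )
    return {"assignment": assignment, "resource": resource,
            "precedence": precedence}


ops, steps = ["a", "b", "c"], [1, 2, 3]
edges = [("a", "c")]
fu_type = {"a": "alu", "b": "alu", "c": "alu"}
n = {"alu": 1}

feasible = {("a", 1): 1, ("b", 2): 1, ("c", 3): 1}
only_in_N = {("a", 1): 1, ("b", 1): 1, ("c", 2): 1}  # two ALU ops in step 1
only_in_R = {("a", 3): 1, ("b", 2): 1, ("c", 1): 1}  # c runs before a

for point in (feasible, only_in_N, only_in_R):
    print(check_schedule(point, ops, steps, edges, fu_type, n))
```

The second point satisfies the assignment and precedence rows but breaks a resource row, and the third does the opposite, which is the sense in which the two relaxed polytopes each ignore one family of constraints.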
The significance of these results is that, as long as the resource constraints and the precedence constraints are considered independent of each other, the constraints presented in our formulation are the tightest constraints possible.

The original scheduling polytope $P_F(Q)$ is the intersection of the two integral polytopes $P_F(R)$ and $P_F(N)$. However, this does not necessarily imply that $P_F(Q)$ is integral. It can be easily demonstrated with a counterexample [1] that $P_F(Q)$ can have fractional extreme points (i.e., $P_I(Q) \subset P_F(Q)$), so an LP-relaxation of the problem could lead to fractional solutions, and we will have to use branch-and-bound to find the integral optimal solution. In order for the branch-and-bound approach to be successful, it is important to find a sharp bound on the objective function, so that branches can be pruned efficiently.

The structure of $P_F(Q)$ presented above can be interpreted using duality theory [6] to prove that the bounds produced by the LP-relaxation are as good as the bounds from the Lagrangian relaxation. Lagrangian bounds are tight and have led to success on other combinatorial optimization problems. Such tight bounds increase the likelihood that the optimum solution can be found in a small number of branches, as will be illustrated through experimental results.

In order to further improve the formulation we have to tighten the description of $P_F(Q)$ so that it approximates $P_I(Q)$ more closely. This can be done by introducing new valid inequalities which take into account the effect of the precedence and resource constraints upon one another. We present one class of valid inequalities in the following:

Valid Inequality. Let $|x_{V_{k,s}}| \le n_k$ be a resource constraint of $Q$. Consider a minimal clique cover $V_{k,s} = \bigcup_{l=1}^{p} V_l$, where each $V_l$ represents a clique made by precedence edges. If, for each $v \in V_{k,s}$, $p_v$ gives the number of cliques that contain $v$, then the following expression is a valid inequality of $Q$:

$$\sum_{v \in V_{k,s}} c_v x_v \le n_k \qquad (3)$$

where $c_v = \max\{1,\; n_k + p_v - p\}$.

5 Results

The analysis of the ILP formulation presented in the previous section provides us with a theoretical ground to expect optimal solutions in relatively few branches.
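As a small illustration of the coefficients in valid inequality (3) of the previous section, the sketch below computes $c_v = \max\{1, n_k + p_v - p\}$ from a given clique cover. It is a sketch under assumptions: the function name `valid_inequality_coefficients` is hypothetical, the cover is supplied by the caller, and the code does not check that the cover is minimal or that each set really is a clique of precedence edges.

```python
def valid_inequality_coefficients(cliques, n_k):
    """Return c_v = max(1, n_k + p_v - p) for every v in the clique cover.

    cliques -- list of sets; each set is one clique V_l of the cover
    n_k     -- right-hand side of the resource constraint
    """
    p = len(cliques)  # number of cliques in the cover
    p_v = {}          # number of cliques that contain each v
    for clique in cliques:
        for v in clique:
            p_v[v] = p_v.get(v, 0) + 1
    return {v: max(1, n_k + count - p) for v, count in p_v.items()}


# Hypothetical cover: b shares a precedence clique with a and with c.
cover = [{"a", "b"}, {"b", "c"}]
coeffs = valid_inequality_coefficients(cover, n_k=2)
print(sorted(coeffs.items()))  # prints: [('a', 1), ('b', 2), ('c', 1)]
```

For this cover the resource constraint $x_a + x_b + x_c \le 2$ becomes $x_a + 2x_b + x_c \le 2$. Setting $x_b = 1$ then forces $x_a = x_c = 0$, which is what the two precedence cliques require.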
In this section we will demonstrate the validity of this prediction using two benchmark examples: the 34-operator elliptic wave filter (EWF), and the 48-operator discrete cosine transform (DCT).

It should be noted here that any ILP approach produces optimal results, so we cannot expect our schedules to be better than other ILP solutions. Instead, our objective was to offer a theoretical foundation for evaluating the ILP formulation. Thus for our purposes, we will use the number of branches taken by the ILP as the indicator of performance. We will demonstrate that the number of branches is small, as we predicted in the previous section.

The scheduling results are shown in Tables 1 and 2; we used an objective function that tries to minimize the number of registers. First we solved LBND1 to find a lower bound on resources and then solved the ILP to construct the schedule. In a couple of cases, the bounds given by LBND1 were too tight for a feasible schedule; in those cases we specified a larger number of FUs until a feasible schedule could be found. The "LV" column indicates the maximum number of live variables that cross a control step boundary.

We also solved the TCS problems for the above benchmarks to observe their performance. These formulations are less structured, and are expected to require greater computation time. For EWF, the TCS problems could be solved to optimality; however, they took a larger number of branches. For DCT, the ILP solver failed to produce the optimal results in some cases even after hundreds of branches. This indicates that although any ILP formulation theoretically leads to optimal results, a careful choice should be made in order to solve it efficiently.

    No. of csteps  |  Non-Pipelined Mult      |  Pipelined Mult
    Total   Loop   |  ALU   Mul   LV   Branch |  ALU   Mul   LV   Branch
    17      17     |  3     3     10   0      |  3     2     10   0
    18      18     |  2     2     9    0      |  3     1     10   0
                   |                          |  2     2     9    0
    18      16     |  3     2     10   0      |  3     1     10   0
    19      19     |  2     2     9    0      |  2     1     9    0
    19      17     |  2     2     9    0      |  2     1     9    2
    21      21     |  2     1     9    0      |  2     1     9    1
    21      19     |  2     1     9    0      |  2     1     9    0

Table 1: Scheduling Results for the Elliptic Wave Filter

    No. of  |  Non-Pipelined Mult      |  Pipelined Mult
    csteps  |  ALU   Mul   LV   Branch |  ALU   Mul   LV   Branch
    7       |  6     5     12   1      |  6     8     11   1
    8       |  5     4     12   1      |  5     6     13   4
    9       |  4     3     13   2      |  4     6     13   1
    9       |  4     4     13   1      |  5     6     13   0
    9       |  5     4     12   1      |  5 7 0 (one entry missing in the source)

Table 2: Scheduling Results for the Discrete Cosine Transform Example

6 Conclusion

In this paper, we have presented an ILP formulation of the scheduling problem, and have formally evaluated the structure of the formulation in the presence of time and resource constraints. Formal analysis has been performed to indicate that the efficiency of the ILP formulation on the benchmark examples is not an arbitrary event: we have given a theoretical basis for expecting efficient solutions from our ILP-based scheduling algorithm. To further increase the efficiency of solving a TCS problem, a methodology has been presented to add resource constraints by optimally solving a relaxation of TCS.

References

[1] Samit Chaudhuri, Robert A. Walker, and John Mitchell. The Structure of Assignment, Precedence and Resource Constraints in the ILP Approach to the Scheduling Problem. To appear in Proc. of ICCD, 1993.
[2] M. R. Garey and D. S. Johnson, editors. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979.
[3] C. H. Gebotys and M. I. Elmasry. Simultaneous Scheduling and Allocation for Cost Constrained Optimal Architectural Synthesis. In Proc. of 28th DAC, pages 2-7, 1991.
[4] Cheng-Tsung Hwang, Jiahn-Hurng Lee, and Yu-Chin Hsu. A Formal Approach to the Scheduling Problem in High Level Synthesis. IEEE Trans. on CAD, 10(4):464-475, 1991.
[5] G. L. Nemhauser and L. A. Wolsey. Optimization, volume 1 of Handbooks in Operations Research and Management Science, chapter 6. Elsevier Science Publishers B. V., 1989.
[6] G. L. Nemhauser and L. A. Wolsey. Integer and Combinatorial Optimization. John Wiley & Sons, 1988.
[7] H. Shin and N. S. Woo. A Cost Function Based Optimization Technique for Scheduling in Data Path Synthesis. In Proc. of ICCD, pages 424-427, 1989.
[8] J. D. Ullman. NP-Complete Scheduling Problems. J. Comput. System Sci., 10(10):384-393, 1975.
[9] T. C. Wilson, N. Mukherjee, M. K. Garg, and D. K. Banerjee. An Integrated and Accelerated ILP Solution for Scheduling, Module Allocation, and Binding in Datapath Synthesis. In 6th International Conference on VLSI Design, pages 192-197, Bombay, India, Jan 1993.
