Register Allocation for SSA-Form Programs


Sebastian Hack Daniel Grund 

(hack|daniel)@ipd.info.uni-karlsruhe.de 

Institut für Programmstrukturen und Datenorganisation 

Universität Karlsruhe 

31.01.2006 


Overview 

1 Preliminaries 

Register Allocation 




Register allocation is the task of mapping the program’s variables to processor registers

Issues to be covered:

Spilling: Put variables into memory if there are not enough registers

Coalescing: Eliminate unnecessary copies in the program

Often reduced to graph coloring



Interference Graphs 

Two variables are live at the same label ⇒ they interfere 

Each variable has a node in the interference graph (IG) 

Whenever two variables interfere, there is an edge between the corresponding nodes

[Figure: example program ((a, b) = start; if b < a; branches c = a − b and c = 0; return c) and its interference graph over the nodes a, b, c]

Coloring gives register allocation 
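The construction of the interference graph can be sketched in Python for the example program. The encoding of the program as labels with def/use sets and the names `LABELS`, `live_in` and `interference_graph` are assumptions made for this illustration. Following the definition on the slide, two variables interfere when they are live at the same label (here: live on entry to it).

```python
from itertools import combinations

# Hypothetical encoding of the example: label -> (defs, uses, successors).
LABELS = {
    "start":  ({"a", "b"}, set(),      ["branch"]),
    "branch": (set(),      {"a", "b"}, ["then", "else"]),
    "then":   ({"c"},      {"a", "b"}, ["ret"]),
    "else":   ({"c"},      set(),      ["ret"]),
    "ret":    (set(),      {"c"},      []),
}

def live_in(labels):
    """Iterate the backward data-flow equations until a fixed point is reached."""
    live = {l: set() for l in labels}
    changed = True
    while changed:
        changed = False
        for l, (defs, uses, succs) in labels.items():
            out = set().union(*(live[s] for s in succs))
            new = uses | (out - defs)
            if new != live[l]:
                live[l], changed = new, True
    return live

def interference_graph(labels):
    """Edge {v, w} whenever v and w are live at the same label."""
    edges = set()
    for variables in live_in(labels).values():
        edges |= {frozenset(p) for p in combinations(sorted(variables), 2)}
    return edges
```

In this sketch only a and b interfere, so c may share a register with either of them.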



Chaitin/Briggs Register Allocator 

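[Figure: phases Build → Coalesce → Color, with an edge labeled “not k-colorable” leading to Spill and back to the start]

Every undirected graph can occur as an interference graph

Determining the chromatic number is NP-hard (deciding k-colorability is NP-complete)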

Color using heuristic ⇒ Iteration necessary 

Spilling is focused on the graph 



Coloring 

Successively remove the nodes from the graph

[Figure: example graph over the nodes a, b, c, d, e]

Elimination order:

d, e, c, a, b 

Re-insert the nodes in reverse order 

Assign each node the next possible color 

Theorem 

For each graph there is an elimination order leading to an optimal coloring 
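The two steps (eliminate, then re-insert in reverse order with the next possible color) can be sketched as follows. The graph `GRAPH` is a hypothetical example and not the one drawn on the slides; only the elimination order d, e, c, a, b is taken from them. The function name `color_by_elimination` is likewise an assumption.

```python
def color_by_elimination(graph, order):
    """graph: node -> set of neighbors; order: elimination order (list of nodes)."""
    coloring = {}
    for node in reversed(order):              # re-insert in reverse order
        used = {coloring[n] for n in graph[node] if n in coloring}
        color = 0
        while color in used:                  # next possible color
            color += 1
        coloring[node] = color
    return coloring

# Hypothetical graph: triangle a-b-c with a path c-d-e attached.
GRAPH = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}

coloring = color_by_elimination(GRAPH, ["d", "e", "c", "a", "b"])
```

For this graph the sketch needs three colors, which matches the triangle a, b, c. Whether the result is optimal depends on the chosen order.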



Perfect Elimination Orders 

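An elimination order is called perfect (a PEO) if, whenever a node n is eliminated, all not yet eliminated neighbors of n form a clique

[Figure: example graph over the nodes a, b, c, d, e]

Elimination order: a, c, d, e, b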

Theorem 

A PEO allows for an optimal coloring in polynomial time

The number of colors is bounded by the size of the largest clique
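Whether a given order is a PEO can be checked directly from the clique condition. The following is a minimal sketch; the function names `is_peo` and `has_peo` and both example graphs are assumptions chosen for illustration, and the brute-force search is only meant for very small graphs.

```python
from itertools import combinations, permutations

def is_peo(graph, order):
    """True if, at each elimination step, the remaining neighbors form a clique."""
    position = {n: i for i, n in enumerate(order)}
    for node in order:
        later = [n for n in graph[node] if position[n] > position[node]]
        if any(v not in graph[u] for u, v in combinations(later, 2)):
            return False
    return True

def has_peo(graph):
    """Brute force over all orders; only meant for tiny illustrative graphs."""
    return any(is_peo(graph, list(p)) for p in permutations(graph))

# Hypothetical examples: a 4-cycle, and the same cycle with the chord a-c.
CYCLE = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}
CHORDED = {"a": {"b", "c", "d"}, "b": {"a", "c"},
           "c": {"a", "b", "d"}, "d": {"a", "c"}}
```

In this sketch the plain 4-cycle has no PEO, while adding the chord makes the order b, d, a, c perfect.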




Not every graph has a PEO, e.g. a chordless cycle of four or more nodes

The graphs which have PEOs are called chordal

Main result 

The dominance relation in SSA-form programs induces a PEO in the interference graph of the program

SSA-Form 

Each variable has exactly one definition 

⇒ Identity of variables and dynamic constants (values) 

φ-operations select values dependent on control flow 

non-SSA:

if b < a

c = a − b  |  c = 0  (one assignment per branch)

return c

SSA:

if b < a

c1 = a − b  |  c2 = 0  (one assignment per branch)

c3 = φ(c1, c2)

return c3

Dominance 

Crucial for SSA-form programs is the concept of dominance: 

Definition 

ℓ1 dominates ℓ2 if each path from start to ℓ2 goes through ℓ1 

 


 

[Figure: control flow graph of the SSA example (if b < a; c1 = a − b and c2 = 0; c3 = φ(c1, c2); return c3)]

Each node except start has a unique immediate dominator

Thus, dominance induces a tree on the control flow graph

Thus, dominance is also a partial order
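Dominator sets can be computed with a simple fixed-point iteration. This is a minimal sketch and not necessarily the method used by the authors; the CFG encoding, the label names in `CFG` and the function names are assumptions, and every node is assumed to be reachable from start.

```python
def dominators(succs, start):
    """Iterative data-flow computation of the dominator sets of a CFG."""
    nodes = set(succs)
    preds = {n: {p for p in succs if n in succs[p]} for n in nodes}
    dom = {n: set(nodes) for n in nodes}
    dom[start] = {start}
    changed = True
    while changed:
        changed = False
        for n in nodes - {start}:
            # n is dominated by itself and by everything dominating all preds.
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

def immediate_dominator(dom, n):
    """The strict dominator of n whose own dominators are all strict dominators of n."""
    strict = dom[n] - {n}
    return next(d for d in strict if dom[d] == strict)

# Hypothetical labels for the diamond-shaped example program.
CFG = {"start": ["then", "else"], "then": ["join"], "else": ["join"], "join": []}
dom = dominators(CFG, "start")
```

Here the join label is dominated only by start and itself, because neither branch lies on every path to it.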



Liveness and Dominance 

Lemma (Budimlić, PLDI ’02)

Each label where a value v is live is dominated by Dv, the definition of v

[Figure: control flow graph with start, the definition v = . . . , a label ℓ and a use · · · = v]

Proof by contradiction

Assume ℓ is not dominated by Dv

Then there is a path from start via ℓ to some use of v that does not contain the definition of v

This cannot be, since each value must have been defined before it is used

Intuition 

Consider a set of intervals in N:

[Figure: intervals a, b, c, d, e drawn over the axis 0 to 6, next to the corresponding graph]

Each interval corresponds to the lifetime of a variable

⇒ a node in the interference graph

Iff two intervals overlap, we draw an edge between their nodes

Can we make a cycle by drawing an edge from a to e?

Only by letting a “start again” at 5
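The interval intuition can be tried out in a few lines of Python. The interval bounds in `INTERVALS` are hypothetical (the slide only shows the axis 0 to 6), intervals are treated as half-open, and the function names are assumptions. Coloring the intervals greedily by start point needs exactly as many colors as the maximum number of intervals that overlap at one point.

```python
def color_intervals(intervals):
    """intervals: name -> (start, end), half-open [start, end). Greedy by start point."""
    coloring = {}
    for name, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Colors of already colored intervals that overlap the current one.
        used = {coloring[o] for o, (s, e) in intervals.items()
                if o in coloring and s < end and start < e}
        coloring[name] = min(c for c in range(len(intervals)) if c not in used)
    return coloring

def max_overlap(intervals):
    """Largest number of intervals containing a common point."""
    points = {s for s, _ in intervals.values()}
    return max(sum(s <= p < e for s, e in intervals.values()) for p in points)

# Hypothetical lifetimes forming the chain a-b-c-d-e.
INTERVALS = {"a": (0, 2), "b": (1, 3), "c": (2, 4), "d": (3, 5), "e": (4, 6)}
colors = color_intervals(INTERVALS)
```

In this example a ends before e starts, so there is no edge between them and the graph stays a path.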

Interference and Dominance I 

Assume v, w interfere, i.e. they are live at some label ℓ

Then Dv ⪯ ℓ and Dw ⪯ ℓ (⪯ denotes dominance)

Since dominance is a tree, either Dv ⪯ Dw or Dw ⪯ Dv

Consequence

Each edge {v, w} in the interference graph has a direction according to dominance



Interference and Dominance II 

Assume v and w interfere and Dv dominates Dw (edge v → w)

Then, v is live at Dw

[Figure: control flow graph with start, v = . . . , w = . . . and a use · · · = v, w]




Consider three nodes u, v, w in the IG:

[Figure: u and w each have an edge directed to v; is there an edge between u and w?]

u is live at Dv

w is live at Dv

Thus, they interfere

Conclusion

All values interfering with v whose definitions dominate the one of v are members of the same clique



Dominance and PEOs 

Before a value v can be added to a PEO, all values whose definitions are dominated by Dv must be added

Thus, the post order of a dominance tree walk defines a PEO

IGs of SSA-form programs can be colored in O(|V| · |E|)

Ideally without constructing the graph itself

Spilling 

Theorem 

For each clique in the IG there is a label in the program where all nodes in the clique are live.

[Figure: clique over the values a, b, c, d with edges directed according to dominance]

Dominance induces a chain inside the clique

⇒ There is a “greatest” value d

All others are live at Dd



Spilling 

Consequences 

The chromatic number of the IG equals the size of the largest live set over all labels

Lowering the number of values live at each label to at most k makes the IG k-colorable
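Given the live sets, the register pressure and the labels that need spilling can be read off directly. A minimal sketch; the live sets in `LIVE` and the function names are hypothetical.

```python
def register_pressure(live_sets):
    """live_sets: label -> set of values live there. Returns the maximum pressure."""
    return max(len(values) for values in live_sets.values())

def labels_to_spill(live_sets, k):
    """Labels whose pressure exceeds the number k of available registers."""
    return sorted(l for l, values in live_sets.items() if len(values) > k)

# Hypothetical live sets for three labels.
LIVE = {"l1": {"a", "b"}, "l2": {"a", "b", "c"}, "l3": {"c"}}
```

With k = 2 registers, only label l2 requires spilling in this example; with k = 3 no label does.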



Spilling 

Chordal graphs are perfect 

Thus, ω(H) = χ(H) holds for each induced subgraph H

Register pressure is a precise measure for the number of registers needed

We know in advance where values must be spilled

⇒ At all labels where the pressure is larger than k

Spilling can be done before coloring, and coloring will always succeed afterwards

Conclusion 

No iteration as in Chaitin/Briggs-allocators 

Interference graph has to be built only once (if at all) 

Getting out of SSA 

We now have a k-coloring of the SSA interference graph 

Can we turn it into a valid register allocation using k registers for the corresponding non-SSA program?

Central question 

How to handle φ-functions? 



φ-Functions 

All φ-functions in a basic block

y_1 ← φ(x_{1,1}, . . . , x_{n,1})

⋮

y_m ← φ(x_{1,m}, . . . , x_{n,m})

execute simultaneously before all other instructions in that block

Arriving from the i-th edge, the φ-functions work as a bulk copy

(y_1, . . . , y_m) ← (x_{i,1}, . . . , x_{i,m})
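The simultaneous semantics matters as soon as a φ-result is also a φ-operand. The following sketch models the bulk copy in Python; the function `apply_phis`, the representation of φ-functions as (result, operands) pairs and the example values are assumptions for illustration.

```python
def apply_phis(phis, edge, env):
    """phis: list of (result, operands); edge: index i of the incoming edge.

    All right-hand sides are read before any result is written, which models
    the simultaneous execution of the phi-functions of one block.
    """
    values = [env[operands[edge]] for _, operands in phis]
    new_env = dict(env)
    for (result, _), value in zip(phis, values):
        new_env[result] = value
    return new_env

# Hypothetical loop header: on edge 1 the values of x and y are exchanged.
PHIS = [("x", ("a", "y")), ("y", ("b", "x"))]
ENV = {"a": 1, "b": 2, "x": 10, "y": 20}
after = apply_phis(PHIS, 1, ENV)
```

Executing the two copies one after the other would instead overwrite x before it is read, leaving both variables with the value 20.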



φ-Functions 

Consider following example 

a3 ← φ(a1, a2) 

b3 ← φ(b1, b2) 

c3 ← φ(c1, c2) 

The φs represent register permutations on the control flow edges 

[Figure: the register permutations on edge 1 and on edge 2]



Permutations 

A permutation can be implemented by copies with one auxiliary register

[Figure: a register permutation carried out as a sequence of four copies]

Permutations can be implemented by a series of transpositions (i.e. swaps)

[Figure: a permutation written as the composition of two permutations]

A transposition can be implemented by three xors without a third register
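Both ideas can be combined into a short sketch: a permutation is decomposed into swaps, and each swap is carried out with three xors. The register file is modeled as a Python list of integers; the names `xor_swap` and `permute` and the `dest` encoding (the value in register i moves to register dest[i]) are assumptions made for this illustration.

```python
def xor_swap(regs, i, j):
    """Swap two distinct integer registers with three xors, no temporary."""
    regs[i] ^= regs[j]
    regs[j] ^= regs[i]
    regs[i] ^= regs[j]

def permute(regs, dest):
    """Move the value in regs[i] to regs[dest[i]] using swaps only.

    Returns the number of swaps that were needed.
    """
    dest = list(dest)
    swaps = 0
    for i in range(len(regs)):
        while dest[i] != i:
            j = dest[i]                 # j != i, so the xor swap is safe
            xor_swap(regs, i, j)        # the value for register j is now in place
            dest[i], dest[j] = dest[j], dest[i]
            swaps += 1
    return swaps

regs = [10, 20, 30]
count = permute(regs, [1, 2, 0])        # cyclic shift: r0 -> r1 -> r2 -> r0
```

A cycle of three registers is resolved with two swaps here; registers that already hold the right value are never touched.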



Permutations II 

We can replace φ-operations with code requiring no additional register

An SSA register allocation can be turned into a non-SSA one without needing additional registers

Coalescing 

Minimize the number of instructions to be inserted for all φs

Current practice: Merging nodes in the IG to avoid copies

Renders the graph non-chordal

Loses information on the chromatic number

When done aggressively, introduces spills in favor of (eliminated) copies

Coalescing 

Modeling the Problem 

Given: A minimal coloring of the IG

Find a feasible coloring with minimal costs

Costs are a weighted sum over all equal-color edges whose end nodes have different colors

Unused colors may be used

Structure of graph and program must not be changed

[Figure: example program with the assignments a ← 1, b ← 2, c ← a + 1, d ← b + c, e ← 5, f ← 2 ∗ b, g ← a + e and the φ-operations x ← φ(c, g), y ← φ(b, e), next to its IG over the nodes a, b, c, d, e, f, g, x, y]



Coalescing 

Formal Statement 

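Find a k-coloring C of the IG which assigns the same color to as many φ-operands and results as possible:

\[ \min_{C} \sum_{\varphi} \mathrm{costs}(C, \varphi) \]

where

\[ \mathrm{costs}(C,\; y \leftarrow \varphi(x_1, \dots, x_n)) = \sum_{i=1}^{n} \begin{cases} 0 & \text{if } C(y) = C(x_i) \\ w_{y x_i} & \text{else} \end{cases} \]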
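The cost function translates almost literally into code. In this sketch a coloring is a dictionary from values to colors and the weights are stored per (result, operand) pair; the function names, the φ-function and the weights are hypothetical.

```python
def phi_cost(coloring, result, operands, weight):
    """Sum of the weights w_{y,x_i} of operands colored differently from the result."""
    return sum(weight[(result, x)] for x in operands
               if coloring[x] != coloring[result])

def total_cost(coloring, phis, weight):
    """Objective value: sum of the costs of all phi-functions."""
    return sum(phi_cost(coloring, y, xs, weight) for y, xs in phis)

# Hypothetical instance: y <- phi(x1, x2) with weights 3 and 5.
PHIS = [("y", ["x1", "x2"])]
WEIGHT = {("y", "x1"): 3, ("y", "x2"): 5}
```

A coloring that gives y and x2 the same register but x1 another one then costs 3, the weight of the one copy that remains.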



Coalescing 

Solution Strategies 

Complexity 

Problem is NP-complete in number of φs (TODO Rastello) 

Algorithms 

A greedy heuristic 

An optimal method using ILP (integer linear programming) 

Compare solution quality of heuristic vs. ILP 



Coalescing 

Heuristic 

Idea: Swap colors to achieve more equal-colored pairs 

Problem: Not decidable locally if swap is possible 

Therefore 

Consider each φ separately

Try to give φ-operands and result the same color 

Resolve color clashes recursively through the graph 

On failure mark the conflict locally and repeat 



Coalescing 

Heuristic 

Example r = φ(a, b, c, d, e) with cutout of the IG

[Figure: cutout of the IG containing r and the operands a, b, c, d, e]



Coalescing 

Heuristic 

Example r = φ(a, b, c, d, e) with conflict graph

[Figure: conflict graph over r and the operands]

Conflict graph represents (in-)compatibilities

Initially it is a part of the IG

A maximum stable set is a largest compatible subset of nodes

Additional edges for global conflicts that cannot be resolved:

Register constraints

Conflict with prior optimizations of other φ-functions

Interaction with another node in the same set



Coalescing 

Optimal solution 

How good is the heuristic? 

Comparison with non-SSA allocators is complicated 

Circumstances are too different 

Compare to optimal solution 

How to obtain optimal solutions? 

Backtracking? complex, laborious 

Reduction to ILP! 



Coalescing 

Formalization as ILP 

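Binary variables represent states/decisions

Coloring: x_{ic} = 1 ⇔ node i has color c

Optimality: y_{ij} = 1 ⇔ nodes i and j have different colors

\[ \min f = \sum_{e \in Q} w_e \cdot y_{ij} \]

where

\[ \sum_{c} x_{ic} = 1 \qquad v_i \in V \]

\[ x_{ic} + x_{jc} \le 1 \qquad [v_i, v_j] \in E \]

\[ y_{ij} \ge x_{ic} - x_{jc} \qquad [v_i, v_j] \in Q \]

\[ y_{ij},\, x_{ic} \in \{0, 1\} \]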



Coalescing 

ILP Example with 3 nodes and 3 colors

[Figure: nodes 1, 2, 3; an interference edge between 1 and 2; an equal-color edge of weight w between 2 and 3]

min w · y23

where

x11 + x12 + x13 = 1

x21 + x22 + x23 = 1    (coloring)

x31 + x32 + x33 = 1

x11 + x21 ≤ 1

x12 + x22 ≤ 1    (interference)

x13 + x23 ≤ 1

y23 ≥ x21 − x31

y23 ≥ x22 − x32    (equal-coloring)

y23 ≥ x23 − x33
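Because the example has only nine x variables, the program can be checked by plain enumeration instead of an ILP solver. The following sketch does exactly that; it is an illustration under the assumption that node 1 interferes with node 2 and that the equal-color edge joins nodes 2 and 3, as the constraints of the example suggest. The function name `solve_example` is hypothetical.

```python
from itertools import product

def solve_example(w=1, nodes=(1, 2, 3), colors=(1, 2, 3)):
    """Enumerate all 0/1 assignments of the three-node example and keep the best."""
    best = None
    keys = [(i, c) for i in nodes for c in colors]
    for bits in product((0, 1), repeat=len(keys)):
        x = dict(zip(keys, bits))
        if any(sum(x[i, c] for c in colors) != 1 for i in nodes):
            continue                                   # coloring constraints
        if any(x[1, c] + x[2, c] > 1 for c in colors):
            continue                                   # interference constraints
        # Smallest y23 in {0, 1} satisfying the equal-coloring constraints.
        y23 = max(0, *(x[2, c] - x[3, c] for c in colors))
        if best is None or w * y23 < best[0]:
            best = (w * y23, x)
    return best

cost, x = solve_example()
```

The optimum has cost 0: nodes 2 and 3 share a color, while node 1 receives a different one.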



Coalescing 

Improving the ILP Runtime 

Clique inequalities 

Replace O(n²) inequalities x_{ic} + x_{jc} ≤ 1 with one:

\[ \sum_{i=1}^{n} x_{ic} \le 1 \]



Coalescing 



Path inequalities 

[Figure: nodes a, b, c, d, e with interference edges and equal-color edges]

Use incompatibility of interference- and equal-color-edges

yad + ycd ≥ 1

yad + yde + yec ≥ 1



Coalescing 



Path inequalities 

Clique-Path inequalities 

[Figure: nodes a, b, c and d]

Multiple use of same argument, here

\[ \begin{pmatrix} a \\ b \\ c \end{pmatrix} = \Phi \begin{pmatrix} d & e \\ d & f \\ d & g \end{pmatrix} \]

results in yad + ybd + ycd ≥ 2



Quality of the Coalescing Heuristic 

Applied to SPEC 2000 

[Figure: costs of Initial, Heuristic and ILP for 8, 16 and 32 registers]

Registers    8        16       32

Non-Opt      6.7%     3.7%     1.3%

Initial      394592   342842   213544

Heuristic    60114    63506    46060

ILP          42738    57010    43479

Elim         95.0%    97.7%    98.4%



Runtime of the Heuristic with 8 (16 / 32) Registers 

[Figure: runtime in ms (axis ticks 1 to 1000) plotted over |Q| (0 to 2500)]

99% of all problems are solved in less than 68 (196 / 170) ms

On average, 4 (14 / 21) ms



Coalescing - Influence of Registers 

[Figure: costs of ILP, Heur and Initial plotted over the number of registers (4 to 256)]

Conclusions 

SSA vs. non-SSA 

SSA-Construction introduces copies (φ-operations are copies along edges)

These copies “blow up” the interference graph

Nodes are replaced by stable sets

Breaks cycles in the interference graph

The interference graph becomes chordal

[Figure: IG over a, b, c, d, e ⇒ IG over a, b, c, d, e1, e2, e3]



Conclusions 

Non-SSA vs. SSA 

SSA-Destruction coalesces copies aggressively without considering the number of available registers

Stable sets merged into nodes

Possibly creating cycles

Possibly increasing the chromatic number of the graph

[Figure: IG over a, b, c, d, e1, e2, e3 ⇒ IG over a, b, c, d, e]

Conclusions 

Chordality of SSA IGs allows for decoupling spilling and coalescing

Coalescing re-expressed by introducing a cost function on colorings and finding a coloring that is as good as possible

Architecture without iteration

Spill → Color → Coalesce → SSA-Destruction



Thank you very much! 
