05.10.2013 Views

1 Code Generation Code generator phase ... - VTU e-Learning


<strong>Code</strong> <strong>Generation</strong><br />

The code generator phase takes intermediate code as input and generates the target code. The output of the intermediate code generator may be given directly to the code generator, or it may first pass through code optimization.<br />

Issues in Design of <strong>Code</strong> generation:<br />

The target code depends mainly on the available instruction set and on efficient use of registers. The main issues in the design of a code generator are:<br />

• Intermediate representation: This may be a linear representation such as postfix, three address code or quadruples, or a graphical representation such as a syntax tree or DAG. Assume that type checking is done and the input is free of errors. This chapter deals only with intermediate representation as three address code.<br />

• Target Code: The target code may be absolute code, relocatable machine code or assembly language code. Absolute code can be executed immediately because its addresses are fixed. Relocatable code requires a linker and loader to place the code at an appropriate location and to link the required library functions. If assembly code is generated, an assembler is needed to convert it into machine code before execution. Relocatable code provides a great deal of flexibility, as functions can be compiled separately and linked afterwards.<br />

• Address mapping: Address mapping defines the mapping from names in the intermediate representation to addresses in the target code. These addresses depend on the runtime environment used, such as static, stack or heap allocation. Identifiers are stored in the symbol table, along with their type, when variables or functions are declared. Each identifier can be located through its symbol table entry using its width and offset. The address of a specific instruction (in three address code) can be generated using backpatching.<br />

• Instruction Set: The instruction set should be complete, in the sense that all operations can be implemented. Sometimes a single operation can be implemented by several different instruction sequences. The code generator should choose the most appropriate one, so that execution time or the use of other machine resources is minimized.<br />

Example: Consider the set of statements<br />

a = b * c<br />

d = a * e<br />

Three address code will be as follows<br />
t1 = b * c<br />
a = t1<br />
t2 = a * e<br />
d = t2<br />

Final code generated will be as follows<br />


MOV b, R0 /* load b into register R0 */<br />
MUL c, R0<br />
MOV R0, a /* store R0 into a */<br />
MOV a, R0 /* redundant: a is already in R0, so this load can be eliminated */<br />
MUL e, R0<br />
MOV R0, d<br />

Redundant instructions should be eliminated.<br />

A sequence of n instructions can sometimes be replaced by a single instruction. For x = x + 1, the sequence<br />
MOV x, R0<br />
ADD #1, R0<br />
MOV R0, x<br />
can be replaced by<br />
INC x<br />
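The two rewrites above can be sketched as a small peephole pass. This is only an illustration: the tuple encoding `(op, source, destination)` and the function name `peephole` are assumptions made here, and the INC rule assumes the register is not needed after the sequence.<br />

```python
def peephole(code):
    """Apply two peephole rules to a list of (op, src, dst) tuples."""
    out = []
    for ins in code:
        # Rule 1: "MOV R, a" directly followed by "MOV a, R" (or the reverse)
        # moves a value to where it already is, so the second MOV is dropped.
        if out and ins[0] == "MOV" and out[-1] == ("MOV", ins[2], ins[1]):
            continue
        out.append(ins)
        # Rule 2: MOV x,R / ADD #1,R / MOV R,x  ->  INC x
        # (assumption: R is not used afterwards).
        if len(out) >= 3:
            a, b, c = out[-3:]
            if (a[0] == "MOV" and b == ("ADD", "#1", a[2])
                    and c == ("MOV", a[2], a[1])):
                out[-3:] = [("INC", a[1], None)]
    return out


code = [("MOV", "b", "R0"), ("MUL", "c", "R0"), ("MOV", "R0", "a"),
        ("MOV", "a", "R0"), ("MUL", "e", "R0"), ("MOV", "R0", "d")]
optimized = peephole(code)   # the reload "MOV a, R0" is removed
```

A real peephole optimizer must also make sure that no jump targets the instruction being removed; that check is left out of this sketch.<br />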

Register allocation: Execution is faster when operands are in registers, so the variables whose values are required at a point in the program should be retained in registers.<br />

Familiarity with the target machine and its instruction set is a prerequisite for designing a good code generator.<br />

Target Machine: Consider a hypothetical byte-addressable machine as the target machine. It has n general purpose registers R0, R1, ..., Rn-1. The machine instructions are two-address instructions of the form<br />
op-code source, destination<br />
Example:<br />
MOV R0, R1<br />
ADD R1, R2<br />

The target machine supports the following addressing modes<br />

a. Absolute addressing mode<br />

Example: MOV R0, M where M is the address of memory location of one of the<br />

operands. MOV R0, M moves the contents of register R0 to memory location M.<br />

b. Register addressing mode where both the operands are in register.<br />

Example: ADD R0, R1<br />

c. Immediate addressing mode – The operand value appears in the instruction.<br />

Example: ADD # 1, R0<br />

d. Indexed addressing mode: the operand has the form C(R), and the address of the operand is C + contents(R).<br />
Example: MOV 4(R0), M. The source operand is located at address 4 + contents(R0).<br />

The cost of an instruction is defined as the cost of execution (1) plus the number of memory accesses.<br />
Example:<br />
MOV R0, R1 has cost 1, as there is no memory access,<br />
whereas MOV R0, M has cost 2.<br />
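As a rough sketch, the cost rule can be written as a function. It assumes that every operand other than a plain register (absolute, immediate or indexed) adds 1 to the cost; the function names are chosen here for illustration only.<br />

```python
import re


def operand_cost(operand):
    """Extra cost of one operand: 0 for a plain register, 1 otherwise.

    Assumption: absolute (M), immediate (#1) and indexed (4(R0)) operands
    each add 1 to the cost of the instruction.
    """
    return 0 if re.fullmatch(r"R\d+", operand) else 1


def instruction_cost(instruction):
    """Cost = 1 for the instruction itself plus the cost of its operands."""
    _, operands = instruction.split(None, 1)
    return 1 + sum(operand_cost(o.strip()) for o in operands.split(","))


costs = [instruction_cost(i) for i in ("MOV R0, R1", "MOV R0, M", "SUB y, R0")]
# costs == [1, 2, 2]
```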

Register and address descriptor<br />

The register descriptor records which values are stored in which registers, and which registers are free.<br />

The address descriptor records where the current value of a name can be found: in a register, in a memory location or on the stack, depending on the runtime environment.<br />

<strong>Code</strong> generation algorithm<br />

Consider a simple three address statement for which target code is to be generated.<br />
Example: a = b op c<br />
i. Consult the address descriptor for b to find out whether b is in a register or in memory. If b is in memory, generate<br />
MOV b, Ri<br />
where Ri is one of the free registers according to the register descriptor. Update the address descriptor of b and the register descriptor for Ri.<br />
ii. Generate the instruction OP c, Ri, where c can be in memory or in a register.<br />
iii. Store the result a in location L. L can be a memory location M or a register R, depending on the availability of free registers and the further use of a. Update the register descriptor and the address descriptor for a accordingly.<br />

Example: x = y + z<br />

Check the location of y.<br />
Case 1: y is in register R0, and z may be in a register or in memory. The instructions will be<br />
ADD z, R0<br />
MOV R0, x<br />
In this case the result is stored in memory location x.<br />
Case 2: y is in memory. Fetch y into a register and update the address and register descriptors.<br />
MOV y, R0<br />
ADD z, R0<br />
MOV R0, x<br />
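The three steps of the algorithm can be sketched as follows. This is a simplified illustration, not a complete code generator: the class name `CodeGen` and its methods are invented here, no register spilling is handled, and the value of b is assumed not to be needed in its register after the operation.<br />

```python
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}


class CodeGen:
    def __init__(self, num_regs=2):
        self.reg = {f"R{i}": None for i in range(num_regs)}  # register descriptor
        self.addr = {}   # address descriptor: name -> register holding it
        self.code = []

    def location(self, name):
        """Register holding the value if there is one, else the memory name."""
        return self.addr.get(name, name)

    def gen(self, a, b, op, c):
        """Emit code for a = b op c."""
        rb = self.addr.get(b)
        if rb is None:
            # Step i: b is in memory, load it into a free register.
            rb = next(r for r, v in self.reg.items() if v is None)
            self.code.append(f"MOV {b}, {rb}")
        # Step ii: apply the operator; c may be in a register or in memory.
        self.code.append(f"{OPCODES[op]} {self.location(c)}, {rb}")
        # Step iii: the register now holds a instead of b.
        self.addr.pop(b, None)
        self.reg[rb] = a
        self.addr[a] = rb

    def store(self, name):
        """Copy a value held in a register back to its memory location."""
        self.code.append(f"MOV {self.addr[name]}, {name}")


g = CodeGen()
g.gen("x", "y", "+", "z")   # Case 2: y starts in memory
g.store("x")
print("\n".join(g.code))    # prints the three instructions of Case 2
```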

Example:<br />
P = (x – y) + (x – z) + (x – z)<br />
Three address code<br />
t1 = x – y<br />
t2 = x – z<br />
t3 = t1 + t2<br />
t4 = t3 + t2<br />

3 addr code | M/c code | Cost | Reg desc | Addr desc<br />
t1 = x – y | MOV x, R0 | 2 | R0 has t1 | t1 in R0<br />
 | SUB y, R0 | 2 | | <br />
t2 = x – z | MOV x, R1 | 2 | R0 has t1 | t1 in R0<br />
 | SUB z, R1 | 2 | R1 has t2 | t2 in R1<br />
t3 = t1 + t2 | ADD R1, R0 | 1 | R0 has t3, R1 has t2 | t3 in R0, t2 in R1<br />
t4 = t3 + t2 | ADD R1, R0 | 1 | R0 has t4 | t4 in R0 and memory<br />
 | MOV R0, t4 | 2 | | <br />

Example: Generate code for the instructions x = y[i] and x[i] = y<br />

Stmt | i in register Ri (code, cost) | i in memory M (code, cost) | i on stack (code, cost)<br />
x = y[i] | MOV y(Ri), R; cost 2 | MOV M, R; MOV y(R), R; cost 4 | MOV Si(x), R; MOV y(R), R; cost 4<br />
x[i] = y | MOV y, x(Ri); cost 3 | MOV M, R; MOV y, x(R); cost 5 | MOV Si(x), R; MOV y, x(R); cost 5<br />

<strong>Code</strong> generation for function call<br />

Code generation for function calls is based on the runtime storage organization, which can be static allocation or stack allocation. In static allocation, the position of the activation record in memory is fixed at compile time. Recall that whenever a function is called, an activation record is created; it stores the parameters passed to the function, local data, temporaries, results and some machine status information, along with the return address. In stack allocation, every time a function is called a new activation record is generated and pushed onto the stack; once the function completes, the activation record is popped from the stack. The three address code for function calls consists of the following statements<br />


1. Call.<br />

2. Return<br />

3. end<br />

4. action<br />

The call statement is used for a function call. It has to transfer control to the called function and save the status of the calling function.<br />

The return statement gives control back to the calling function. Action stands for other operations, such as assignment or flow control statements. End indicates the completion of the operations of a function.<br />

Static allocation: This section describes final code generation for function calls when static allocation is used as the runtime environment.<br />

• Call statement: The code generated for a call statement is as follows.<br />
MOV #current + 20, function.static_area<br />
GOTO function.code_area<br />

#current + 20 is the return address, i.e. the address of the instruction of the calling function that has to be executed after the called function completes. 20 is the size of the calling sequence, so the return address is the address of the instruction that follows the GOTO. function.static_area is the address of the activation record of the function. function.code_area is the address of the first instruction of the called function.<br />

• Return statement: The code generated for a return statement is<br />
GOTO *function.static_area<br />
This transfers control back to the calling function, at the return address saved in function.static_area.<br />

Example:<br />

/* code for main */<br />

action 1<br />

call fun<br />

action 2<br />

end<br />

/* code for fun */<br />

action 3<br />

return<br />

The target code generated for the above statements is as follows.<br />
10: action 1<br />
20: MOV #40, 200 /* save return address 40 at location 200 */<br />
30: GOTO 100<br />
40: action 2<br />
50: end<br />
/* code for function */<br />
100: action 3<br />
110: GOTO *200<br />
200: 40 (return address)<br />
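To see how the saved return address drives the control flow, the listing can be run on a tiny interpreter. This is a sketch under stated assumptions: the tuple encoding, the `GOTO*` name for the indirect jump and the fixed step of 10 address units per instruction are all choices made here for illustration.<br />

```python
def run(program, start):
    """Interpret a tiny subset of the target code; return the action trace."""
    memory, trace, pc = {}, [], start
    while True:
        op, *args = program[pc]
        if op == "action":
            trace.append(args[0])
            pc += 10
        elif op == "MOV":            # MOV #value, address
            memory[args[1]] = args[0]
            pc += 10
        elif op == "GOTO":           # direct jump
            pc = args[0]
        elif op == "GOTO*":          # indirect jump through a memory cell
            pc = memory[args[0]]
        elif op == "end":
            return trace


program = {
    10: ("action", 1),
    20: ("MOV", 40, 200),    # save return address 40 in the activation record
    30: ("GOTO", 100),       # jump to the first instruction of the function
    40: ("action", 2),
    50: ("end",),
    100: ("action", 3),
    110: ("GOTO*", 200),     # return: jump to the address stored at 200
}
trace = run(program, 10)     # actions run in the order 1, 3, 2
```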

Stack allocation: Whenever a function is called, the activation record of the called function is stored on the stack; once the function returns, the record is removed from the stack. The final code generated to initialize the stack is<br />
MOV #Stack.begin, SP /* initialize the stack pointer */<br />
SP denotes the stack pointer.<br />

Code for the call statement is as follows<br />
ADD #main.recordsize, SP /* main.recordsize is the record size of the calling function */<br />
MOV #current + 16, *SP /* save return address */<br />
GOTO function.code_area<br />

The return statement has the following target code. The called function executes<br />
GOTO *0(SP)<br />
and the calling function then restores the stack pointer with<br />
SUB #main.recordsize, SP<br />

Example: For the below three address code<br />

/* code for a */<br />

action1<br />

call c<br />

action 2<br />

end<br />

/* code for b */<br />

action 3<br />

return<br />

/* code for c */<br />

action 4<br />

call b<br />

action 5<br />

call c<br />

action 6<br />

call c<br />

return<br />

The final code generated will be as follows:<br />

/* code for a */<br />
100: MOV #600, SP /* initialize stack */<br />
110: action 1<br />
120: ADD #a_size, SP<br />
130: MOV #150, *SP<br />
140: GOTO 300<br />
150: SUB #a_size, SP<br />
160: action 2<br />
170: end<br />
/* code for b */<br />
200: action 3<br />
210: GOTO *0(SP)<br />
/* code for c */<br />
300: action 4<br />
310: ADD #c_size, SP<br />
320: MOV #340, *SP<br />
330: GOTO 200<br />
340: SUB #c_size, SP<br />
350: action 5<br />
360: ADD #c_size, SP<br />
370: MOV #390, *SP<br />
380: GOTO 300<br />
390: SUB #c_size, SP<br />
400: action 6<br />
410: ADD #c_size, SP<br />
420: MOV #440, *SP<br />
430: GOTO 300<br />
440: SUB #c_size, SP<br />
450: GOTO *0(SP)<br />
600: stack starts here<br />
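The stack discipline can be traced with a small interpreter. Because function c in the listing above calls itself unconditionally, the sketch below uses a shorter hypothetical program (a calls c, and c calls b) with made-up record sizes; the instruction encoding and the step of 10 address units are also assumptions.<br />

```python
def run(program, stack_begin, start):
    """Interpret stack-allocation target code; return (trace, final SP)."""
    memory, trace, sp, pc = {}, [], stack_begin, start
    while True:
        op, *args = program[pc]
        pc += 10                      # assumption: 10 address units each
        if op == "action":
            trace.append(args[0])
        elif op == "ADD":             # ADD #size, SP
            sp += args[0]
        elif op == "SUB":             # SUB #size, SP
            sp -= args[0]
        elif op == "MOV":             # MOV #addr, *SP (save return address)
            memory[sp] = args[0]
        elif op == "GOTO":
            pc = args[0]
        elif op == "RET":             # GOTO *0(SP)
            pc = memory[sp]
        elif op == "end":
            return trace, sp


A_SIZE, C_SIZE = 20, 40               # hypothetical record sizes
program = {
    100: ("action", 1),
    110: ("ADD", A_SIZE), 120: ("MOV", 140), 130: ("GOTO", 300),  # call c
    140: ("SUB", A_SIZE),
    150: ("action", 2), 160: ("end",),
    200: ("action", 3), 210: ("RET",),                            # code for b
    300: ("action", 4),
    310: ("ADD", C_SIZE), 320: ("MOV", 340), 330: ("GOTO", 200),  # call b
    340: ("SUB", C_SIZE),
    350: ("action", 5), 360: ("RET",),                            # c returns
}
trace, final_sp = run(program, 600, 100)
```

After the run, the stack pointer is back at its initial value, which shows that every ADD in a call sequence is matched by a SUB after the return.<br />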

<strong>Code</strong> Optimization<br />

The code optimization phase is mainly used to improve the code for better utilization of memory and reduced execution time. Code optimization takes its input from the intermediate code generator and performs machine independent optimization. The code optimizer may also take input from the code generator and perform machine dependent optimization. Compilers that apply code optimization transformations are called optimizing compilers. When its input comes from the intermediate code generator, code optimization does not consider target machine properties (like register allocation and memory management).<br />

Code optimization concentrates on the parts of the code that are executed most often, such as statements inside for and while loops, because most programs spend most of their execution time in only a few statements. Code optimization analyses programs at two levels: control flow analysis and data flow analysis. In control flow analysis, code optimization concentrates on improving the code of inner loops rather than outer statements, as inner loops are executed more often than outer ones. A detailed data flow analysis is required for debugging the optimized code. Data flow analysis also collects information about the statements that are executed many times, and this information is used in the process of optimization. Code optimization should give the best results with minimum effort.<br />

Code optimization has to achieve two main goals<br />
1. Preserve the meaning of the code: the output of the program after optimization should be the same as its output without optimization.<br />
2. Reduce the cost of execution considerably: the effort spent on code optimization should be worth it.<br />

This implies that the time taken for optimization should be small compared to the reduction in overall execution time. In general, fast non-optimizing compilers are preferred for debugging programs.<br />

Code improvement need not always happen in the code optimization phase. It can be applied to the source program, the intermediate code or the target code. In the source program, say a sorting program, the user can choose a different algorithm based on a cost function such as minimum space or minimum time. Each algorithm is efficient in its own way: quick sort is very fast on unsorted (random) arrays, whereas a sort such as bubble sort is efficient on partially sorted arrays. Intermediate code can be improved by improving loops and by efficient address calculation. In the final code generation phase, efficient code can be generated by selecting appropriate instructions, using registers efficiently and applying some instruction transformations. Example: keeping the most used variables in registers avoids frequent loads from and stores to memory. This chapter deals with optimization of intermediate code represented as three address code. Intermediate code is relatively independent of the target machine, so this optimization is machine independent.<br />

Programs are represented as flow graphs to study control flow, and the temporary variables used to store intermediate results help in data flow analysis. Compilation time is proportional to the size of the program being compiled, hence the time taken for code optimization should be relatively small.<br />

Principles of code optimization<br />

This section deals with identifying the parts of the program where optimization is required. With proper register allocation, elimination of dead code and knowledge of the cost of instructions, it is possible to improve the efficiency of program statements.<br />

Unnecessary Operation<br />

A program may contain code that is never executed. It would be wasteful to generate code for such statements. It may also happen that the values of some temporary variables are never used. Such code is called dead code, and it has to be removed. There can also be sub-expressions whose value is computed many times. This can be optimized by calculating the value of the sub-expression only once and letting the other statements use this value.<br />

Example:<br />

x = 1<br />

while (x != 1)<br />

{ …}<br />

The body of the while loop is never executed, hence no code is generated for the statements within it.<br />

Example:<br />


x = y + z<br />

a = x + 10<br />

p = y + z<br />

b = p + 20<br />

Both x and p compute the same sub-expression y + z, hence code for it is generated only once, and p uses the value of x instead of re-computing it from y and z.<br />
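A minimal sketch of this elimination inside one basic block is shown below. The statement encoding `(target, left, op, right)` and the function name are assumptions made for illustration; an expression is reused only while none of its operands has been reassigned.<br />

```python
def eliminate_common_subexpressions(block):
    """Local common sub-expression elimination over one basic block.

    Statements are (target, left, op, right). A repeated expression is
    replaced by a copy (target, holder, None, None).
    """
    available = {}   # (left, op, right) -> variable currently holding it
    result = []
    for target, left, op, right in block:
        key = (left, op, right)
        holder = available.get(key)
        if holder is not None:
            result.append((target, holder, None, None))   # reuse the value
        else:
            result.append((target, left, op, right))
        # Assigning to target invalidates expressions that read target
        # or whose value was held in target.
        available = {k: v for k, v in available.items()
                     if target not in (k[0], k[2]) and v != target}
        if target not in (left, right):
            available.setdefault(key, target)
    return result


block = [("x", "y", "+", "z"), ("a", "x", "+", "10"),
         ("p", "y", "+", "z"), ("b", "p", "+", "20")]
optimized = eliminate_common_subexpressions(block)
# the third statement becomes the copy p = x
```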

After intermediate code generation there may be a jump statement whose target is the very next statement. Such a jump should be removed, which reduces both the size of the code and its execution time.<br />

Constant Folding: If the right hand side of an assignment statement consists only of constants, the value of the expression can be pre-computed at compile time.<br />
Example: y = 2 * 5 + 6<br />
The value of y can be computed as 16 and stored. The three address code generated would then be y = 16 instead of<br />
t1 = 2 * 5<br />
t2 = t1 + 6<br />
y = t2<br />
This helps in constant propagation: in the above example, if y is used in any other expression, it can be replaced by 16 instead of 2 * 5 + 6.<br />

Example:<br />
y = 2 * 5 + 6<br />
x = y + z<br />
Without optimization<br />
x = 2 * 5 + 6 + z<br />
With optimization<br />
y = 16<br />
x = 16 + z<br />
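Folding and propagation over a single basic block can be sketched together. The statement encoding `(target, left, op, right)`, with integers for constants and `op=None` for a plain assignment, is an assumption of this example.<br />

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_and_propagate(block):
    """Constant folding plus constant propagation over one basic block."""
    constants = {}   # variable -> known constant value
    result = []
    for target, left, op, right in block:
        left = constants.get(left, left)        # propagate known constants
        right = constants.get(right, right)
        if op is not None and isinstance(left, int) and isinstance(right, int):
            left, op, right = OPS[op](left, right), None, None   # fold
        if op is None and isinstance(left, int):
            constants[target] = left
        else:
            constants.pop(target, None)         # target is no longer constant
        result.append((target, left, op, right))
    return result


block = [("t1", 2, "*", 5), ("t2", "t1", "+", 6),
         ("y", "t2", None, None), ("x", "y", "+", "z")]
optimized = fold_and_propagate(block)
# the last two statements become y = 16 and x = 16 + z
```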

Some operations, like procedure calls, are very expensive, especially recursive procedure calls. To reduce this cost, recursive procedures may be converted to iterative ones by introducing labels. The issue with a procedure call is that, before control is transferred to the procedure, the status of the calling procedure has to be saved, and it has to be restored after the procedure returns. This increases the number of load and store instructions.<br />

Predicting program behavior<br />

In order to generate better optimized code, the code optimization phase has to find out the number of variables used, their value sets, and the expressions which are used many times. It should also perform some statistical analysis, such as which parts of the code are never reached, which parts are executed many times, and which procedures are likely to be called. This information helps in adjusting loop structure and procedure code to minimize execution time.<br />

Other Methods of Optimizations<br />

Some of the optimization techniques are used to improve the loop statements. These are code motion<br />

and reduction in strength of expression.<br />

<strong>Code</strong> Motion:<br />

Optimization is done for statements which are executed frequently. Hence statements whose values do not change from one iteration of a loop to the next (loop-invariant statements) should be moved out of the loop.<br />
Example:<br />
a = 1;<br />
while (a != 10)<br />
{<br />
b = x + 100;<br />
a = a + 1;<br />
}<br />
printf("%d", a);<br />
In the above example, the statement b = x + 100 inside the while loop is independent of the loop variable a, and the value of x does not change inside the loop. Hence b = x + 100 can be executed once, before or after the while loop.<br />
b = x + 100;<br />
a = 1;<br />
while (a != 10)<br />
{<br />
a = a + 1;<br />
}<br />
printf("%d", a);<br />
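Detecting which statements can be moved may be sketched as below. This is a deliberately simplified, single-pass test with an invented encoding `(target, operands)`: a statement counts as invariant when none of its operands is assigned in the loop and its target is assigned only once. A real compiler must additionally check that moving the statement is safe.<br />

```python
def hoist_invariants(body):
    """Split a loop body into (hoisted, remaining) statement lists.

    Each statement is a (target, operands) pair. Assumption: a statement
    is invariant when none of its operands is assigned anywhere in the
    loop body and its own target is assigned exactly once.
    """
    assigned = [target for target, _ in body]
    hoisted, remaining = [], []
    for target, operands in body:
        invariant = (assigned.count(target) == 1
                     and not any(o in assigned for o in operands))
        (hoisted if invariant else remaining).append((target, operands))
    return hoisted, remaining


# Loop body of the example: b = x + 100; a = a + 1
body = [("b", ("x", "100")), ("a", ("a", "1"))]
hoisted, remaining = hoist_invariants(body)
# b = x + 100 is hoisted, a = a + 1 stays in the loop
```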

Reduce the strength of expression: If the intermediate code contains a multiplication or division, it can sometimes be replaced by an addition or subtraction. This reduces the strength of the expression.<br />
Example:<br />
while (i <= 10)<br />
{<br />
i = i + 1;<br />
t1 = 4 * i;<br />
}<br />
The statements within the while loop are executed until i becomes greater than 10. If i = 0 initially, then in the first iteration i = 1 and t1 = 4, and in the second iteration i = 2 and t1 = 4 * 2 = 8. In general, the new value of t1 is<br />
t1 = 4 * (i + 1)<br />
t1 = 4 * i + 4 (∵ the old value of t1 = 4 * i)<br />
t1 = t1 + 4<br />

As the expression for evaluating t1 which requires multiplication is reduced to addition, its execution<br />

is faster.<br />
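The equivalence of the two forms can be checked directly. The function names and the loop bound parameter `n` are illustrative; both loops record the successive values of t1.<br />

```python
def with_multiplication(n):
    """Original loop: one multiplication per iteration."""
    i, values = 0, []
    while i < n:
        i = i + 1
        t1 = 4 * i
        values.append(t1)
    return values


def with_addition(n):
    """Strength-reduced loop: the multiplication becomes an addition."""
    i, t1, values = 0, 0, []    # t1 starts at 4 * i for the initial i = 0
    while i < n:
        i = i + 1
        t1 = t1 + 4             # 4 * (i + 1) = 4 * i + 4
        values.append(t1)
    return values


same = with_multiplication(10) == with_addition(10)   # both give 4, 8, ..., 40
```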

Local, Global & Inter-Procedural Optimization:<br />

In local optimization, the straight-line code within a basic block is optimized. A basic block consists only of straight-line statements, with no jumps or loops inside it. Some of the optimization techniques that can be used for local optimization are constant folding, constant propagation and algebraic transformations.<br />

Optimization that considers many basic blocks of a single procedure is called global optimization. It uses techniques like code motion, elimination of induction variables and reduction in strength of expressions. Global optimization requires data flow analysis across the jumps between basic blocks before optimization.<br />

Inter-procedural optimization deals with optimization of the entire program as a whole. This is very difficult to achieve, as it has to take care of different parameter-passing mechanisms and non-local variable access. Its advantage is that each procedure can be optimized independently and the procedures linked together at the end by a linker, which performs the optimization later on.<br />

Machine dependent optimization<br />

Some optimizations are machine dependent, like register allocation and the cost of instructions.<br />

Register Allocation:<br />

The number of times a variable is used in each block of a program may vary, but there is a fixed number of registers in the system. Hence these registers must be used efficiently. As far as possible, temporary variables and intermediate values should be kept in registers, which reduces loads from and stores to memory.<br />
Example:<br />
x = y + z<br />
a = x + 10<br />
b = x + 20<br />
As the value of x is used after it has been assigned, retain the value of x in a register to avoid storing it to and reloading it from memory.<br />

Cost of Instructions:<br />

Each instruction takes some machine cycles to perform its operation. The optimization strategy should reduce the number of machine cycles; in other words, the strength of instructions should be reduced.<br />
Example:<br />
x^2 can be replaced by the expression x * x.<br />
Expressions like adding 0 or multiplying by 1 can be removed, as they do not change the value of the variable.<br />

Example:<br />

1) x = x + 0<br />

2) a = a * 1<br />

These instructions can be eliminated as they do not change the value of x and a.<br />

This is called algebraic transformation.<br />
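These rules can be sketched as a function over single statements. The encoding `(target, left, op, right)` and the use of `"**"` for squaring are assumptions of this example.<br />

```python
def simplify(target, left, op, right):
    """Simplify target = left op right; return None if it is a no-op."""
    if op == "**" and right == 2:
        return (target, left, "*", left)         # x ** 2  ->  x * x
    identity = {"+": 0, "*": 1}.get(op)
    if identity is not None and identity in (left, right):
        value = right if left == identity else left
        if value == target:
            return None                          # x = x + 0 changes nothing
        return (target, value, None, None)       # plain copy
    return (target, left, op, right)


results = [simplify("x", "x", "+", 0),     # dropped
           simplify("a", "a", "*", 1),     # dropped
           simplify("y", "x", "**", 2)]    # becomes y = x * x
```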

Data Structure:<br />

Syntax trees can be used for some optimization techniques like constant folding and constant propagation, but they are not very efficient for optimizations like moving loop-invariant code or dead code elimination. For global optimization in particular, the syntax tree is not efficient, as these optimizations require the study of control flow. Hence flow graphs are used. A flow graph has basic blocks as nodes, and the edges connecting basic blocks indicate the control flow. A sequence of three address statements is converted to a flow graph using the following steps.<br />

1. Construct basic block<br />

2. Generate flow graph<br />

1. Construction of Basic Blocks<br />

a) Determine set of header statements. Header statements are the first statement of each basic<br />

block.<br />

b) First statement is a header statement<br />

c) Any statement which is target of conditional or unconditional jump is a header statement.<br />

d) Any statement following conditional or unconditional jump is a header statement.<br />

2. Construct flow graph<br />

Construct the graph with B1 as the starting node, where B1 is the basic block which contains the first statement of the program. Generate an edge from Bi to Bj if control can flow from block Bi to block Bj. Entry to any block Bk is only through the first statement of Bk, and exit from Bk is only from its last statement. No intermediate jump or return can happen in a basic block.<br />

Example: Consider the following statements<br />
for i = 1 to n do<br />
for j = 1 to n do<br />
C[i, j] = 0;<br />

Three address code generated will be as follows<br />

1) i = 0<br />

2) if i < n go to 4<br />

3) go to 15<br />

4) j = 1<br />

5) if j < n go to 7<br />

6) go to 13<br />

7) t1 = i * 10<br />

8) t2 = t1 + j<br />

9) t3 = 4 * t2<br />

10) C[t3] = 0<br />

11) j = j + 1<br />

12) go to 5<br />

13) i = i + 1<br />

14) go to 2<br />

15)<br />

Basic blocks will be as follows<br />

Stmt no | Header | Three address code | Block no<br />
1 | H | i = 0 | B1<br />
2 | H | if i < n go to 4 | B2<br />
3 | H | go to 15 | B3<br />
4 | H | j = 1 | B4<br />
5 | H | if j < n go to 7 | B5<br />
6 | H | go to 13 | B6<br />
7 | H | t1 = i * 10 | B7<br />
8 | | t2 = t1 + j | B7<br />
9 | | t3 = 4 * t2 | B7<br />
10 | | C[t3] = 0 | B7<br />
11 | | j = j + 1 | B7<br />
12 | | go to 5 | B7<br />
13 | H | i = i + 1 | B8<br />
14 | | go to 2 | B8<br />
15 | | | <br />

The flow graph for the basic blocks is shown in Fig 9.1.<br />
[Figure: flow graph with nodes B1 to B8]<br />
Fig 9.1 Flow graph for the basic blocks<br />

Directed Acyclic Graph<br />
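The header rules and the edge rule can be sketched as one function. The encoding is an assumption made here: each statement number maps to a `(kind, target)` pair with kind `plain`, `if` or `goto`, statements are numbered consecutively, and a jump to a number outside the program (statement 15 in the example) is treated as the exit and produces no edge.<br />

```python
def build_flow_graph(stmts):
    """Return (blocks, edges) for stmts = {number: (kind, target)}."""
    nums = sorted(stmts)
    leaders = {nums[0]}                            # first statement
    for k, n in enumerate(nums):
        kind, target = stmts[n]
        if kind != "plain":
            leaders.add(target)                    # target of a jump
            if k + 1 < len(nums):
                leaders.add(nums[k + 1])           # statement after a jump
    blocks, block_of = {}, {}
    for n in nums:
        if n in leaders:
            name = f"B{len(blocks) + 1}"
            blocks[name] = []
        blocks[name].append(n)
        block_of[n] = name
    edges = set()
    for name, members in blocks.items():
        last = members[-1]
        kind, target = stmts[last]
        if kind != "plain" and target in block_of:
            edges.add((name, block_of[target]))    # jump edge
        if kind != "goto" and last + 1 in block_of:
            edges.add((name, block_of[last + 1]))  # fall-through edge
    return blocks, sorted(edges)


# Statements 1 to 14 of the example; only the jumps need explicit entries.
stmts = {n: ("plain", None) for n in range(1, 15)}
stmts.update({2: ("if", 4), 3: ("goto", 15), 5: ("if", 7),
              6: ("goto", 13), 12: ("goto", 5), 14: ("goto", 2)})
blocks, edges = build_flow_graph(stmts)
```

For this input the function produces the eight blocks of the table, with B7 holding statements 7 to 12 and back edges from B7 to B5 and from B8 to B2.<br />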

Flow graphs are mainly used for global optimization. They are not very efficient for local optimization within basic blocks. Hence Directed Acyclic Graphs (DAGs) are used. The leaves of a DAG represent variable names or constants. The interior nodes and the root of a DAG represent operators. Each node has labels, which name the variables whose most recent value it holds.<br />

For any statement a = b op c the DAG is shown in Fig 9.2.<br />
[Figure: node op labeled a, with children b and c]<br />
Fig 9.2 DAG for the expression a = b op c<br />

The leaves b and c represent variables. The interior node op represents the operator op, and a is the label of op, which holds the value of b op c.<br />

For a statement like x = y, no node is created for x. Only the label x is added to the node which has label y.<br />

Example: Consider the following code<br />

t1 = a + b<br />

t2 = t1<br />

[Figure: node ⊕ labeled t1, t2, with children a and b]<br />
Fig 9.3 DAG for the three address code<br />

The DAG for the three address code is shown in Fig 9.3. For the second statement no new node is created; it uses the same node ⊕. Initially t1 is the label of ⊕; after the second statement, t2 is also added as a label of ⊕.<br />

Example: Consider the following statements<br />

a = b + c<br />

b = a – d<br />

[Figure: root ⊖ labeled b1, with children ⊕ (labeled a0) and d0; node ⊕ has children b0 and c0]<br />
Fig 9.4 DAG for the three address code<br />

Fig 9.4 shows the DAG for the above three address code.<br />

Example: Consider the following statements<br />

c = c + d<br />

e = b + c<br />

Fig 9.5 shows the DAG for the above three address code.<br />
[Figure: root ⊕ labeled e, with children b and ⊕ (labeled c1); the node labeled c1 has children c0 and d0]<br />
Fig 9.5 DAG for the three address code<br />

Example: Consider the following expression<br />
a = -b * c + d<br />
The three address code will be<br />
t1 = –b<br />
t2 = t1 * c<br />
t3 = t2 + d<br />
a = t3<br />
[Figure: root ⊕ labeled a, t3, with children * (labeled t2) and d; node * has children ⊖ (labeled t1) and c; node ⊖ has child b]<br />
Fig 9.6 DAG for the three address code<br />

Fig 9.6 shows the DAG for the expression a = -b * c + d represented as three address code.<br />

Example: Consider the following expression<br />

a = b * d + b * d + c<br />

The three address code will be<br />

t1 = b * d<br />

t2 = b * d<br />

t3 = t1 + t2<br />

t4 = t3 + c<br />

a = t4<br />

[Figure: root ⊕ labeled t4, a, with children ⊕ (labeled t3) and c; both operands of the node labeled t3 are the node * (labeled t1, t2), which has children b and d]<br />
Fig 9.7 DAG for the three address code<br />

Fig 9.7 shows the DAG for the expression a = b * d + b * d + c represented as three address code. In this DAG the node * has two labels, t1 and t2. Hence there is no need to generate code twice for the same expression. Final code can be generated from the DAG by topological sorting. Topological sorting is a traversal of the DAG from leaves to root in which children are visited before their parents. As there can be multiple topological orders, there can be many code sequences for a single DAG.<br />
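The way nodes and labels are shared in these examples can be sketched with a small DAG builder. The class and method names are invented for illustration, only binary operators and copies are handled, and commutativity (b * d versus d * b) is not recognized.<br />

```python
class DAG:
    def __init__(self):
        self.nodes = []     # node id -> (op, left id, right id) or ("leaf", name, None)
        self.index = {}     # structural key -> node id
        self.current = {}   # variable -> node holding its most recent value
        self.labels = {}    # node id -> variables attached as labels

    def _node(self, key):
        """Return the node for key, creating it only if it does not exist."""
        if key not in self.index:
            self.index[key] = len(self.nodes)
            self.nodes.append(key)
        return self.index[key]

    def _operand(self, name):
        """Node of the latest value of name; a new leaf on first use."""
        if name not in self.current:
            self.current[name] = self._node(("leaf", name, None))
        return self.current[name]

    def assign(self, target, left, op=None, right=None):
        """Record target = left op right, or the copy target = left."""
        node = self._operand(left)
        if op is not None:
            node = self._node((op, node, self._operand(right)))
        old = self.current.get(target)
        if old is not None and target in self.labels.get(old, []):
            self.labels[old].remove(target)     # label moves to the new node
        self.current[target] = node
        self.labels.setdefault(node, []).append(target)


dag = DAG()
for stmt in [("t1", "b", "*", "d"), ("t2", "b", "*", "d"),
             ("t3", "t1", "+", "t2"), ("t4", "t3", "+", "c"), ("a", "t4")]:
    dag.assign(*stmt)
shared = dag.labels[dag.current["t1"]]   # labels on the single * node
```

Because the second b * d finds the existing node, the DAG has one * node carrying both t1 and t2.<br />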

Example:<br />

Consider intermediate code<br />

t1 = a + b<br />

a = t1<br />

t2 = b – 1<br />

b = t2<br />

t3 = b + 5<br />

[Figure: node ⊕ labeled t3, with children ⊖ (labeled t2, b1) and 5; node ⊖ has children b0 and 1; a separate node ⊕ (labeled t1, a1) has children a0 and b0]<br />
Fig 9.8 DAG for the intermediate code<br />

After topological sorting<br />
t2 = b – 1<br />
t1 = a + b<br />
a = t1<br />
b = t2<br />
t3 = b + 5<br />

Reordering the code helps in eliminating unnecessary temporaries. Hence the code would be as follows.<br />
a = a + b<br />
b = b – 1<br />
t3 = b + 5<br />
A DAG also shows how many references exist for each node, which helps in good register allocation. If a value has many references, it can be retained in a register for a long time. If a value has no further references, it can be removed from the register.<br />
