PROGRAM STRUCTURE TREES - Software Systems Lab
1 Introduction
Modern static analysis tools and optimizing compilers apply powerful
analyses and optimizations to programs. To achieve the best possible
results, analyses and transformations are developed that are powerful
but often applicable only if the program to be analysed fulfills certain
properties. In general, a complete program does not fulfill the
restrictions imposed by the intended analysis. However, it is possible
to extract and analyse just the regions of a program that satisfy the
required restrictions. One way to find these regions is to find all
possible regions and to remove those that do not satisfy the
restrictions.
It is therefore interesting to understand the different algorithms
available to detect the regions in a program and to investigate the
advantages and disadvantages of each.
2 Basic components
2.1 Control flow graph<br />
In compilers, the code of a function, as seen in Figure 1, can be described
using a control flow graph (CFG) G. G = (V, E) consists of a set of
vertices V, called basic blocks, and a set of edges E connecting these
basic blocks. Every basic block contains a list of statements.
The execution of a function is defined as a walk over the CFG in which,
every time a basic block is visited, its statements are executed in linear
order. The walk always starts at a specific basic block, the entry basic
block, and ends when it arrives at a basic block that is terminated by a
“return” statement.
To represent non-linear control flow, a branch statement may terminate
a basic block. Based on the result of a condition, a branch statement
passes control to another basic block. Control flow always follows the
edges of the CFG.
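The description above can be made concrete with a small sketch. The
following Python snippet is only an illustration under stated assumptions:
the names BasicBlock, walk and choose, as well as the example blocks, are
hypothetical and do not come from any particular compiler. A real branch
would evaluate its condition; here a caller-supplied function picks the
successor instead.

```python
from dataclasses import dataclass, field


@dataclass
class BasicBlock:
    # Hypothetical representation of a basic block, for illustration only.
    name: str
    statements: list = field(default_factory=list)  # executed in linear order
    successors: list = field(default_factory=list)  # names of the target blocks
    is_return: bool = False                         # terminated by a "return"


def walk(cfg, entry, choose):
    """Walk the CFG from the entry block and return the visited block names.

    `choose` stands in for evaluating a branch condition: it receives the
    current block and returns the name of the successor to continue with.
    """
    trace = []
    block = cfg[entry]
    while True:
        trace.append(block.name)
        if block.is_return:
            return trace
        block = cfg[choose(block)]


# A function with one conditional branch (a diamond-shaped CFG).
cfg = {
    "entry": BasicBlock("entry", ["x = read()"], ["then", "else"]),
    "then": BasicBlock("then", ["y = 1"], ["exit"]),
    "else": BasicBlock("else", ["y = 2"], ["exit"]),
    "exit": BasicBlock("exit", ["return y"], [], is_return=True),
}

# Always take the first successor, i.e. the "then" side of the branch.
trace = walk(cfg, "entry", lambda block: block.successors[0])
print(trace)  # ['entry', 'then', 'exit']
```

The walk follows only edges stored in the successor lists, which mirrors
the statement that control flow always follows the edges of the CFG.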
2.2 (Simple) Region<br />
A connected subgraph of the CFG that has only two connections to
the remaining CFG, an incoming and an outgoing edge, is called a
(single entry single exit) region. Such a region can be analyzed and
transformed like a separate function. This can be modeled, as seen in
Figure 2, by replacing the orange region with a call to a function that
contains the orange CFG region. Moving or replacing the entire region is
as simple as moving two edges in the CFG or, if the region is extracted
as a function, changing a function call.
A region is called a trivial region if it contains exactly one basic block.
A region A is called a canonical region if there is no set of regions
that can be combined to construct A.
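The definition of a region can be checked directly on an edge list. The
sketch below is a minimal illustration, not an efficient detection
algorithm: the function names and the example graph are hypothetical, and
"connected" is interpreted here as weakly connected (edge direction is
ignored), which is an assumption.

```python
def boundary_edges(edges, blocks):
    """Return the edges entering and the edges leaving the set `blocks`."""
    incoming = [(u, v) for (u, v) in edges if u not in blocks and v in blocks]
    outgoing = [(u, v) for (u, v) in edges if u in blocks and v not in blocks]
    return incoming, outgoing


def is_connected(edges, blocks):
    """Check weak connectivity of the subgraph induced by `blocks`."""
    if not blocks:
        return False
    neighbours = {b: set() for b in blocks}
    for u, v in edges:
        if u in blocks and v in blocks:
            neighbours[u].add(v)
            neighbours[v].add(u)
    seen = set()
    stack = [next(iter(blocks))]
    while stack:
        b = stack.pop()
        if b in seen:
            continue
        seen.add(b)
        stack.extend(neighbours[b] - seen)
    return seen == blocks


def is_simple_region(edges, blocks):
    """True if `blocks` is connected with exactly one entry and one exit edge."""
    blocks = set(blocks)
    incoming, outgoing = boundary_edges(edges, blocks)
    return is_connected(edges, blocks) and len(incoming) == 1 and len(outgoing) == 1


# Hypothetical CFG: a diamond between an entry block and an exit block.
edges = [
    ("entry", "a"), ("a", "b"), ("a", "c"),
    ("b", "d"), ("c", "d"), ("d", "exit"),
]

print(is_simple_region(edges, {"a", "b", "c", "d"}))  # True: the whole diamond
print(is_simple_region(edges, {"b"}))                 # True: a trivial region
print(is_simple_region(edges, {"a", "b"}))            # False: two exit edges
```

Checking every subset of blocks this way is exponential in the number of
blocks, so the sketch only serves to make the definition concrete.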
2.3 Refined Region<br />
The definition of a region can be extended to a so-called refined region.
A refined region is a connected subgraph of a CFG that can be transformed
into a region by inserting two empty basic blocks that join multiple entry
or exit edges.