15.11.2012 Views

Parallelized Critical Path Search in Electrical Circuit Designs


to be done. In addition, the insertion of new elements to the result list is not applicable.

4. The Arc-Flag Approach

The Arc-Flag approach [16] presumes that a graph has already been divided into partitions. It is irrelevant which of the partitioning methods has been used to perform the partitioning. For each possible entry point into a region, the Arc-Flag approach calculates the shortest paths to all possible exit points. The path to each exit and the name of the exit point are stored at the entry point. This annotation is called the arc to the exit.

The arcs can be created dynamically. Each time a shortest path calculation is performed and a path through a region from a new entry point is requested, this shortest path is calculated and stored at the entry point. If a later shortest path calculation enters the region at a point for which the shortest path through the region has already been calculated, the stored path is reused. This saves considerable calculation time, because the shortest path search through this partition does not have to be performed again. Over time, more and more paths are cached, so the more shortest path calculations are performed, the greater the speed-up.
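The dynamic creation of arcs described above can be sketched as a per-region cache. This is a minimal illustration, not the authors' implementation: the class `RegionArcCache`, the helper `dijkstra_in_region`, the adjacency-list graph format and the restriction of the search to nodes inside the region are all assumptions made for the example.

```python
import heapq


def dijkstra_in_region(graph, region, entry):
    """Dijkstra from `entry` that only visits nodes inside `region` (assumed)."""
    dist = {entry: 0}
    pred = {entry: None}
    heap = [(0, entry)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, w in graph.get(u, []):
            if v not in region:
                continue  # stay inside the region
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                pred[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, pred


class RegionArcCache:
    """Stores, per entry point, the cost and path to every reachable exit point."""

    def __init__(self, graph, region, exits):
        self.graph = graph
        self.region = set(region)
        self.exits = list(exits)
        self.arcs = {}      # entry -> {exit: (cost, path)}
        self.searches = 0   # number of searches actually executed

    def arcs_from(self, entry):
        if entry not in self.arcs:  # new entry point: compute once and store
            self.searches += 1
            dist, pred = dijkstra_in_region(self.graph, self.region, entry)
            found = {}
            for x in self.exits:
                if x in dist:
                    path, node = [], x
                    while node is not None:
                        path.append(node)
                        node = pred[node]
                    found[x] = (dist[x], path[::-1])
            self.arcs[entry] = found
        return self.arcs[entry]     # known entry point: reuse the stored arcs


# Hypothetical four-node region with entry point "a" and exit point "d".
example_graph = {
    "a": [("b", 1), ("c", 4)],
    "b": [("c", 1), ("d", 5)],
    "c": [("d", 1)],
    "d": [],
}
cache = RegionArcCache(example_graph, {"a", "b", "c", "d"}, ["d"])
first = cache.arcs_from("a")   # triggers a search
second = cache.arcs_from("a")  # served from the cache
```

The `searches` counter makes the saving visible: entering the region twice at the same point runs only one search.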

The calculation of the arcs can also be performed as a preprocessing step before the actual shortest path calculation. As a result, all shortest paths between the entry and exit points of a partition are already known when a shortest path search on the graph is performed.

4.1. Preprocessing the Graph

Preprocessing is needed to calculate and store the Arc-Flag entries for each region of a statically partitioned graph. For this purpose, a one-to-all shortest path computation using a standard Dijkstra algorithm is performed repeatedly, which together yields the all-pairs shortest paths. Each Dijkstra run can be interrupted as soon as all nodes in a region are marked as visited.
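The interruptible Dijkstra run can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `dijkstra_until_region_settled`, the adjacency-list graph format and the example graph are hypothetical, and "marked as visited" is interpreted here as "settled", i.e. removed from the priority queue with a final distance.

```python
import heapq


def dijkstra_until_region_settled(graph, source, region):
    """One-to-all Dijkstra that stops once every node of `region` is settled."""
    remaining = set(region)       # region nodes that are not settled yet
    dist = {source: 0}
    settled = set()
    heap = [(0, source)]
    while heap and remaining:     # interrupt as soon as the region is complete
        d, u = heapq.heappop(heap)
        if u in settled:
            continue
        settled.add(u)
        remaining.discard(u)
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    # Only settled nodes carry final distances.
    return {v: dist[v] for v in settled}, settled


# Hypothetical graph: region {"a", "b"} is close to the source "s",
# while "x", "y" and "z" lie outside and need not be explored.
demo_graph = {
    "s": [("a", 1), ("x", 10)],
    "a": [("b", 1)],
    "b": [("y", 1)],
    "x": [("z", 1)],
    "y": [],
    "z": [],
}
demo_dist, demo_settled = dijkstra_until_region_settled(demo_graph, "s", {"a", "b"})
```

In this example the run stops after settling "s", "a" and "b", so the nodes outside the region are never settled.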

In the worst case, with $n$ nodes and $m$ pairs, the complexity is $O(m(m + n + n \log n))$. With $m = O(n)$ this results in $O(n^2 \log n)$. For large $n$, it is obvious that this preprocessing takes far too long.
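The simplification of the worst-case bound can be written out step by step. The only assumption is the one stated in the text, $m = O(n)$:

$$
O\bigl(m(m + n + n \log n)\bigr)
= O\bigl(n(n + n + n \log n)\bigr)
= O\bigl(2n^2 + n^2 \log n\bigr)
= O\bigl(n^2 \log n\bigr)
$$

For illustration, with $n = 10^6$ the dominant term $n^2 \log_2 n$ is roughly $2 \times 10^{13}$, ignoring constant factors.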

Moehring et al. [16] suggest two possible solutions. First, they showed that it is possible to preprocess the graph without calculating all pairs of shortest paths. Second, storing pruned shortest path trees can help to avoid this complexity problem.

4.2. Two-Level Partitioning Arc-Flag Approach

Using one of the partitioning methods mentioned earlier, in combination with the Arc-Flag approach and preprocessing the graph, the region containing the target node can be reached very fast. However, inside the target region, the path from the entry point to the target node needs to be found by a separate search. Depending on the granularity of the partition and the size of each region, such a search may need to visit many nodes in the region to which the target belongs. To avoid this behavior, two partition levels can be used. The first partition level is a coarse level, while the second one is a detailed level.

Figure 3. Coarsening

As an optimization, the detailed level could be stored only for heavily loaded regions. In Figure 3, a 5 × 5 coarse partition and a 3 × 3 fine partition for each coarse partition are used. In this case, coarsening and storing detailed levels is more memory efficient than using a 15 × 15 grid.
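The memory argument can be made concrete with a small calculation. The sketch below assumes that one label per region has to be stored (for example one flag per edge and region, as in the usual arc-flag accounting) and that fine-level labels are kept only for the coarse cell in question. The text does not state this accounting explicitly, and the function names are hypothetical.

```python
def labels_single_level(cells_per_side):
    """Number of region labels for a flat grid partition."""
    return cells_per_side * cells_per_side


def labels_two_level(coarse_per_side, fine_per_side):
    """Coarse labels plus fine labels inside a single coarse cell."""
    return coarse_per_side ** 2 + fine_per_side ** 2


def locate(x, y, width, height, coarse=5, fine=3):
    """Map a point to (coarse cell, fine cell within that coarse cell)."""
    total = coarse * fine  # resolution of the equivalent flat grid
    gx = min(int(x / width * total), total - 1)
    gy = min(int(y / height * total), total - 1)
    return (gx // fine, gy // fine), (gx % fine, gy % fine)


flat = labels_single_level(15)     # 15 x 15 grid
two_level = labels_two_level(5, 3)  # 5 x 5 coarse, 3 x 3 fine
center = locate(0.5, 0.5, 1.0, 1.0)
```

Under these assumptions the two-level scheme needs 25 + 9 = 34 labels instead of 225, while `locate` still resolves positions at the resolution of the 15 × 15 grid.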

5. Experiment

In this section, we evaluate three methods: bidirectional search, the reachable function and our adapted Arc-Flag approach. We compare them against a base algorithm, i.e. the standard Dijkstra algorithm. The bidirectional search is chosen because it is a very simple method, whereas the reachable function is included because of its fast runtime. In addition, we compare the Intel C++ compiler [10] (Intel-10.0.023), which supports auto-parallelization, with gcc version 3.4.6.

For the test environment, we use an Opteron server with two Dual Core AMD Opteron processors (275 HE) running at 2.2 GHz. Thus, four CPU cores are available for calculations. This machine is equipped with 16 GB of physical main memory, all of which can be accessed from each core. The available hard disk space adds up to 1 TB, mirrored using RAID level 1. The operating system is a 64-bit Red Hat Enterprise Linux Workstation version 4 (update 4).

For the test data, we use the freely available chip design, i.e. the Verilog register transfer level (RTL) description
