15.11.2012 Views

Parallelized Critical Path Search in Electrical Circuit Designs


Figure 6. Runtime of the reachable function compared to standard Dijkstra algorithm.

that is presented in Figure 7, is based on this implementation. In Figure 7, the speed-up gained in the runtime is 15 seconds on average. Because only operators are flagged with the arcs, this value varies with how the design under test is implemented. For example, if long chains of operators are used in the design, then a higher speed-up can be expected.

6. Related Work

Several articles on dynamic partitioning, by Walshaw et al. [21], Diniz et al. [6] and Lohner et al. [14], discuss parallel algorithms that dynamically partition unstructured grids or mesh networks for load balancing, which is related to graph partitioning. All of them aim to improve performance on multi-core systems.

To handle search in large graphs, the memory of the machine has to be taken into account. A special graph partitioning algorithm using hMetis partitioning is proposed in [11]. An approach adapted and optimized for the BlueGene/L system is the scalable parallel breadth-first search algorithm [20]. However, this algorithm is limited to Poisson random graphs.

7. Conclusion and Future Work

Partitioning of large graph data is a compute-intensive task. However, once the partitioning is done, subsequent shortest path queries can be performed reasonably fast. Combined with preprocessing and the Arc-Flag approach, the response time can be further reduced.
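
As an illustration of how flagged edges can reduce query time, the following minimal Python sketch shows the textbook arc-flag idea. It is not the implementation evaluated in this paper, which flags only operators and may differ in detail. The assumptions are: the graph is an adjacency dictionary with non-negative integer weights and no parallel edges, `region` maps each node to a partition id, and the function names (`distances_to`, `compute_arc_flags`, `arc_flag_query`) are hypothetical. Flags are computed here with one backward Dijkstra per node, which is simple but far from optimal.

```python
import heapq


def distances_to(radj, target):
    """Backward Dijkstra on the reversed graph: distance from every node to target."""
    dist = {target: 0}
    heap = [(0, target)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale heap entry
        for u, w in radj.get(v, ()):
            if d + w < dist.get(u, float("inf")):
                dist[u] = d + w
                heapq.heappush(heap, (d + w, u))
    return dist


def compute_arc_flags(adj, region):
    """Flag edge (u, v) for region r if it lies on a shortest path to some node in r."""
    radj = {}
    for u, edges in adj.items():
        for v, w in edges:
            radj.setdefault(v, []).append((u, w))
    flags = {}  # (u, v) -> set of region ids
    for t in region:
        dist = distances_to(radj, t)
        for u, edges in adj.items():
            for v, w in edges:
                if u in dist and v in dist and dist[u] == dist[v] + w:
                    flags.setdefault((u, v), set()).add(region[t])
    return flags


def arc_flag_query(adj, flags, region, source, target):
    """Dijkstra that relaxes only edges flagged for the target's region.

    Returns (distance, number of settled nodes), or (None, settled) if unreachable.
    """
    goal = region[target]
    dist = {source: 0}
    heap = [(0, source)]
    settled = 0
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        settled += 1
        if u == target:
            return d, settled
        for v, w in adj.get(u, ()):
            if goal not in flags.get((u, v), ()):
                continue  # edge cannot be on a shortest path into the goal region
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return None, settled
```

In this sketch the flags are computed once, and every later query skips edges that cannot lead toward the target's region, so the search settles fewer nodes than a plain Dijkstra run on the same graph.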

Figure 7. Runtime of the Arc-Flag approach compared to standard Dijkstra algorithm.

To achieve a significant speed-up, a combination of various methods is useful. First, the graph needs to be partitioned using one of the static partitioning algorithms introduced. Once the partitioning is done, each partition needs to be preprocessed, and at all entry points the “arcs” to all exit points need to be annotated. To calculate all the arcs, an all-pairs shortest path search is required. This path search can be combined with the bidirectional approach to achieve better runtime performance. This is because preprocessing the partitioned graph can be parallelized and scales almost linearly. Finally, to be able to run arbitrary parallel algorithms on a graph and gain a speed-up, partitioning is the most promising way.
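
The preprocessing steps described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation evaluated in this paper: the graph is an adjacency dictionary with non-negative edge weights, the partition assignment `part` is already given, and the helper names (`dijkstra`, `boundary_points`, `preprocess_partition`) are hypothetical. Entry points are assumed to be nodes with an incoming edge from another partition, and exit points nodes with an outgoing edge to another partition. Instead of a bidirectional search, the sketch runs a plain Dijkstra from each entry point, restricted to the nodes of its partition.

```python
import heapq
from collections import defaultdict


def dijkstra(adj, source, allowed=None):
    """Shortest distances from source; if allowed is given, never leave that node set."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj.get(u, ()):
            if allowed is not None and v not in allowed:
                continue
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def boundary_points(adj, part):
    """Collect entry and exit points per partition from the cross-partition edges."""
    entries, exits = defaultdict(set), defaultdict(set)
    for u, edges in adj.items():
        for v, _ in edges:
            if part[u] != part[v]:
                exits[part[u]].add(u)
                entries[part[v]].add(v)
    return entries, exits


def preprocess_partition(adj, part, pid, entries, exits):
    """Annotate each entry point of partition pid with the distance to each reachable exit point."""
    members = {n for n, p in part.items() if p == pid}
    arcs = {}
    for entry in entries[pid]:
        dist = dijkstra(adj, entry, allowed=members)
        for exit_node in exits[pid]:
            if exit_node in dist:
                arcs[(entry, exit_node)] = dist[exit_node]
    return arcs


# Tiny example: three partitions chained together.
adj = {
    "a": [("b", 1)],
    "b": [("c", 2)],
    "c": [("d", 1), ("e", 5)],
    "d": [("e", 1)],
    "e": [("f", 3)],
    "f": [],
}
part = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 1, "f": 2}
entries, exits = boundary_points(adj, part)
arcs_by_partition = {
    pid: preprocess_partition(adj, part, pid, entries, exits)
    for pid in sorted(set(part.values()))
}
```

Each call to `preprocess_partition` reads only the nodes of its own partition, so the calls are independent of each other and could be distributed over worker processes, which is consistent with the near-linear scaling of the preprocessing noted above.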

As for future work, the suggested and implemented reachable function needs further analysis because the experimental result is much slower than expected. There is a good chance that the reachable function can be further improved for use in an application. In addition, more experimental results from more prototype implementations are needed to rate the performance and usability of the various speed-up techniques presented.

Acknowledgment

This paper is funded by the Federal Ministry of Education and Research (BMBF) project “Hardware Design Techniques for Zero Defect Designs” (HERKULES), grant number 01M3082.
