
app          # methods (% change)   mean length (% change)   max length
montecarlo   179 (+1%)              20.82 (+18%)             231
raytracer    69 (+6%)               40.31 (+32%)             406
search       32 (+10%)              89.87 (+4%)              484

Table 3: Method statistics for selected benchmarks with inlining applied

even in cases where the program was originally written as a collection of small functions.

2. Another trend is the increasing amount of source code that is produced by automatic code generators. Such generated programs do not necessarily have the same properties as human-written code: methods are often larger, and loops are nested more deeply.

3. There is an increasing need for interprocedural data flow analysis. In the context of this paper, the SSA construction operates at a single-procedure level. However, SSA does extend naturally to the whole-program scope [19, 29]. Again, our parallel SSA construction techniques would be applicable to such large interprocedural control flow graphs.

Exploiting all the available parallel hardware for the analysis of such large SSA-based program representations is therefore of great importance in reducing compile times.

In overall terms, the largest single contributor to compilation time is parsing, and researchers are working on parallelizing the parsing process [24]. We anticipate that every significant phase of next-generation compilers should be engineered to take advantage of underlying hardware parallelism in order to achieve scalability. This includes the data flow analysis phase, of which SSA construction is generally a key initial step.
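As a concrete illustration of that initial step, the listing below is a minimal sketch of the φ-placement core of SSA construction, using precomputed dominance frontiers in the style of Cytron et al. [7]. It is not the construction algorithm evaluated in this paper; the block names, the diamond-shaped control flow graph, and the representation of definition sites and frontiers as string-keyed maps are illustrative assumptions.

import java.util.*;

/** Minimal sketch of phi-placement via dominance frontiers, in the
 *  style of Cytron et al. [7]. The CFG encoding below is an
 *  illustrative assumption, not taken from this paper. */
public class PhiPlacement {

    /** For each variable, compute the set of blocks that need a phi-function. */
    static Map<String, Set<String>> placePhis(
            Map<String, Set<String>> defSites,           // variable -> blocks that define it
            Map<String, Set<String>> dominanceFrontier)  // block -> its dominance frontier
    {
        Map<String, Set<String>> phiBlocks = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : defSites.entrySet()) {
            Set<String> hasPhi = new HashSet<>();
            Deque<String> worklist = new ArrayDeque<>(e.getValue());
            while (!worklist.isEmpty()) {
                String block = worklist.pop();
                for (String y : dominanceFrontier.getOrDefault(block, Set.of())) {
                    if (hasPhi.add(y)) {
                        // The new phi in y is itself a definition, so y may
                        // force further phis in its own dominance frontier.
                        if (!e.getValue().contains(y)) worklist.push(y);
                    }
                }
            }
            phiBlocks.put(e.getKey(), hasPhi);
        }
        return phiBlocks;
    }

    public static void main(String[] args) {
        // Diamond CFG: entry -> {then, else} -> merge; both arms define x,
        // so a phi for x is required at the merge point.
        Map<String, Set<String>> frontiers = Map.of(
                "then", Set.of("merge"),
                "else", Set.of("merge"));
        Map<String, Set<String>> defs = Map.of("x", Set.of("then", "else"));
        System.out.println(placePhis(defs, frontiers)); // prints {x=[merge]}
    }
}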

One might argue that it would be better to perform sequential (i.e. single-threaded) SSA construction on multiple methods in parallel, rather than parallelizing SSA construction for a single method. However, we note that in a JIT compiler or an interactive compilation server [16], requests for optimizing compilation of methods will occur sporadically, unpredictably, and not necessarily concurrently. Where possible, it would be best both to parallelize SSA construction for each individual method and to handle multiple methods concurrently; we have only tackled the former problem in this paper. Also, in a JIT context, response time is often more important than total performance: reducing the analysis time for the largest methods will improve worst-case response time, even if it does not improve performance on the small methods (where performance is already adequate).
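To make the two levels of parallelism concrete, the following hypothetical sketch shows per-method compilation requests served as independent tasks on a shared thread pool, as they might arrive at a compilation server; within each task, the SSA construction for that single method could itself be parallelized as described in this paper. The names (CompilationServer, buildSsa) and the use of a fixed-size java.util.concurrent pool are illustrative assumptions, not part of any system described here.

import java.util.List;
import java.util.concurrent.*;

/** Hypothetical sketch of a compilation server accepting sporadic
 *  per-method requests. buildSsa() is a placeholder assumption; it is
 *  not the construction algorithm evaluated in this paper. */
public class CompilationServer {
    private final ExecutorService pool =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

    /** Each request becomes an independent task; within a task, the
     *  per-method SSA construction could itself run in parallel. */
    public Future<String> submit(String methodName) {
        return pool.submit(() -> buildSsa(methodName));
    }

    private String buildSsa(String methodName) {
        // Placeholder for (possibly parallel) single-method SSA construction.
        return methodName + ": SSA built";
    }

    public static void main(String[] args) throws Exception {
        CompilationServer server = new CompilationServer();
        // Requests arrive sporadically and not necessarily concurrently.
        List<Future<String>> results = List.of(
                server.submit("montecarlo::run"),
                server.submit("raytracer::render"));
        for (Future<String> f : results) System.out.println(f.get());
        server.pool.shutdown();
    }
}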

7. REFERENCES

[1] Ashby, S., Eulisse, G., Schmid, S., Tuura, L.: Parallel compilation of CMS software. In: Proc. Computing in High Energy and Nuclear Physics Conference (CHEP) (2004)

[2] Baalbergen, E.: Design and implementation of parallel make. Computing Systems 1(2), 135–158 (1988)

[3] Bilardi, G., Pingali, K.: Algorithms for computing the static single assignment form. Journal of the ACM 50(3), 375–425 (May 2003)

[4] Blackburn, S.M., Garner, R., Hoffman, C., Khan, A.M., McKinley, K.S., Bentzur, R., Diwan, A., Feinberg, D., Frampton, D., Guyer, S.Z., Hirzel, M., Hosking, A., Jump, M., Lee, H., Moss, J.E.B., Phansalkar, A., Stefanović, D., VanDrunen, T., von Dincklage, D., Wiedermann, B.: The DaCapo benchmarks: Java benchmarking development and analysis. In: OOPSLA '06: Proceedings of the 21st annual ACM SIGPLAN conference on Object-Oriented Programming, Systems, Languages, and Applications. pp. 169–190 (Oct 2006)

[5] Blumofe, R.D., Joerg, C.F., Kuszmaul, B.C., Leiserson, C.E., Randall, K.H., Zhou, Y.: Cilk: an efficient multithreaded runtime system. In: Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming. pp. 207–216. ACM, New York, NY, USA (1995)

[6] Brandis, M.M., Mössenböck, H.: Single-pass generation of static single-assignment form for structured languages. ACM Transactions on Programming Languages and Systems 16(6), 1684–1698 (Nov 1994)

[7] Cytron, R., Ferrante, J., Rosen, B.K., Wegman, M.N., Zadeck, F.K.: Efficiently computing static single assignment form and the control dependence graph. ACM Trans. Program. Lang. Syst. 13(4), 451–490 (1991)

[8] Das, D., Ramakrishna, U.: A practical and fast iterative algorithm for φ-function computation using DJ graphs. ACM Transactions on Programming Languages and Systems 27(3), 426–440 (2005)

[9] Edvinsson, M., Löwe, W.: A multi-threaded approach for data-flow analysis. In: Proceedings of the IPDPS 2010 Workshop on Multi-Threaded Architectures and Applications (2010)

[10] Hill, M., Marty, M.: Amdahl's law in the multicore era. IEEE Computer 41(7), 33–38 (2008)

[11] Knoop, J.: Data-flow analysis for multi-core computing systems: A reminder to reverse data-flow analysis. In: Martin, F., Nielson, H.R., Riva, C., Schordan, M. (eds.) Scalable Program Analysis. No. 08161 in Dagstuhl Seminar Proceedings, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany (2008), http://drops.dagstuhl.de/opus/volltexte/2008/1575

[12] Knuth, D.: An empirical study of FORTRAN programs. Software: Practice and Experience 1(2), 105–133 (1971)

[13] Kotzmann, T., Wimmer, C., Mössenböck, H., Rodriguez, T., Russell, K., Cox, D.: Design of the Java HotSpot client compiler for Java 6. ACM Trans. Archit. Code Optim. 5(1), 1–32 (2008)

[14] Lattner, C., Adve, V.: LLVM: A compilation framework for lifelong program analysis & transformation. In: Proceedings of the International Symposium on Code Generation and Optimization (CGO). pp. 75–86 (2004)
