Bibliography[1] Nvidia CUDA (2009-09-30). http://www.nvidia.com/cuda.[2] Cyth<strong>on</strong>:C-Extensi<strong>on</strong>s <str<strong>on</strong>g>for</str<strong>on</strong>g> <str<strong>on</strong>g>Pyth<strong>on</strong></str<strong>on</strong>g> (2009-09-30). http://www.cyth<strong>on</strong>.org.[3] OpenCL - The open standard <str<strong>on</strong>g>for</str<strong>on</strong>g> parallel programming <str<strong>on</strong>g>of</str<strong>on</strong>g> heterogeneous systems (2009-09-30). http://www.khr<strong>on</strong>os.org/opencl/.[4] PyPy project (2009-09-30). http://codespeak.net/pypy/dist/pypy/doc/.[5] Pyrex - a Language <str<strong>on</strong>g>for</str<strong>on</strong>g> Writing <str<strong>on</strong>g>Pyth<strong>on</strong></str<strong>on</strong>g> Extensi<strong>on</strong> Modules (2009-09-30). http://www.cosc.canterbury.ac.nz/greg.ewing/pyth<strong>on</strong>/Pyrex/.[6] <str<strong>on</strong>g>Pyth<strong>on</strong></str<strong>on</strong>g>/C API Reference Manual (2009-09-30). http://www.pyth<strong>on</strong>.org/doc/2.5/api/api.html.[7] Unladen Swallow: A faster implementati<strong>on</strong> <str<strong>on</strong>g>of</str<strong>on</strong>g> <str<strong>on</strong>g>Pyth<strong>on</strong></str<strong>on</strong>g> (2009-09-30). http://code.google.com/p/unladen-swallow.[8] AMD. ATI Stream Computing User Guide, February 2009.[9] AMD. Compute Abstracti<strong>on</strong> Layer (CAL) Intermediate Language (IL) Reference Guide,February 2009.[10] AMD. R700-Family Instructi<strong>on</strong> Set Architecture, March 2009.[11] D. Anc<strong>on</strong>a, M. Anc<strong>on</strong>a, A Cuni, and N. Matsakis. R<str<strong>on</strong>g>Pyth<strong>on</strong></str<strong>on</strong>g>: A Step Towards Rec<strong>on</strong>cilingDynamically and Statically Typed OO Languages. In OOPSLA 2007 Proceedingsand Compani<strong>on</strong>, DLS’07: Proceedings <str<strong>on</strong>g>of</str<strong>on</strong>g> the 2007 Symposium <strong>on</strong> Dynamic Languages,pages 53–64. ACM, 2007.[12] Ian Buck, Tim Foley, Daniel Horn, Jeremy Sugerman, Kayv<strong>on</strong> Fatahalian, Mike Houst<strong>on</strong>,and Pat Hanrahan. Brook <str<strong>on</strong>g>for</str<strong>on</strong>g> GPUs: Stream Computing <strong>on</strong> Graphics Hardware.ACM Trans. Graph., 23(3):777–786, 2004.[13] Mark Dufour. Shed Skin - An experimental (restricted) <str<strong>on</strong>g>Pyth<strong>on</strong></str<strong>on</strong>g> to C++ compiler(2009-09-30). http://code.google.com/p/shedskin/.66
[14] Francois Lab<strong>on</strong>te, Peter Matts<strong>on</strong>, William Thies, Ian Buck, Christos Kozyrakis, andMark Horowitz. The Stream Virtual Machine. In PACT ’04: Proceedings <str<strong>on</strong>g>of</str<strong>on</strong>g> the 13thInternati<strong>on</strong>al C<strong>on</strong>ference <strong>on</strong> <str<strong>on</strong>g>Parallel</str<strong>on</strong>g> Architectures and Compilati<strong>on</strong> Techniques, pages267–277. IEEE Computer Society, 2004.[15] Sey<strong>on</strong>g Lee, Seung-Jai Min, and Rudolf Eigenmann. OpenMP to GPGPU: a compilerframework <str<strong>on</strong>g>for</str<strong>on</strong>g> automatic translati<strong>on</strong> and optimizati<strong>on</strong>. In PPoPP ’09: Proceedings <str<strong>on</strong>g>of</str<strong>on</strong>g>the 14th ACM SIGPLAN symposium <strong>on</strong> Principles and practice <str<strong>on</strong>g>of</str<strong>on</strong>g> parallel programming,pages 101–110. ACM, 2009.[16] Michael D. McCool, Zheng Qin, and Tiberiu S. Popa. Shader metaprogramming. InHWWS ’02: Proceedings <str<strong>on</strong>g>of</str<strong>on</strong>g> the ACM SIGGRAPH/EUROGRAPHICS c<strong>on</strong>ference <strong>on</strong>Graphics hardware, pages 57–68. Eurographics Associati<strong>on</strong>, 2002.[17] Yunheung Paek, Jay Hoeflinger, and David Padua. Efficient and precise array accessanalysis. ACM Trans. Program. Lang. Syst., 24(1):65–109, 2002.[18] Silvius Rus, Lawrence Rauchwerger, and Jay Hoeflinger. Run-time Assisted InterproceduralAnalysis <str<strong>on</strong>g>of</str<strong>on</strong>g> Memory Access Patterns. Technical report, Department <str<strong>on</strong>g>of</str<strong>on</strong>g> ComputerScience, Texas A&M University, 2001.[19] Silvius Rus, Lawrence Rauchwerger, and Jay Hoeflinger. Hybrid analysis: static &dynamic memory reference analysis. Internati<strong>on</strong>al Journal <str<strong>on</strong>g>of</str<strong>on</strong>g> <str<strong>on</strong>g>Parallel</str<strong>on</strong>g> Programming,31(4):251–283, 2003.[20] John Stratt<strong>on</strong>, Sam St<strong>on</strong>e, and Wen mei Hwu. MCUDA: An Efficient Implementati<strong>on</strong><str<strong>on</strong>g>of</str<strong>on</strong>g> CUDA Kernels <str<strong>on</strong>g>for</str<strong>on</strong>g> Multi-core CPUs. In 21st Annual Workshop <strong>on</strong> Languages and<str<strong>on</strong>g>Compiler</str<strong>on</strong>g>s <str<strong>on</strong>g>for</str<strong>on</strong>g> <str<strong>on</strong>g>Parallel</str<strong>on</strong>g> Computing (LCPC), July 2008.[21] William Thies, Michael Karczmarek, and Saman Amarasinghe. StreamIt: A Language<str<strong>on</strong>g>for</str<strong>on</strong>g> Streaming Applicati<strong>on</strong>s. In Proceedings <str<strong>on</strong>g>of</str<strong>on</strong>g> the Internati<strong>on</strong>al C<strong>on</strong>ference <str<strong>on</strong>g>of</str<strong>on</strong>g> <str<strong>on</strong>g>Compiler</str<strong>on</strong>g>C<strong>on</strong>structi<strong>on</strong>, 2002.[22] A. Udupa, R. Govindarajan, and M.J Thazhuthaveetil. S<str<strong>on</strong>g>of</str<strong>on</strong>g>tware Pipelined Executi<strong>on</strong><str<strong>on</strong>g>of</str<strong>on</strong>g> Stream <str<strong>on</strong>g>Programs</str<strong>on</strong>g> <strong>on</strong> GPUs. In Internati<strong>on</strong>al Symposium <strong>on</strong> Code Generati<strong>on</strong> andOptimizati<strong>on</strong> (CGO), pages 200–209, 2009.[23] Perry H. Wang, Jamis<strong>on</strong> D. Collins, Gautham N. Chinya, H<strong>on</strong>g Jiang, Xinmin Tian,Milind Girkar, Nick Y. Yang, Guei-Yuan Lueh, and H<strong>on</strong>g Wang. EXOCHI: architectureand programming envir<strong>on</strong>ment <str<strong>on</strong>g>for</str<strong>on</strong>g> a heterogeneous multi-core multithreaded system.In PLDI ’07: Proceedings <str<strong>on</strong>g>of</str<strong>on</strong>g> the 2007 ACM SIGPLAN c<strong>on</strong>ference <strong>on</strong> Programminglanguage design and implementati<strong>on</strong>, pages 156–166. ACM, 2007.67