JetBench: An Open Source Real-Time Multiprocessor Benchmark

Muhammad Yasir Qadri 1, Dorian Matichard 2, and Klaus D. McDonald Maier 1

1 School of Computer Science and Electronic Engineering, University of Essex, CO4 3SQ, UK
yasirqadri@acm.org, kdm@essex.ac.uk
2 Ecole Nationale d’Electronique, Informatique et Radiocommunications de Bordeaux, ENSEIRB
matichar@enseirb.fr

Abstract. Performance comparison among various architectures is generally performed using standard benchmark tools. This paper presents JetBench, an open source, OpenMP-based multicore benchmark application that can be used to analyse the real-time performance of a specific target platform. The application is designed to be platform independent by avoiding target-specific libraries and hardware counters and timers. JetBench uses jet engine parameters and thermodynamic equations presented in NASA's EngineSim program, and emulates a real-time jet engine performance calculator. The user can define a flight profile with timing constraints and adjust the number of threads. This paper discusses the structure of the application, its thread distribution, and its scalability on a custom symmetric multicore platform based on a cycle-accurate full system simulator.

Keywords: Real-time, Multiprocessor, Application Benchmark.

1 Introduction

Benchmarks are generally classified into two types: 1) synthetic benchmarks and 2) application benchmarks. Synthetic benchmarks are designed to exploit a particular property of a processor, such as instructions per second (IPS), cache performance, or I/O bandwidth, whereas application benchmarks are centred on one particular application domain, such as automotive or office automation. Using benchmarks to characterize system performance is common practice, and some processor manufacturers have proposed their own benchmarks [1]. However, such benchmarks strive to give better performance on a particular platform; third-party benchmarks are a good way to compare the performance of various architectures impartially and transparently.

The JetBench benchmark presented in this paper is an application benchmark written in C for real-time jet engine thermodynamic calculations. It is a multithreaded application for shared memory architectures. The benchmark is based on OpenMP [2] and can be ported seamlessly to any platform supporting it. The benchmark gives the user the flexibility to specify a custom workload, which could be a real flight

C. Müller-Schloer, W. Karl, and S. Yehia (Eds.): ARCS 2010, LNCS 5974, pp. 211–221, 2010. © Springer-Verlag Berlin Heidelberg 2010