Architecture of Computing Systems (Lecture Notes in Computer ...

Autonomic Workload Management for Multi-core Processor Systems

Johannes Zeppenfeld and Andreas Herkersdorf

Technische Universität München

{zeppenfe,herkersdorf}@tum.de

Abstract. This paper presents the use of decentralized self-organization concepts for the efficient dynamic parameterization of hardware components and the autonomic distribution of tasks in a symmetrical multi-core processor system. Using results obtained with an autonomic system on chip simulation model, we show that Learning Classifier Tables, a simplified XCS-based reinforcement learning technique optimized for a low-overhead hardware implementation and integration, achieve nearly optimal results for dynamic workload balancing at task level during run time for a standard networking application. Further investigations show the quantitative differences in optimization quality between scenarios in which local and global system information is included in the classifier rules. Autonomic workload management, or task repartitioning at run time, relieves software application developers from exploring this NP-hard problem during design time and is able to react to dynamic changes in the MP-SoC operating environment.

1 Introduction

1.1 Multi-core Processing

Single-chip multi-processors have become the mainstream architecture template for advanced microprocessor designs across a wide spectrum of application domains. Intel, the market leader for general purpose computing, abandoned its traditional strategy of scaling microprocessor performance primarily through continued increases in core clock frequency and instead introduced parallel dual-, quad- and, recently, octal-core (Nehalem) Xeon processors [1]. In its research labs, Intel integrated the 80-core Tera-Scale processor [2] with a 2D network on chip and sophisticated 3D memory stack access. SUN Niagara, ARM MPCore, IBM Cell Broadband Engine, Nvidia GeForce, the CSX700 from ClearSpeed, and the TILE64 from Tilera are further examples from a non-exhaustive list of massively parallel multi-core architectures for use in mobile communications, graphics processing, gaming, industrial automation and high-performance scientific computing. While progress in deep sub-micron CMOS technology integration has enabled the physical realization of this vast amount of nominal processing capacity on a single chip, the application programmer community, across all of the above-mentioned systems, is only minimally supported by tools and methods to efficiently exploit the available parallel resources [3]. We regard this circumstance as a major challenge for a fast and efficient adoption of multi-core processors.

C. Müller-Schloer, W. Karl, and S. Yehia (Eds.): ARCS 2010, LNCS 5974, pp. 49–60, 2010.

© Springer-Verlag Berlin Heidelberg 2010
