

complex task partitioning among cores. RTC is easily scalable, but relies on no packet data sharing being required and on an application whose nature leans towards event/packet balancing. Pipelining requires equal workload sharing of tasks among processing elements in order to achieve optimized throughput and efficiency (avoiding pipeline “bubbles”). On the positive side, the pipelining model is applicable to a larger class of parallel applications, achieves a smaller instruction footprint, and supports local state and data caching.

2.3 Autonomic Layer

The autonomic layer used for this paper’s simulations contains an autonomic element for each of the system’s three CPUs. Autonomic elements for the bus, memory, MAC and interrupt controller are not included, as the CPUs are currently our primary target for optimization. However, AEs for the remaining functional elements are being considered for future development.

Along with an LCT evaluator, the CPU AEs contain several monitors and actuators to interact with the associated CPU FE. Each of these will be discussed in more detail in the following sections.
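To make this structure concrete, the following C sketch shows one possible way such a CPU AE could be organized; the type and field names (cpu_ae_t, load_delta, the actuator callbacks, the opaque lct handle) are illustrative assumptions and not taken from the paper.

```c
/* Sketch of one possible CPU autonomic element (AE) layout;
 * all type and field names here are illustrative, not from the paper. */
typedef struct {
    /* Monitors: values sampled from the associated CPU functional element (FE). */
    double util;        /* fraction of busy cycles, 0.0 .. 1.0            */
    double freq_hz;     /* current clock frequency                        */
    double load;        /* derived monitor: util * freq, see Eq. (1)      */
    double load_delta;  /* own workload minus average workload of all CPUs */

    /* Actuators: callbacks that act on the CPU FE. */
    void (*scale_freq)(double factor);                 /* relative frequency change */
    void (*migrate_task)(int task_id, int target_cpu); /* non-preemptive migration  */

    /* LCT evaluator state (kept opaque in this sketch). */
    void *lct;
} cpu_ae_t;
```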

2.3.1 Monitors

Two local monitors keep track of the current CPU frequency and utilization; their values are multiplied together to produce a third monitor value, the CPU’s workload:

Load_CPU = Util_CPU · Freq_CPU    (1)

Whereas the utilization indicates the percentage of cycles that the CPU is busy (i.e. is processing a packet rather than waiting in an idle loop), the workload indicates the actual amount of work that the CPU is performing per unit of time (useful cycles per second). Since the resulting value is helpful in comparing the amount of processing power being contributed by each CPU, it is shared with the other AEs over the AE interconnect. The workload information can then be averaged over all three CPUs, and by comparing its own workload with this average, each CPU can determine whether it is performing more or less than its “fair share” of work. The difference between the CPU’s workload and the average workload therefore provides another useful monitor value.
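A short C sketch of these two derived monitor values follows; the function names and the loads[] array (holding the workloads shared over the AE interconnect) are assumptions made for illustration.

```c
#include <stddef.h>

/* Workload monitor of Eq. (1): useful cycles per second. */
static double cpu_load(double util, double freq_hz)
{
    return util * freq_hz;
}

/* Deviation of one CPU's workload from the average workload of all CPUs.
 * loads[] holds the workload values shared over the AE interconnect. */
static double load_deviation(const double loads[], size_t n_cpus, size_t self)
{
    double sum = 0.0;
    for (size_t i = 0; i < n_cpus; i++)
        sum += loads[i];
    double avg = sum / (double)n_cpus;
    return loads[self] - avg;   /* > 0: doing more than its "fair share" */
}
```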

2.3.2 Actuators

In order to allow the AE to effect changes in CPU operation, two actuators are provided. The first of these simply scales the frequency by a certain value. Note that this is a relative change in frequency, i.e. the new frequency value depends on the old one, which makes it easier to provide classifier rules that cover a larger range of monitor inputs. For example, with relative rules it is possible to express the statement “when utilization is high, increase the frequency” using a single classifier rule, which would not be possible with an absolute actuator that sets the frequency to a fixed value.
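The effect of a relative actuator can be illustrated with a minimal C sketch; the 10% scaling factor, the 0.9 utilization threshold and the clamped frequency range are assumptions for this example, and the if-statement is only a simplified stand-in for a classifier rule.

```c
/* Relative frequency actuator: the new frequency depends on the old one,
 * so a single rule can cover a wide range of monitor inputs. */
static double scale_freq(double cur_freq_hz, double factor,
                         double min_hz, double max_hz)
{
    double f = cur_freq_hz * factor;
    if (f < min_hz) f = min_hz;   /* clamp to the CPU's supported range */
    if (f > max_hz) f = max_hz;
    return f;
}

/* One relative rule: "when utilization is high, increase the frequency"
 * (here: by 10%), regardless of the current absolute frequency. */
static double apply_rule(double util, double cur_freq_hz)
{
    if (util > 0.9)
        return scale_freq(cur_freq_hz, 1.10, 100e6, 1e9);
    return cur_freq_hz;
}
```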

The second actuator triggers a task migration from the AE’s CPU to one of the other CPUs. After migration of a task, any further interrupts to start the execution of that task will be serviced by the target CPU instead. If a task migration is triggered while the task is already running, execution of the task is completed first (non-preemptive task migration). This minimizes the amount of state information that needs to be transferred from one core to another, reducing the performance impact of the migration.
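The non-preemptive behaviour could be sketched in C as follows; the task_t fields and the deferred-migration bookkeeping are hypothetical and serve only to illustrate the semantics described above.

```c
#include <stdbool.h>

typedef struct {
    int  owner_cpu;        /* CPU whose interrupts currently run this task */
    int  pending_target;   /* -1 if no migration is pending                */
    bool running;          /* is an instance of the task currently executing? */
} task_t;

/* Actuator entry point: request migration of a task to another CPU. */
void request_migration(task_t *t, int target_cpu)
{
    if (t->running) {
        t->pending_target = target_cpu;   /* defer until the current run finishes */
    } else {
        t->owner_cpu = target_cpu;        /* re-route future task interrupts now  */
        t->pending_target = -1;
    }
}

/* Called when the current execution of the task completes. */
void on_task_finished(task_t *t)
{
    t->running = false;
    if (t->pending_target >= 0) {         /* complete the deferred migration */
        t->owner_cpu = t->pending_target;
        t->pending_target = -1;
    }
}
```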
