23.08.2013 Views

MNEMEE - Electronic Systems - Technische Universiteit Eindhoven


For multimedia applications this is often the case, provided that the application can be analyzed and optimized sufficiently at compile time. Moreover, the cycle cost for the solution on the right can become even less than the sum of C2a and C2b if the block transfers between the two memories are handed over to a DMA controller, which can operate in parallel with the processor.

This is, in effect, the same principle as the one behind hardware caches, except that hardware caches cannot exploit application knowledge, and energy consumption in caches is typically much higher than for a simple local memory. By using software-controlled local memories and application knowledge, hardware caches can be beaten both in terms of performance and in terms of energy consumption.
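
The overlap between block transfers and processing mentioned above is commonly obtained with double buffering. The following is only a minimal sketch in C, not the project's implementation: `dma_start`, `dma_wait`, `BLOCK` and `sum_double_buffered` are hypothetical names, and the DMA controller is emulated with a synchronous `memcpy` so that the example runs on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK 64 /* hypothetical block size, in array elements */

/* Hypothetical stand-ins for a platform DMA interface. In this sketch the
 * copy completes synchronously inside dma_start; on real hardware dma_start
 * would return at once and dma_wait would block until the transfer is done. */
static void dma_start(int *dst, const int *src, size_t n)
{
    memcpy(dst, src, n * sizeof *dst);
}

static void dma_wait(void)
{
}

/* Sums n elements (n a multiple of BLOCK) from the large memory, processing
 * one local buffer while the next block is fetched into the other one. */
long sum_double_buffered(const int *data, size_t n)
{
    static int local[2][BLOCK]; /* assumed to be placed in local memory */
    size_t nblocks = n / BLOCK;
    long sum = 0;

    assert(n % BLOCK == 0);
    if (nblocks == 0)
        return 0;

    dma_start(local[0], data, BLOCK); /* prefetch the first block */
    for (size_t b = 0; b < nblocks; b++) {
        dma_wait(); /* block b is now available in local[b % 2] */
        if (b + 1 < nblocks) /* start fetching block b + 1 into the other buffer */
            dma_start(local[(b + 1) % 2], data + (b + 1) * BLOCK, BLOCK);
        for (size_t i = 0; i < BLOCK; i++)
            sum += local[b % 2][i];
    }
    return sum;
}
```

With a real asynchronous DMA controller, the transfer of block b + 1 would proceed while the inner loop processes block b, which is how the total cycle cost can drop below the sum of the processing and transfer costs.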

Figure 20 - Exploiting the memory hierarchy.

Work flow

The proposed work flow for identifying the available re-use in an application and for assessing the potential to exploit this re-use is depicted in Figure 21. The coloured blocks indicate the different tools used in particular stages. The details of the tools used in this work flow are discussed in Section 8.4.

First, the application, specified in C source code, is profiled for both array accesses and cycle counts. The application source code was written in compliance with the CleanC specifications. The Atomium toolset is used for the profiling. The profiling information is stored in a database, from which reports can be generated. This allows the designer to identify the most important data structures and the most important code sections. The profiling information is further used to quantify the potential gains in the platform-specific part of the flow.
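
To give an impression of what array-access profiling measures, the sketch below counts reads and writes per array by routing accesses through small wrapper functions. It is an illustration only and does not reflect how Atomium instruments code; `access_profile`, `profiled_read`, `profiled_write` and `sliding_sum3` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-array access record (not Atomium's actual data model). */
typedef struct {
    const char *name;
    unsigned long reads;
    unsigned long writes;
} access_profile;

static int profiled_read(access_profile *p, const int *a, size_t i)
{
    p->reads++;
    return a[i];
}

static void profiled_write(access_profile *p, int *a, size_t i, int v)
{
    p->writes++;
    a[i] = v;
}

/* Instrumented 3-tap sliding sum: out[i] = in[i] + in[i+1] + in[i+2].
 * Produces n - 2 outputs from n inputs (n >= 2). */
void sliding_sum3(access_profile *pin, access_profile *pout,
                  const int *in, int *out, size_t n)
{
    assert(n >= 2);
    for (size_t i = 0; i + 2 < n; i++) {
        int s = profiled_read(pin, in, i)
              + profiled_read(pin, in, i + 1)
              + profiled_read(pin, in, i + 2);
        profiled_write(pout, out, i, s);
    }
}
```

For a 10-element input, this kernel performs 24 reads on an array of only 10 elements. A read count well above the array size is the kind of signal that points to exploitable re-use.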

The re-use analysis identifies potential re-use buffers. These so-called copy-candidates can be depicted in a graph for inspection by the designer. The information is also passed on to the next stage, where the most profitable set of copy-candidates is selected for a specific platform. This trade-off between different solutions is made based on the copy-candidate information, the profiling data and the platform description. For each solution, the cost in terms of cycle counts and power consumption can be estimated. Moreover, a schedule of the data transfers and data lifetimes can be constructed. The MH tool is used for this analysis.
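
The selection step can be pictured as choosing the subset of copy-candidates with the largest estimated gain that still fits in the local memory of the platform. The sketch below is a deliberately simplified assumption, not the algorithm of the MH tool: it ignores dependencies between candidates, data lifetimes and energy, and the `copy_candidate` type and `select_candidates` function are hypothetical.

```c
#include <assert.h>

/* Hypothetical, simplified description of one copy-candidate. */
typedef struct {
    const char *name;
    unsigned size_bytes; /* size of the re-use buffer */
    long cycles_saved;   /* estimated gain, e.g. derived from profiling data */
} copy_candidate;

/* Exhaustively searches all subsets of up to 16 candidates and returns a
 * bitmask of the subset with the largest total gain that fits in
 * capacity_bytes of local memory. The total gain is stored in *best_gain. */
unsigned select_candidates(const copy_candidate *c, unsigned n,
                           unsigned capacity_bytes, long *best_gain)
{
    unsigned best_mask = 0;

    assert(n <= 16);
    *best_gain = 0;
    for (unsigned mask = 0; mask < (1u << n); mask++) {
        unsigned long size = 0;
        long gain = 0;
        for (unsigned i = 0; i < n; i++) {
            if (mask & (1u << i)) {
                size += c[i].size_bytes;
                gain += c[i].cycles_saved;
            }
        }
        if (size <= capacity_bytes && gain > *best_gain) {
            best_mask = mask;
            *best_gain = gain;
        }
    }
    return best_mask;
}
```

Given candidates of 600, 500 and 400 bytes with estimated gains of 1000, 700 and 600 cycles and a 1000-byte local memory, the search selects the first and third candidate (bitmask 5) for a total gain of 1600 cycles. Exhaustive search is only practical for a small number of candidates.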

Finally, once a solution has been selected, the source code can be transformed to include the re-use buffers and data transfers.
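
As an illustration of such a transformation, the sketch below shows a small kernel before and after a re-use buffer has been introduced. The kernel, the array sizes and all names are hypothetical, and the placement of `coeff_buf` in local memory is an assumption that a real platform would express through its own mapping mechanism.

```c
#include <string.h>

#define ROWS 8
#define COLS 16

/* Before: every coeff[j] is fetched from the large memory ROWS times. */
long weighted_sum_original(const int *in, const int *coeff)
{
    long acc = 0;
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            acc += (long)in[i * COLS + j] * coeff[j];
    return acc;
}

/* After: coeff[] is copied once into a re-use buffer, and all subsequent
 * reads are served from that buffer. */
long weighted_sum_with_reuse_buffer(const int *in, const int *coeff)
{
    int coeff_buf[COLS]; /* re-use buffer, assumed to be mapped to local memory */
    long acc = 0;

    /* Data transfer; on a real platform this could be a DMA block transfer. */
    memcpy(coeff_buf, coeff, sizeof coeff_buf);
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            acc += (long)in[i * COLS + j] * coeff_buf[j];
    return acc;
}
```

Both versions compute the same result. The transformed version replaces ROWS * COLS reads of `coeff` in the large memory with COLS reads for the copy, plus ROWS * COLS reads of the cheaper buffer.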

