23.08.2013 Views

MNEMEE - Electronic Systems - Technische Universiteit Eindhoven


With modern applications, the streams and their encodings can be very dynamic. Smart compression, encoding and scalability features make these streams less regular than they used to be. Furthermore, these streams are typically part of a larger application. Other parts of such an application may be event-driven and interact with the stream components. Consider, for example, the imaginary game application shown in Figure 1, which is taken from [25]. This game combines modes of 3-dimensional game play with streaming-video-based modes. The rendering pipeline, used in the 3D mode, is a dynamic streaming application. Characters or objects may enter or leave the scene because of player interaction, rendering parameters may be adapted to achieve the required frame rates, and overlay graphics (for instance text or scores) may change. This happens under control of the event-driven game control logic. At the core of the application, in the streaming kernels, many intensive pixel-based operations are required to perform the various texture mapping or video filtering operations. These operations are performed on a stream of data items. At each point in time, only a small number of data items from the stream are being processed. The order in which these data items are accessed typically shows little variation. This makes it possible to optimize the memory access behaviour of the streaming kernels to achieve the required performance while minimising the energy consumption.
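The locality property described above (few live data items, a regular access order) can be illustrated with a small sketch. This is not code from the deliverable: the function name `sliding_mean`, the window size, and the choice of a moving-average filter are all assumptions made for illustration. The point is only that a streaming kernel of this shape needs a buffer as large as its window, regardless of the stream length.

```python
from collections import deque
from typing import Iterable, Iterator


def sliding_mean(stream: Iterable[float], window: int = 3) -> Iterator[float]:
    """Hypothetical streaming kernel: moving average over a fixed window.

    Only `window` items of the stream are held at any time, and they are
    accessed strictly in arrival order.
    """
    buf = deque(maxlen=window)  # the only storage the kernel needs
    total = 0.0
    for x in stream:
        if len(buf) == window:
            total -= buf[0]     # oldest item is about to leave the window
        buf.append(x)           # deque drops the oldest item automatically
        total += x
        if len(buf) == window:
            yield total / window


if __name__ == "__main__":
    print(list(sliding_mean([1, 2, 3, 4, 5], window=3)))
```

Because the working set is bounded and the access order is fixed, a buffer like `buf` is the kind of data that could be placed in a small, low-energy memory close to the processor.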

Future embedded systems are characterized not only by their increasingly dynamic behaviour due to different operation modes that may occur at run-time, but also by their dynamism in memory requirements. Next-generation embedded multimedia systems will have intensive data transfer and storage requirements. These requirements will be partly static, but partly they will also change at run-time. Therefore, efficient memory management and optimization techniques are needed to increase system performance and to decrease the cost in energy consumption and memory footprint caused by the larger memory needs of future applications. It is essential that these memory management and optimization techniques are supported by design automation tools because, given the very complex nature of the source code, it would be impossible to apply them manually in a reasonable time frame. The automation tools should be able to optimize both statically and dynamically allocated data, in order to cope with design-time needs and adapt to run-time memory needs (scalable multimedia input changing at run-time, unpredictable network traffic, etc.) in the most efficient way.

A design flow that realizes these objectives will be developed within MNEMEE. An overview of the preliminary flow is shown in Figure 2 (the final flow will be presented in D5.3). The flow takes as input the source code of an application and optimizes the memory behaviour of the application. The source code of the optimized application is the output of this flow. The first step of the flow models the source code as a task graph. The output of this step, and of all other steps, is a combination of source code and models that are transferred to the next step. In the next step, the task graph is mapped onto the processing elements that are available in the hardware platform. There will be two different mapping options available within the MNEMEE flow. The first option, called scenario-based, takes the dynamic behaviour of the application into account when mapping it onto the platform. The second option, called memory-aware, considers the memory hierarchy in the platform. After the mapping, the static and dynamic access behaviour of the application is optimized in the steps labelled ‘parallelization implementation’ and ‘dynamic memory management’. Finally, additional memory optimizations are applied on a per-processing-element basis. These optimizations are based on existing single-processor scratchpad optimization techniques [49].
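To make the ‘task graph to processing element mapping’ step more concrete, the sketch below assigns tasks to processing elements with a simple greedy load-balancing heuristic. This is a stand-in chosen for illustration only: it is neither the scenario-based nor the memory-aware mapping of the flow, and the function name `map_tasks`, the task names, and the load figures are all hypothetical.

```python
def map_tasks(task_load: dict[str, int], num_pes: int) -> dict[str, int]:
    """Greedy stand-in mapping (an assumption, not the MNEMEE technique).

    Tasks are visited from heaviest to lightest, and each one is placed
    on the processing element (PE) with the lowest accumulated load.
    """
    pe_load = [0] * num_pes
    mapping: dict[str, int] = {}
    # Sort by descending load; break ties by name for determinism.
    for task, load in sorted(task_load.items(), key=lambda kv: (-kv[1], kv[0])):
        pe = min(range(num_pes), key=lambda i: pe_load[i])
        mapping[task] = pe
        pe_load[pe] += load
    return mapping


if __name__ == "__main__":
    # Hypothetical task graph nodes with made-up load estimates.
    tasks = {"decode": 5, "filter": 3, "render": 4, "overlay": 1}
    print(map_tasks(tasks, num_pes=2))
```

A scenario-based mapping would presumably evaluate such load figures per operating scenario rather than once, and a memory-aware mapping would also weigh where each task's data resides in the memory hierarchy; neither refinement is modelled here.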

The remainder of this document presents a number of analysis approaches that can be used, within the context of the design flow, to identify memory access behaviour and dynamic behaviour in applications. Each of these analysis approaches focuses on a different source of dynamism in multimedia applications. Section 7 presents techniques to analyze and exploit the dynamically changing resource requirements of applications. These analysis techniques are part of the scenario-based approach that can be used within the ‘task graph to processing element mapping’ step of the design flow. The memory-aware approach that can also be used in this step will be developed within WP4 and described in D4.2. Section 8 presents techniques to analyze the access behaviour to statically allocated data objects. This section discusses the step labelled ‘parallelization implementation’ in the design flow of Figure 2. Section 9 focuses on the access behaviour to dynamically allocated data objects. This section discusses the details of the step labelled ‘dynamic memory management optimizations’. Finally, Section 10 concludes this deliverable.

