23.08.2013 Views

MNEMEE - Electronic Systems - Technische Universiteit Eindhoven


Memory hierarchy re-use

In addition to the copy candidates identified at branches and loops, more copy candidates can be generated by two techniques:

• considering both loop-carried and non-loop-carried versions of a copy candidate at loop boundaries,

• considering different combinations of copy candidates from different branches as a new copy candidate (“XOR” algorithm).

These techniques are explained by examples from the MPEG-4 encoder application.

In a re-use graph, the direction of the arrows between copy candidates indicates the direction in which data transfers between them have to occur. For example, consider Figure 25. The bi-directional arrow between @1 and @4 indicates that @4 first has to be initialized with data from @1 and that, after @4 has been accessed, its contents have to be written back to @1, i.e., it has both loads and stores. The arrow between @1 and @5, on the other hand, is uni-directional: @5 only has to load data from @1, and nothing has to be written back, because @5 is only read, not written, by accesses in the application.
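The arrow rule above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the tool's implementation: the `CopyCandidate` class, its `is_read` and `is_written` flags, and the `edge_transfers` function are hypothetical names, and the sketch assumes that a load is needed whenever the copy is read and a store whenever it is written.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CopyCandidate:
    """Hypothetical model of a copy candidate in a re-use graph."""
    name: str
    is_read: bool      # some access in the application reads this copy
    is_written: bool   # some access in the application writes this copy


def edge_transfers(parent: str, child: CopyCandidate) -> list[str]:
    """Return the transfers implied by the arrow between parent and child.

    Assumption: a read copy must be loaded from its parent, and a
    written copy must be stored back to its parent.
    """
    transfers = []
    if child.is_read:
        transfers.append(f"load {parent} -> {child.name}")
    if child.is_written:
        transfers.append(f"store {child.name} -> {parent}")
    return transfers


# @4 is read and written (bi-directional arrow); @5 is only read.
at4 = CopyCandidate("@4", is_read=True, is_written=True)
at5 = CopyCandidate("@5", is_read=True, is_written=False)

print(edge_transfers("@1", at4))
print(edge_transfers("@1", at5))
```

Under these assumptions, the edge to @4 yields both a load and a store, while the edge to @5 yields a load only.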

With the simple re-use algorithm, a copy candidate at a higher level always covers all the copy candidates at the levels below it. So, in the example, only @12C would be extracted at the “loop v” level. In practice, a copy candidate that covers all of its children does not always represent the best solution; sometimes it is better to leave out certain children. In this case, the tool has determined that there is a possibly better solution at that level that does not cover the write accesses, namely @14C. This is what the more complex algorithm does: it considers not only copy candidates that cover all of their children, but also partial combinations.
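The difference between the two algorithms can be illustrated with a small Python sketch. It is not the tool's actual enumeration strategy, which the text does not describe: the child names `r1`, `r2` and `w1`, the access-kind labels, and both functions are hypothetical, and exhaustive subset enumeration is used only because it is the simplest way to show what "partial combinations" means (it grows exponentially with the number of children).

```python
from itertools import combinations

# Hypothetical children of one level: name -> kind of access they serve.
children = {"r1": "read", "r2": "read", "w1": "write"}


def simple_candidates(children):
    """Simple re-use algorithm: a single candidate covering all children."""
    return [frozenset(children)]


def partial_candidates(children):
    """More complex algorithm (sketch): every non-empty subset of children."""
    names = sorted(children)
    return [
        frozenset(subset)
        for size in range(1, len(names) + 1)
        for subset in combinations(names, size)
    ]


def covers_no_writes(candidate, children):
    """True if the candidate leaves out every child with write accesses."""
    return all(children[name] != "write" for name in candidate)


full = simple_candidates(children)
partial = partial_candidates(children)
read_only = [c for c in partial if covers_no_writes(c, children)]

print(len(full), len(partial), len(read_only))
```

In this sketch, the single full candidate plays the role of @12C, and the read-only subset `{r1, r2}` plays the role of a candidate such as @14C that leaves out the write accesses. Which candidate is actually better depends on a cost evaluation that is outside the scope of this sketch.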
