
FIGURE 18.3 Microcontroller and co-processor.

TABLE 18.3 Frequency and Power Consumption for a JPEG Compressor

                        No. of Cycles per Pixel    Frequency    Power
  DCT co-processor      3.6                        100 MHz      110 µA/MHz
  Huffman co-processor  3.8                        130 MHz       20 µA/MHz
  JPEG compression      3.8                        100 MHz      130 µA/MHz

co-processors working in a pipeline, i.e., a DCT co-processor based on an instruction set (program-memory based) and a Huffman encoder based on random logic and finite state machines, one obtains the following results (Table 18.3, synthesized by Synopsys in a 0.25 µm TSMC process): at 2.5 V, 400 images can be compressed per second with a power consumption of 13 mA. At 1.05 V, 400 images can be compressed per second with a power consumption of 1 mA, resulting in the quite large figure of 80,000 compressed images per watt (1000 times better than a program-based implementation).
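As a rough consistency check (assuming, purely for illustration, images of 256 × 256 pixels, a size the text does not state), the throughput follows from the JPEG line of Table 18.3:

    images per second ≈ frequency / (cycles per pixel × pixels per image)
                      = 100 × 10^6 / (3.8 × 65,536) ≈ 400

which agrees with the 400 images per second quoted above.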

Figure 18.3 shows an interesting architecture for saving power. In any application there is some control work, which is performed by a microcontroller (the best machine for control). In most applications, however, there is also a main task to execute, such as DSP tasks, convolutions, JPEG, or other tasks. The best architecture is to design a specific machine (a co-processor) to execute such a task, so that the task is executed by the smallest and most energy-efficient machine. Most of the time, the microcontroller and the co-processor do not run in parallel.
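A minimal sketch of how such a split might be driven is given below; the register addresses and the idle primitive are hypothetical and are only meant to illustrate that the microcontroller configures the co-processor and then sleeps until the work is done, so the two machines are rarely active at the same time.

    #include <stdint.h>

    /* Hypothetical memory-mapped co-processor registers (illustrative addresses). */
    #define COP_SRC   (*(volatile uint32_t *)0x40000000)  /* source buffer address   */
    #define COP_DST   (*(volatile uint32_t *)0x40000004)  /* destination buffer addr */
    #define COP_LEN   (*(volatile uint32_t *)0x40000008)  /* number of pixels        */
    #define COP_CTRL  (*(volatile uint32_t *)0x4000000C)  /* bit 0: start            */
    #define COP_STAT  (*(volatile uint32_t *)0x40000010)  /* bit 0: done             */

    extern void cpu_sleep_until_interrupt(void);  /* assumed low-power idle primitive */

    /* The microcontroller only sets up and supervises the job; the co-processor
     * does the heavy work while the microcontroller idles in a low-power state. */
    void compress_image(const uint8_t *src, uint8_t *dst, uint32_t n_pixels)
    {
        COP_SRC  = (uint32_t)(uintptr_t)src;
        COP_DST  = (uint32_t)(uintptr_t)dst;
        COP_LEN  = n_pixels;
        COP_CTRL = 1u;                       /* start the co-processor          */

        while ((COP_STAT & 1u) == 0u)        /* idle until the "done" flag rises */
            cpu_sleep_until_interrupt();
    }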

Low-Power Memories

Memory organization is very important in systems on a chip. Generally, memories consume most of the power, so it follows immediately that memories have to be organized hierarchically. No memory technology can simultaneously maximize speed and capacity at the lowest cost and power. Data for immediate use are stored in expensive registers and in cache memories, while less frequently used data reside in large memories.

For each application, the choice of the memory architecture is very important. One has to consider hierarchical, parallel, interleaved, and cache memories (sometimes several levels of cache) to find the best trade-off. The application algorithm has to be analyzed from the data point of view: the organization of the data arrays and the way these structured data are accessed.

If a cache memory is used, it is possible, for instance, to minimize the number of cache misses by adequate programming as well as a good organization of the data in the data memory. For instance, in inner loops of a program manipulating structured data, writing (1) do i then do j is not equivalent to writing (2) do j then do i; which order is better depends on how the data are laid out in the data memory, as the sketch below illustrates.
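A minimal C sketch of the point, with an illustrative row-major array: ordering the loops as in (1) walks consecutive memory addresses and reuses each cache line, whereas ordering (2) strides across rows and can miss in the cache on almost every access.

    #define N 512

    static int a[N][N];            /* stored row-major: a[i][j] and a[i][j+1] are adjacent */

    long sum_row_major(void)       /* (1) do i then do j: walks consecutive addresses      */
    {
        long s = 0;
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                s += a[i][j];      /* one cache miss per line, then several hits           */
        return s;
    }

    long sum_column_major(void)    /* (2) do j then do i: jumps a whole row per access     */
    {
        long s = 0;
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                s += a[i][j];      /* can miss in the cache on nearly every access         */
        return s;
    }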

Proposing a memory-centric view (as opposed to the traditional CPU-centric view) of SoC design has become quite popular. It is certainly of technological interest; it means, for instance, that the DRAM is integrated on the same single chip. However, it is unclear whether this integration inspires any truly


