29.01.2015 Views

Embedded Software for SoC - Grupo de Mecatrônica EESC/USP


Rapid Configuration & Instruction Selection for an ASIP

Table 30-3. Hardware cost and initial simulation.

Xtensa Processor        P1            P2            P3
Area                    1.08          4.23          2.28
Area (gates)            35,000        160,000       87,000
Power (mW)              54            164           108
Clock rate (MHz)        188           158           155
Simulation time (sec)   2,770.93      1,797.34      641.69
Cycle-count             422,933,748   390,019,604   131,798,414
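As a worked illustration of how the cycle-count and clock-rate rows of Table 30-3 combine, the sketch below estimates the application's execution time on each base processor. It assumes the usual relation, execution time = cycle count / clock frequency; this excerpt does not reproduce equation 1, so the relation and the names `PROCESSORS` and `execution_time_s` are illustrative assumptions rather than the authors' formulation.

```python
# Cycle counts and clock rates copied from Table 30-3.
PROCESSORS = {
    "P1": {"cycles": 422_933_748, "clock_mhz": 188},
    "P2": {"cycles": 390_019_604, "clock_mhz": 158},
    "P3": {"cycles": 131_798_414, "clock_mhz": 155},
}


def execution_time_s(cycles, clock_mhz):
    """Estimated run time in seconds, assuming time = cycles / frequency."""
    return cycles / (clock_mhz * 1e6)


# Estimated execution time of the application on each base processor.
times = {name: execution_time_s(**p) for name, p in PROCESSORS.items()}

# Processor with the smallest estimated execution time.
fastest = min(times, key=times.get)

for name, t in times.items():
    print(f"{name}: {t:.3f} s")
print("fastest:", fastest)
```

Under this assumption P2 needs fewer cycles than P1 but still takes longer, because its clock rate is lower; cycle count and clock rate have to be considered together.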

information such as area in gates, and the speedup factor under each Xtensa processor are shown in Table 30-4. The down arrow in Table 30-4 represents a negative speedup, whereas an up arrow with a number, N, indicates that the corresponding specific functional unit is N times faster than the software function call. Different sets of specific functional units are combined into a total of 576 different configurations representing the entire design space of the application. Figure 30-5 shows the verification methodology.
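One way to arrive at 576 configurations can be sketched as follows. The counting rule is an assumption, since this excerpt does not spell it out: each function either stays in software (`None`) or uses exactly one of its alternative functional units from Table 30-4, and every such combination is paired with each of the three Xtensa processors. The names `FUNCTION_OPTIONS` and `enumerate_configurations` are hypothetical.

```python
from itertools import product

PROCESSORS = ["P1", "P2", "P3"]

# None means "keep the function in software".
# Assumption: alternative units for the same function are mutually exclusive.
FUNCTION_OPTIONS = {
    "FP Addition": [None, "FA32"],
    "FP Division": [None, "FFDIV", "MANT+LP24+COMB"],
    "FP Multiplication": [None, "FM32"],
    "Natural Logarithm": [None, "FREXPLN"],
    "Modular 3": [None, "MOD3"],
    "Square Root": [None, "LDEXP+FREXP", "LDEXP", "FREXP"],
}


def enumerate_configurations():
    """Yield (processor, {function: chosen unit or None}) for every combination."""
    names = list(FUNCTION_OPTIONS)
    for proc in PROCESSORS:
        for choice in product(*(FUNCTION_OPTIONS[n] for n in names)):
            yield proc, dict(zip(names, choice))


configs = list(enumerate_configurations())
print(len(configs))  # 3 * (2 * 3 * 2 * 2 * 2 * 4) = 576
```

With these assumptions the count matches the 576 configurations quoted in the text, which suggests, but does not prove, that the design space was built this way.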

In order to verify our methodology, we pre-configured the 3 Xtensa processors and the 9 specific functional units (10 TIE instructions) at the beginning of the verification. Then we simulated the 576 different configurations using the ISS. We applied our methodology (to compare with the exhaustive simulation) to obtain the result under an area constraint. During the first phase of the methodology, just the Xtensa processor is selected, without the TIE instructions. Table 30-3 provides the area, cycle-count, and clock rate for an application with each Xtensa processor. This information is used in equation 1 of the methodology. The information from Table 30-2 and Table 30-4 is retrieved from running the initial simulation and verifying the TIE instructions. The information provided in Table 30-3 is then used in the selection of TIE instructions. For equation 3, all the parameters are obtained from Table 30-2,

Table 30-4. Function unit's information.

Function            TIE instruction(s)   Area (gates)   Latency (ns)
FP Addition         FA32                 32000          8.50
FP Division (1)     FFDIV                53800          14.6
FP Division (2)     MANT, LP24, COMB     6800           6.80
FP Multiplication   FM32                 32000          7.10
Natural Logarithm   FREXPLN              3300           5.90
Modular 3           MOD3                 5500           6.40
Square Root (1)     LDEXP, FREXP         3300           7.00
Square Root (2)     LDEXP                1100           6.50
Square Root (3)     FREXP                3200           6.90

Speedup factor (P1, P2, P3), values in extraction order; the arrows and cell positions are not recoverable: 8.14×, 17.4×, 6.48×, 11.6×, 10.9×, 8.31×, 3.52×, 8.98×, 15.9×, 5.28×, 1.10×, 17.0×, 3.30×, 2.50×, 1.90×
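To illustrate what selecting under an area constraint involves, the sketch below adds the gate counts of a base processor (Table 30-3) and a chosen set of functional units (Table 30-4) and compares the total with a budget. Treating the total area as a plain sum is an assumption, and the budget value and the names `total_gates` and `fits` are hypothetical; the excerpt does not give the authors' area model.

```python
# Gate counts copied from Table 30-3 (processors) and Table 30-4 (functional units).
PROCESSOR_GATES = {"P1": 35_000, "P2": 160_000, "P3": 87_000}
UNIT_GATES = {
    "FP Addition": 32_000,
    "FP Division (1)": 53_800,
    "FP Division (2)": 6_800,
    "FP Multiplication": 32_000,
    "Natural Logarithm": 3_300,
    "Modular 3": 5_500,
    "Square Root (1)": 3_300,
    "Square Root (2)": 1_100,
    "Square Root (3)": 3_200,
}


def total_gates(processor, units):
    """Area of a configuration, assuming gate counts simply add up."""
    return PROCESSOR_GATES[processor] + sum(UNIT_GATES[u] for u in units)


def fits(processor, units, budget_gates):
    """True if the configuration stays within the (hypothetical) gate budget."""
    return total_gates(processor, units) <= budget_gates


example_units = ["FP Addition", "FP Multiplication", "Square Root (2)"]
print(total_gates("P3", example_units))    # 87,000 + 32,000 + 32,000 + 1,100
print(fits("P3", example_units, 160_000))  # budget chosen only for illustration
```

A filter of this kind only decides feasibility; choosing among the feasible configurations also needs the per-processor speedup factors.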
