20.11.2014 Views

PPKE ITK PhD and MPhil Thesis Classes



Table 4.1: Comparison of a modified Falcon PE and the proposed GAPU in terms of device utilization and achievable clock frequency. The state width is configured as 18 bits. (The asterisk denotes that in Virtex-5 FPGAs the slices are organized differently and contain twice as many LUTs and flip-flops as in the previous generations.)

Device utilization and speed   MicroBlaze GAPU   Falcon PE    Available on   Available on
                               (@18 bit)         (@18 bit)    XC2V3000       XC6VSX475T
Num. of occupied slices        1780              452          14336          74400*
Num. of BRAMs                  18                5            96             2128
Num. of MULT18×18s             0                 9            96             2016
Core frequency [MHz]           100               133          133            600 (210)

the largest currently available DSP-specialized FPGA, the Virtex-6 SX475T, has to be used, where the GAPU occupies only a minimal additional area (about 2.4% of the slices and only 1% of the BRAMs). With this device, 160 Falcon PE cores can also be implemented in an array. The other large Virtex-II Pro, Virtex-4 FX, and Virtex-5 FX platform FPGAs provide embedded IBM PowerPC 405 hard processor core(s) running at a higher speed, which may be a good alternative for implementing the GAPU in further development. There are only rumors that the newest Xilinx 7-series FPGAs will embed an ARM processor core.
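The utilization figures quoted in the paragraph above can be cross-checked against Table 4.1 with a few lines of arithmetic. The following is a minimal sketch, not the author's method: the dictionary names and the helper functions `share_percent` and `max_pe_count` are hypothetical, the resource numbers are copied from the table, and the calculation assumes that per-core slice counts carry over unchanged to the XC6VSX475T. It therefore ignores the difference in slice organization that the asterisk in the table warns about.

```python
# Resource figures copied from Table 4.1 (names are illustrative only).
GAPU = {"slices": 1780, "brams": 18, "mults": 0}
FALCON_PE = {"slices": 452, "brams": 5, "mults": 9}
XC6VSX475T = {"slices": 74400, "brams": 2128, "mults": 2016}


def share_percent(used, available):
    """Return the used fraction of a resource as a percentage."""
    return 100.0 * used / available


def max_pe_count(device, controller, pe):
    """Naive upper bound on the number of PE cores.

    Reserves the controller's resources first, then divides what is left
    by the per-PE requirement for every resource the PE actually uses.
    Assumption: resources scale linearly and routing overhead is ignored.
    """
    limits = []
    for resource, per_pe in pe.items():
        if per_pe == 0:
            continue  # this resource does not constrain the PE count
        free = device[resource] - controller[resource]
        limits.append(free // per_pe)
    return min(limits)


gapu_slice_share = share_percent(GAPU["slices"], XC6VSX475T["slices"])
gapu_bram_share = share_percent(GAPU["brams"], XC6VSX475T["brams"])
pe_bound = max_pe_count(XC6VSX475T, GAPU, FALCON_PE)

print(f"GAPU slice share: {gapu_slice_share:.1f}%")
print(f"GAPU BRAM share: {gapu_bram_share:.1f}%")
print(f"Naive Falcon PE bound: {pe_bound}")
```

Under these assumptions the slice share comes out at about 2.4% and the BRAM share at about 0.8%, which the text rounds to 1%. The slice budget is the binding constraint, and the naive bound it gives is 160 cores, in line with the figure quoted above.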

4.7 Results<br />

Considering consecutive analog operations (e.g., the black-and-white skeletonization above) on a 64 × 64 image with 18-bit state and constant precision and 9-bit template precision, the Falcon PE cores perform 10 iterations within 0.307 ms. Without the proposed GAPU extension, due to the slow communication via the parallel port (downloading and uploading the sequence of instructions, templates, and results), the data transfer requires approximately 204.8 ms, while the full computing time
