20.11.2014 Views

PPKE ITK PhD and MPhil Thesis Classes



with the sequence of analogic instructions (the program) onto the FPGA, merely by
asserting a 'start' signal. All of the above may be done without a host PC, thus our
architecture is capable of stand-alone operation, which is desirable in many
industrial contexts. Complex analogic algorithms require the classical program
organization elements, i.e., sequential, iterative, and conditional execution of
instructions. Consequently, the Falcon architecture must be supplemented with an
embedded GAPU to extend it into a fully functional CNN-UM implementation on a
reconfigurable FPGA.

4.2 Computational background and the optimized Falcon architecture

The Falcon architecture, implemented on a reconfigurable FPGA, iterates the
forward-Euler discretized CNN equation using the FSR (Full-Signal Range) model,
which is derived from the original Chua-Yang model [68]. The equations of the Euler
method are as follows:

$$x_{m,ij}(n+1) = x_{m,ij}(n) + \sum_{n=1}^{p} \sum_{kl \in S_r(ij)} A'_{mn,ij,kl} \cdot x_{n,kl}(n) + g_{m,ij} \tag{4.1}$$

$$g_{m,ij} = \sum_{n=1}^{p} \sum_{kl \in S_r(ij)} B'_{mn,ij,kl} \cdot u_{n,kl}(n) + h \cdot z_{m,ij}, \tag{4.2}$$

where the number of layers is denoted by p, and the state of each cell is equal to
its output and is limited to the [-1, +1] range. The architecture contains
processing elements arranged in a square grid, and the time-step value h is
incorporated into the A' and B' template matrices. Moreover, if the input is
constant or changes slowly, g_{m,ij} can be treated as constant and needs to be
calculated only once, at the beginning of the computation.
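The iteration in Eqs. (4.1) and (4.2) can be illustrated with a minimal software sketch. It is not the FPGA implementation: it assumes space-invariant templates (the same A' and B' coefficients for every cell ij), a zero contribution from cells outside the grid, and templates in which h has already been included. All function and variable names (`clamp`, `template_sum`, `compute_g`, `euler_step`) are hypothetical, and the layer summation index is written as `q` to keep it distinct from the time step.

```python
def clamp(value, lo=-1.0, hi=1.0):
    """Limit a state value to the FSR range [-1, +1]."""
    return max(lo, min(hi, value))


def template_sum(templates, layers, m, i, j, r):
    """Sum templates[m][q][di][dj] * layers[q][k][l] over all layers q and
    all neighbors kl in the radius-r neighborhood S_r(ij)."""
    rows, cols = len(layers[0]), len(layers[0][0])
    total = 0.0
    for q in range(len(layers)):
        for di in range(-r, r + 1):
            for dj in range(-r, r + 1):
                k, l = i + di, j + dj
                # Assumption: cells outside the grid contribute zero.
                if 0 <= k < rows and 0 <= l < cols:
                    total += templates[m][q][di + r][dj + r] * layers[q][k][l]
    return total


def compute_g(b_prime, u, z, h, r=1):
    """Eq. (4.2): evaluated once when the input u is constant."""
    return [[[template_sum(b_prime, u, m, i, j, r) + h * z[m][i][j]
              for j in range(len(u[0][0]))]
             for i in range(len(u[0]))]
            for m in range(len(u))]


def euler_step(a_prime, x, g, r=1):
    """Eq. (4.1): one forward-Euler update, clamped to [-1, +1]."""
    return [[[clamp(x[m][i][j]
                    + template_sum(a_prime, x, m, i, j, r)
                    + g[m][i][j])
              for j in range(len(x[0][0]))]
             for i in range(len(x[0]))]
            for m in range(len(x))]


# Usage: one layer (p = 1), 3x3 grid, radius-1 templates.
zeros3 = [[0.0] * 3 for _ in range(3)]
a_prime = [[zeros3]]  # A' = 0: no feedback between cells
b_prime = [[[[0.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.0]]]]
u = [[[1.0] * 3 for _ in range(3)]]
z = [[[0.0] * 3 for _ in range(3)]]
x = [[[0.0] * 3 for _ in range(3)]]

g = compute_g(b_prime, u, z, h=0.1)  # computed once, reused in every step
for _ in range(3):
    x = euler_step(a_prime, x, g)
```

In this toy configuration every cell receives g = 0.5, so the state grows by 0.5 per step until the clamp holds it at the +1 limit of the FSR range. Computing `g` outside the loop mirrors the observation that, for a constant input, the B' term needs to be evaluated only once.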

This multi-layer extension of the FPGA-based Falcon architecture [45] can be used
to emulate a fully connected multi-layer CNN structure; both the number of layers
and the computing accuracy are configurable. The main blocks of the Falcon
architecture, extended with a low-level Control unit that is optimized for the GAPU
extension, are shown in Figure 4.1.
