20.11.2014 Views

PPKE ITK PhD and MPhil Thesis Classes



2. MAPPING THE NUMERICAL SIMULATIONS OF PARTIAL DIFFERENTIAL EQUATIONS

computationally hard problems on different many-core architectures. One such hard problem is the numerical simulation of Partial Differential Equations (PDEs), which are used in a wide range of engineering applications. Among the toughest PDEs are the Navier-Stokes equations, which describe the temporal evolution of fluids.

Emulated-digital CNNs have proven to be a good alternative for solving PDEs [43, 44]. In this chapter, a CNN simulation kernel is implemented on the heterogeneous Cell architecture, and an improved emulated-digital CNN processor is developed from the Falcon processor on FPGA for solving PDEs. During the implementation of different PDEs on the different architectures, the differences between the two platforms are investigated and their performance is compared. Two questions arose during the implementation: How can a computationally hard problem be mapped onto a heterogeneous processor architecture and onto a reconfigurable architecture (FPGA)? What is the difference in performance between the heterogeneous and the custom architecture? The next few sections describe the answers and the methods used to investigate them.

2.1.1 How to map a CNN array to a Cell processor array?

The primary goal is to obtain an efficient CNN [17] implementation on the Cell architecture, because analog CNN architectures are effective at solving partial differential equations. Analog CNN chips, however, have limitations (limited precision, only linear templates, sensitivity to environmental noise), which is why they cannot be used in real-life applications. With the emulated-digital CNN-UM these drawbacks can be avoided. A CNN can be implemented efficiently on a SIMD (Single Instruction Multiple Data) architecture. Consider the CNN model and its hardware-efficient discretization in time.
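The time discretization mentioned above can be illustrated with a forward-Euler step. The sketch below is a minimal illustration and not the author's implementation. It assumes the state equation has the standard Chua-Yang form (state derivative = -state + feedback template applied to the outputs + control template applied to the inputs + bias) with a piecewise-linear output. The thesis's own equations (1.5a) and (1.5b) are not reproduced in this section, so that form, the zero boundary condition, the step size `h`, and the names `output` and `euler_step` are all assumptions chosen for illustration.

```python
def output(x):
    """Assumed piecewise-linear CNN output: clamps the state to [-1, 1]."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def euler_step(state, inp, A, B, z, h):
    """One forward-Euler step of a CNN with 3x3 templates A and B.

    state, inp : 2-D lists (rows x cols) with cell states and inputs
    A, B       : 3x3 feedback and control templates
    z          : bias term
    h          : time step (hypothetical parameter name)
    Cells outside the array contribute zero (assumed boundary condition).
    """
    rows, cols = len(state), len(state[0])
    # Outputs are computed once from the current state.
    y = [[output(v) for v in row] for row in state]
    new_state = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = z
            # Multiply the 3x3 window by the templates and sum the products.
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    r, c = i + di, j + dj
                    if 0 <= r < rows and 0 <= c < cols:
                        acc += A[di + 1][dj + 1] * y[r][c]
                        acc += B[di + 1][dj + 1] * inp[r][c]
            # Forward Euler: x(n+1) = x(n) + h * (-x(n) + acc)
            new_state[i][j] = state[i][j] + h * (-state[i][j] + acc)
    return new_state
```

Every cell performs the same multiply-and-sum over its 3x3 neighbourhood, which is the regularity a SIMD unit can exploit by updating several neighbouring cells with one instruction.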

2.1.1.1 Linear Dynamics<br />

The computation of the discretized version of the original CNN state equations (1.5a) and (1.5b) on conventional CISC processors is rather simple. The appropriate elements of the state window and the template are multiplied and the results are summed. Due to the small number of registers on these architectures,
