20.11.2014 Views

PPKE ITK PhD and MPhil Thesis Classes


4. IMPLEMENTING A GLOBAL ANALOGIC PROGRAMMING UNIT FOR EMULATED DIGITAL CNN PROCESSORS ON FPGA

additional Vector processor units. In this structure, which is somewhat similar to mixed-signal processors, the duality (mentioned in [68] in the context of analog and logical computing) is expressed in the way the Controller block generates timing and control signals for all components. The MicroBlaze core is connected through an IPIF interface to the OPB bus, which makes it possible to add further Falcon processors and Vector processor elements to the array without any significant modifications, and to perform operations simultaneously.

The entire computation can be started by a single write to a register. Depending on the given analog or logical/arithmetical operation, the Command register stores the instructions for the Falcon and Vector processors. The Status register shows the current state of the CNN array. IterLeft denotes the current iteration number, while IterCount denotes the maximum number of iterations on the picture, which can be set at the beginning of the computation. The AddrIn and AddrOut counters store the start addresses of the ConstBRAM and StateBRAM memories. Every bit-width of the structure (e.g., state, constant and template widths) is configurable, and can thus be matched to the corresponding bit-widths of the Falcon architecture for a specific application. Moreover, owing to the modular structure of the proposed GAPU, if a new analogic operation has to be developed, the Command register can easily be extended with the instruction set of the new function.
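The register interface described above can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the thesis implementation: the class and method names, the opcode value, the `Status` encoding, the direction in which IterLeft counts, and the pairing of AddrIn with ConstBRAM and AddrOut with StateBRAM are all hypothetical, since the text does not specify them.

```python
from dataclasses import dataclass
from enum import IntEnum


class Status(IntEnum):
    # Hypothetical state encoding; the source does not give the bit layout.
    IDLE = 0
    BUSY = 1
    DONE = 2


@dataclass
class GapuRegisters:
    command: int = 0      # instruction word for the Falcon and Vector processors
    status: Status = Status.IDLE
    iter_left: int = 0    # current iteration number (assumed to count up from 0)
    iter_count: int = 1   # maximum number of iterations, set before the run
    addr_in: int = 0      # start address in ConstBRAM (assumed pairing)
    addr_out: int = 0     # start address in StateBRAM (assumed pairing)

    def write_command(self, opcode: int) -> None:
        """A single register write starts the whole computation."""
        self.command = opcode
        self.iter_left = 0
        self.status = Status.BUSY

    def step(self) -> None:
        """Advance one iteration; stands in for one pass of the CNN array."""
        if self.status is not Status.BUSY:
            return
        self.iter_left += 1
        if self.iter_left >= self.iter_count:
            self.status = Status.DONE


def run(regs: GapuRegisters, opcode: int) -> int:
    """Start a run and poll Status until it finishes; return iterations done."""
    regs.write_command(opcode)
    while regs.status is Status.BUSY:
        regs.step()
    return regs.iter_left
```

In this model the host sets IterCount and the two start addresses first, then issues one write to the Command register and polls Status, which mirrors the start-by-single-write behaviour described in the text.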

In order to utilize the high computing power of modern FPGAs, several modifications need to be made to the GAPU architecture. The dedicated arithmetic units of new-generation FPGAs become faster, while the embedded microprocessors and the bus systems they use evolve more slowly. The implemented Falcon PEs can work much faster than the MicroBlaze processor and its buses. I developed a new architecture so that the FPEs, the embedded microprocessor, the controller circuit and the memory can run at different clock speeds. This involves modifying the StateBRAM memories. Instead of using a dual-ported BRAM, we need to use several single-ported ones: a few to supply the FPEs with data and one for the MicroBlaze. In this case the relatively slow MicroBlaze can also monitor the state memories, while the fast FPEs can work continuously. With an additional FIFO element the StateBRAM memory can be reached from
