PPKE ITK PhD and MPhil Thesis Classes
4. IMPLEMENTING A GLOBAL ANALOGIC PROGRAMMING UNIT FOR EMULATED DIGITAL CNN PROCESSORS ON FPGA
additional Vector processor units. In this structure, which is somewhat similar to
mixed-signal processors, the duality (mentioned in [68] in the context of analog and
logical computing) is expressed in the way the Controller block generates timing and
control signals for all components. The MicroBlaze core is connected
through an IPIF interface to the OPB bus, which makes it possible to add more
Falcon processors and Vector processor elements in an array without significant
modifications and to perform operations simultaneously.
The entire computation can be started by a single write to a register.
Depending on the given analog or logical/arithmetic operation, the
Command register stores the instructions for the Falcon and Vector processors. The
Status register shows the current state of the CNN array. IterLeft holds
the current iteration number, while IterCount holds the maximum
number of iterations to perform on the image, which can be set at the beginning of
the computation. The AddrIn and AddrOut counters store the start
addresses of the ConstBRAM and StateBRAM memories. All bit-widths (e.g.,
state, constant, and template widths) of the structure are configurable, so they
can be matched to the corresponding bit-widths of the Falcon architecture for a
specific application. Moreover, owing to the modular structure of the proposed
GAPU, if a new analogic operation has to be developed, the Command
register can easily be extended with the instruction set of the new
function.
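The register interface described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the thesis implementation: the field order, the 32-bit widths, the status encoding, the mapping of AddrIn to ConstBRAM and AddrOut to StateBRAM, and the choice to let IterLeft count down from IterCount are all hypothetical, as are the helper names `gapu_configure`, `gapu_start`, and `gapu_step`. On real hardware the registers would be memory-mapped and accessed through `volatile` pointers from the MicroBlaze.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of the GAPU register file.
 * Field order, widths and encodings are assumptions for illustration. */
typedef struct {
    uint32_t command;    /* instruction for the Falcon and Vector processors */
    uint32_t status;     /* current state of the CNN array */
    uint32_t iter_left;  /* iteration counter (assumed to count down) */
    uint32_t iter_count; /* maximum number of iterations, set before start */
    uint32_t addr_in;    /* start address, assumed to refer to ConstBRAM */
    uint32_t addr_out;   /* start address, assumed to refer to StateBRAM */
} gapu_regs_t;

/* Assumed status encoding. */
enum { GAPU_STATUS_IDLE = 0u, GAPU_STATUS_BUSY = 1u };

/* Set the iteration limit and start addresses; nothing runs yet. */
static void gapu_configure(gapu_regs_t *r, uint32_t iter_count,
                           uint32_t addr_in, uint32_t addr_out)
{
    assert(r != NULL);
    r->iter_count = iter_count;
    r->addr_in = addr_in;
    r->addr_out = addr_out;
    r->status = GAPU_STATUS_IDLE;
}

/* A single write to the command register starts the whole computation. */
static void gapu_start(gapu_regs_t *r, uint32_t command)
{
    assert(r != NULL);
    r->command = command;
    r->iter_left = r->iter_count;
    r->status = (r->iter_count > 0u) ? GAPU_STATUS_BUSY : GAPU_STATUS_IDLE;
}

/* Model one iteration of the array; returns true while work remains. */
static bool gapu_step(gapu_regs_t *r)
{
    assert(r != NULL);
    if (r->status != GAPU_STATUS_BUSY) {
        return false;
    }
    r->iter_left--;
    if (r->iter_left == 0u) {
        r->status = GAPU_STATUS_IDLE;
    }
    return r->status == GAPU_STATUS_BUSY;
}
```

In this model the host calls `gapu_configure` once, triggers the run with a single `gapu_start` write, and then polls the status field, while `gapu_step` stands in for the hardware advancing one iteration.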
To exploit the high computing power of modern FPGAs, several
modifications to the GAPU architecture are needed. The dedicated arithmetic
units of new-generation FPGAs are becoming faster, while the embedded
microprocessors and their bus systems evolve more slowly. The implemented Falcon
PEs can run much faster than the MicroBlaze processor and its buses. I
developed a new architecture in which the FPEs, the embedded microprocessor,
the controller circuit, and the memory run at different clock speeds. This
requires modifying the StateBRAM memories. Instead of a single dual-ported
BRAM, several single-ported ones are needed: a few to supply the FPEs
with data and one for the MicroBlaze. In this case the relatively slow MicroBlaze
can also monitor the state memories, while the fast FPEs can work continuously.
With an additional FIFO element the StateBRAM memory can be reached from