A Real-time Image Feature Extraction Using Pulse-Coupled Neural Network

Web Site: www.ijettcs.org  Email: editor@ijettcs.org, editorijettcs@gmail.com
Volume 1, Issue 3, September – October 2012  ISSN 2278-6856

Trong-Thuc Hoang 1, Ngoc-Hung Nguyen 2, Xuan-Thuan Nguyen 3 and Trong-Tu Bui 4

1,2,3,4 The University of Science, VNU-HCM, Faculty of Electronics and Telecommunications, 227 Nguyen Van Cu St., Dist. 5, Ho Chi Minh City, Vietnam

Abstract: In this paper, we present two hardware architectures for image feature vector extraction based on the Pulse-Coupled Neural Network (PCNN) algorithm: a RAM-based model and a pipelined model. Both models can generate a feature vector at a speed of more than 2,000 vectors per second with a clock frequency of 50 MHz and an input image size of 128x128 pixels. Based on these architectures, a demonstration recognition system including a camera, a feature vector generator, a search engine, and a DVI controller has been built and tested successfully on FPGA chips in order to verify the operation of the algorithm and the architectures.

Keywords: feature vector, hardware implementation, object recognition, PCNN, Pulse-Coupled Neural Network, real-time.

1. INTRODUCTION

The Pulse-Coupled Neural Network (PCNN) is a biologically inspired neural network modeled on the cat's visual cortical neurons. A significant advantage of the PCNN model is that it can operate without any training. Since its introduction by Eckhorn in 1990 [1], the model has proven its vital role in digital image processing tasks such as image segmentation [2], image thinning [3], motion detection [4], pattern recognition [5], and face detection [6]. In comparison with other image processing algorithms, the PCNN algorithm is noise-tolerant and robust against translation, scaling, and rotation of the input patterns [7]. The pattern-matching function plays an important role in various information processing systems.
In order to implement such functions effectively, both proper data representation algorithms and powerful search engines are essential. Concerning the former, in this paper we present the design of a feature vector generator employing the PCNN algorithm. Many techniques have been developed for image feature extraction, using ICA [8], PCA [9], the Haar transformation [10], etc. These methods are effective for extracting features of simple images, such as simple shapes and letters; however, they are sensitive to noise and not invariant against geometric changes. The feature extraction process is time-consuming, so a software implementation is not suitable for real-time applications. A hardware implementation is a solution to this issue. In the literature, several analog VLSI circuits that implement the PCNN have been presented [11]. Although compact, they suffer from limited accuracy as well as process variations and device mismatches, which are inherent disadvantages of analog circuits. Digital implementations have also been proposed: Javier et al. [12] presented an FPGA system that can operate at high speed, but it is only suitable for applications that do not require the iterative operation of the PCNN. An image feature extraction system requires the iterative processing of the PCNN model in order to generate a feature vector. Ranganath et al. [13] proposed such a hardware design, but it needs multiple PCNN layers and dynamic parameters.

In this paper, we present two digital designs based on the PCNN algorithm that can be applied to a real-time object-recognition system: a RAM-based model and a pipelined model. The RAM-based model utilizes RAMs to store the network's information in order to reduce the LUT and register costs. On the other hand, the pipelined model uses only registers, which makes it convenient to implement in an ASIC. The pipelined architecture divides the operation of the PCNN algorithm into stages.
Each stage generates one element of a feature vector. The resource cost of the RAM-based model depends on the input image size, while that of the pipelined model depends mainly on the desired number of elements in the output feature vector. Both models can generate a feature vector at a speed of more than 2,000 vectors per second with a clock frequency of 50 MHz and an image size of 128x128 pixels. Based on these architectures, a demonstration object-recognition system has been built, including a camera, a feature vector generator, a search engine, and a DVI controller. The system recognizes an object in the input image based on the feature vector produced by the PCNN-based feature vector generator.

The remainder of this paper is organized as follows. Section 2 briefly reviews the PCNN model and gives an implementation of a neuron. Section 3 describes the feature vector generation and its application in an object-recognition system. Section 4 proposes two PCNN hardware architectures with their experimental results. Section 5 describes a complete object-recognition system and its performance. Section 6 gives the conclusion.

Figure 1 A PCNN neuron structure

2. THE PCNN MODEL

2.1 Description of the Model

The PCNN is a two-dimensional neural network with a single layer. Each network neuron corresponds to one input image pixel; because of this, the structure of the PCNN follows the structure of the input image. The PCNN's neuron structure is shown in Fig. 1 [14]. Three parts form a neuron: a feeding field (called the receptive field or dendritic tree in some references), a linking modulation, and a pulse generator. The neuron's operation is described iteratively by the following equations:

F_ij[n] = S_ij + F_ij[n-1] e^{-α_F} + V_F Σ_kl M_ijkl Y_kl[n-1]   (1)

L_ij[n] = L_ij[n-1] e^{-α_L} + V_L Σ_kl W_ijkl Y_kl[n-1]   (2)

U_ij[n] = F_ij[n] (1 + β L_ij[n])   (3)

T_ij[n] = T_ij[n-1] e^{-α_T} + V_T Y_ij[n-1]   (4)

Y_ij[n] = 1 if U_ij[n] > T_ij[n], otherwise 0   (5)

In these equations, (i, j) is the position of a neuron in the network and of a pixel in the input image. For example, if the input image size is 128x128 pixels, the indexes (i, j) range from (1, 1) to (128, 128). (k, l) represents the positions of the surrounding neurons; n is the iterative step number; and S_ij is the gray level of the input image pixel. F_ij[n], L_ij[n], U_ij[n], and T_ij[n] are the feeding input, linking input, internal activity, and dynamic threshold, respectively. The pulse output of a neuron, Y_ij[n], is a binary value that indicates the status of the neuron; Y_ij[n] = 1 means the neuron is activated.

M and W are the constant synaptic weight matrices, and they depend on the field of the surrounding neurons, i.e. the linking field. For example, if each neuron is connected only to the eight neurons surrounding it, the linking field is a 3x3 matrix; as a result, M and W are 3x3 matrices, too. M and W refer to Gaussian weight functions of distance, and they are usually chosen to be the same.
β is the linking coefficient constant; α_F, α_L, and α_T are the attenuation time constants; and V_F, V_L, and V_T are the inherent voltage potentials of the feeding signal, linking signal, and dynamic threshold, respectively.

In conclusion, the operation of a neuron at step n depends on its corresponding pixel value (i.e. S_ij), the pulse outputs of the surrounding neurons at the previous step (i.e. Y_kl[n-1]), and the internal signals of the neuron itself (i.e. F_ij[n-1], L_ij[n-1], U_ij[n-1], and T_ij[n-1]). All of the network pulses Y_ij[n] form a binary image that contains important information such as regional information, edge information, and features of an object in the input image.

2.2 A Neuron Implementation

Based on the practical experiments of many references, Tab. 1 shows the PCNN parameters selected for the feature extraction purpose. The top view of the hardware implementation of a neuron is shown in Fig. 2. In the figure, iS_data, iY_data, iF_data, iL_data, and iT_data represent S_ij, Y_kl[n-1], F_ij[n-1], L_ij[n-1], and T_ij[n-1], respectively; they are the inputs required for the neuron operation. oY_data, oF_data, oL_data, and oT_data are Y_ij[n], F_ij[n], L_ij[n], and T_ij[n], respectively. The input pixel value is 8-bit gray-scale and carried by the iS_data signal. The iY_data signal is nine bits wide because the linking field is chosen as a 3x3 matrix. The oY_data signal is the binary pulse output, so it is a single net. With the PCNN parameters fixed, the widths of the other signals follow logically.

Figure 2 Top-view of a neuron implementation

Figure 3 A PCNN neuron implementation

Table 1: The PCNN parameters

Parameter | β   | α_F | α_L | α_T  | V_F  | V_L | V_T
Value     | 0.5 | 1   | 1   | 0.02 | 0.02 | 10  | 0.2
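The iterative equations above can be sketched directly in NumPy. Two details are assumptions for illustration only: the parameter values follow the columns of Tab. 1 positionally, and the Gaussian-like 3x3 kernel stands in for the unspecified M = W weights; neither is the exact fixed-point constant set of the hardware design.

```python
import numpy as np

BETA, ALPHA_F, ALPHA_L, ALPHA_T = 0.5, 1.0, 1.0, 0.02   # assumed Tab. 1 mapping
V_F, V_L, V_T = 0.02, 10.0, 0.2                         # assumed Tab. 1 mapping
KERNEL = np.array([[0.5, 1.0, 0.5],
                   [1.0, 0.0, 1.0],
                   [0.5, 1.0, 0.5]])                    # illustrative M = W

def linking_field(Y, K=KERNEL):
    """Weighted sum of each neuron's 3x3 neighbourhood (zero-padded border)."""
    P = np.pad(Y, 1)
    H, V = Y.shape
    out = np.zeros_like(Y, dtype=float)
    for di in range(3):
        for dj in range(3):
            out += K[di, dj] * P[di:di + H, dj:dj + V]
    return out

def pcnn_step(S, F, L, T, Y):
    """One iterative PCNN step for the whole network; all args are H x V arrays."""
    link = linking_field(Y)                        # sum over Y_kl[n-1]
    F = S + F * np.exp(-ALPHA_F) + V_F * link      # Eq. 1
    L = L * np.exp(-ALPHA_L) + V_L * link          # Eq. 2
    U = F * (1.0 + BETA * L)                       # Eq. 3
    T = T * np.exp(-ALPHA_T) + V_T * Y             # Eq. 4 (Y is still Y[n-1] here)
    Y = (U > T).astype(float)                      # Eq. 5
    return F, L, T, Y
```

Starting from all-zero state and iterating `pcnn_step` produces the sequence of binary pulse images Y[1], Y[2], ... used in the next section.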

The detail of a PCNN neuron implementation is shown in Fig. 3. As mentioned above, the linking field is chosen as a 3x3 matrix, which leads to the synaptic weight matrix selected as shown in the figure. With the chosen parameters, complex factors in the model such as exp(-α_F) can be treated as simple constant factors; as a result, no real exponential functions are employed, only multiplications. The multiplier parameters represent exp(-α_F), exp(-α_L), and exp(-α_T) in the equations, respectively. Furthermore, the CSD (canonical signed digit) multiplier method [15] was used to transform each fixed-point constant multiplication into an efficient "shift-and-add/subtract" operation.

3. FEATURE EXTRACTION USING PCNN

3.1 Feature Vector

The feature vector, called the time signature, was proposed by Johnson [16]. As mentioned above, the output binary image formed by the network pulses Y_ij[n] contains much important information. In each iterative processing step n, the PCNN produces a different output binary image, and a feature vector element G[n] is generated by summing the pixel values of that output binary image, as shown in the following equation:

G[n] = Σ_{i,j} Y_ij[n]   (6)

The length of the feature vector is defined as the total number of PCNN iterative steps. For example, if the PCNN iterates its processing N times, the feature vector has a length of N elements; thus, n ranges from 1 to N. The length of the vector G determines the quality of the vector itself: to generate a high-quality feature vector, a larger N is needed, and as a consequence it takes a longer time to generate one feature vector. In this paper, N is chosen as 32 to implement the architectures.

3.2 Using the Feature Vector for Object Recognition

Eq. 6 reduces the feature data from two-dimensional data (binary images) to one-dimensional data (a feature vector), which is very suitable for a pattern recognition task such as object recognition. Different input images generate different feature vectors, so a feature vector is unique to its input image; however, if two images contain similar objects, their feature vectors will likely be similar. Based on this concept, the feature vector generated by the PCNN algorithm can be used to recognize an object in the input image. Johnson [16] proves that the feature vector is noise-tolerant and invariant against geometrical changes. For this reason, the PCNN model has strong advantages in comparison with other object recognition algorithms. A typical object recognition system employing the PCNN is shown in Fig. 4.

The feature vector generator employs the PCNN to extract features from the input image and produce a vector G. Then, the search engine compares the vector with sample vectors in the templates in order to identify the object. The templates contain many sample feature vectors, each representing one object.

3.3 The Comparison between Feature Vectors

The MSE (mean square error) method is used to quantify the difference between two feature vectors. The MSE value is calculated by Eq. 7; a small MSE value means the two vectors are similar, and the object can be recognized directly.

Figure 4 A typical recognition system using PCNN

MSE = (1/N) Σ_{n=1}^{N} [E_n(p) − E'_n(p)]²   (7)

where n and N are the iterative step and the feature vector's length, respectively. E_n(p) and E'_n(p) are two entropy vectors, which are transformed from the feature vectors by the following equations:

p_1[n] = G[n] / P_HV   (8)

p_0[n] = 1 − G[n] / P_HV   (9)

E_n(p) = −p_1[n] log₂ p_1[n] − p_0[n] log₂ p_0[n]   (10)

p_1[n] and p_0[n] represent the probabilities that Y_ij[n] = 1 and Y_ij[n] = 0, respectively. P_HV is the input image size, which equals the total number of neurons in the network.
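Eqs. 6 through 10 condense into a short reference implementation. The clipping guard against log₂(0) is an added numerical detail, not part of the paper's equations.

```python
import numpy as np

def time_signature(binary_images):
    """Eq. 6: G[n] is the number of firing neurons in the n-th pulse image."""
    return np.array([float(Y.sum()) for Y in binary_images])

def entropy_vector(G, p_hv):
    """Eqs. 8-10: binary entropy of each step's firing probability.
    The clip avoids log2(0); this guard is an implementation detail."""
    p1 = np.clip(np.asarray(G, dtype=float) / p_hv, 1e-12, 1.0 - 1e-12)
    p0 = 1.0 - p1
    return -p1 * np.log2(p1) - p0 * np.log2(p0)

def mse(G_a, G_b, p_hv):
    """Eq. 7: mean square error between the two entropy vectors."""
    diff = entropy_vector(G_a, p_hv) - entropy_vector(G_b, p_hv)
    return float(np.mean(diff ** 2))
```

Two identical feature vectors give an MSE of exactly zero, and the smaller the MSE, the more similar the two objects; the search engine of Sec. 3.2 simply picks the template with the smallest value.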
Forexample, if the input image size is 128x128 pixels, P HVequals to the constant value of 16384.4. PROPOSED HARDWARE ARCHITECTURESBoth RAM-based and pipelined models use the neuronimplementation in sec. 2.2 as groundwork to build thearchitectures. The input image size is 128x128 pixelswith 8-bit gray-scale level. The feature vector's length ischosen as 32 elements.4.1. RAM-based PCNN Architecture4.1.1. IdeaAs mentioned above, the PCNN model is a twodimensionalnetwork with a 1:1 corresponding betweenimage pixels and network neurons. Hence, the originalVolume 1, Issue 3, September – October 2012 Page 179

Web Site: www.ijettcs.org Email: editor@ijettcs.org, editorijettcs@gmail.comVolume 1, Issue 3, September – October 2012 ISSN 2278-6856PCNN implementation is formed by the P HV neurons,where P HV is the total pixels in the image. The idea of thisarchitecture is to use a few neurons for the computationand RAMs for storing the network information such asthe input, output, and variables values. As a result,instead of implementing P HV neurons, the RAM-basedmodel can use one neuron to do the feature vectorgeneration.4.1.2. ImplementationThe top-view of the RAM-based PCNN module is shownin Fig. 5. First of all, the module is allowed to runwhenever the iStart signal is asserted for one clock cycle.Then, the input image is written into the module by twosignals: iWrite and iWrite_data. Later, the modulegenerates a feature vector by transmitting one vector'selement at a **time** using oG_valid and oG_data signals.Finally, the oDone signal is asserted for one clock cyclewhenever 32 elements of the feature vector have beentransmitted.In the LP, the Block Neuron, Counter G, and theController do the PCNN computation, sum the output togenerate a vector, and control the operation, respectively.The Controller controls three RAMs address while RAMsdata are connected to the Block Neuron. The Counter Gsum the output pulses Y ij [n] from the Block Neuron tocomputes G[n]. The Block Neuron is a group of a fewneurons which its implementation is described in sec. 2.2.The signals' waveform of the operation is shown in Fig. 7.S ij , where i and j from 1 to 128, is the input pixel value at(i,j) position. A ij is the RAMs' addresses which store theinformation of the neuron (i,j). R ij and W ij are the neuron(i,j) information including pixel value S ij , output pulse Y ij ,feeding F ij , linking L ij , and threshold T ij . R ij is the readdata from previous step (i.e. step n-1), while W ij is thewrite data in the current step (i.e. step n). 
G[n] is the feature vector element generated in step n.

Figure 5 Top-view of the RAM-based PCNN module

Figure 6 RAM-based PCNN block diagram

The block diagram of the architecture is shown in Fig. 6. The system is composed of two parts: the memory part (MP) and the logic part (LP). The MP contains three RAMs to store the input image S_ij (RAM_S), the internal variables F_ij[n], L_ij[n], and T_ij[n] (RAM_FLT), and the output binary image Y_ij[n] (RAM_Y). Each RAM has the same number of memory cells (i.e. P_HV) but a different cell width, because the widths of their stored signals differ: RAM_S, RAM_FLT, and RAM_Y have 8 bits, 28 bits, and 1 bit per memory cell, respectively. Consequently, the total RAM capacity is given by Eq. 11. It is clear that the RAM resource cost of the model depends on the input image size only.

B_RAM (bit) = 37 · P_HV   (11)

Firstly, the Controller waits for the assertion of the iStart input signal to begin the operation at (A). After this, between (B) and (C), the input image is written to RAM_S. At (C), when the final pixel value S_128,128 has been stored, the Controller reads the image from the RAMs by asserting the RAMs' read enables and read addresses. Then, the read data R_ij, which are also the Block Neuron's input, are delivered by the three RAMs. At (D), the Controller asserts the three RAMs' write enables and write addresses when the write data W_ij, which are also the Block Neuron's output, are valid. Between (C) and (F), the previous-step data (i.e. step n-1) are read and processed in the Block Neuron, and the new data are written back into the RAMs as the current-step data (i.e. step n). After the Controller finishes the reading at (E), two more clock cycles are needed to finish the writing at (F). The oG_valid signal is asserted for one clock cycle at (F) to present the feature vector element G[1] on the oG_data signal. The process of generating the G[1] value is done at (G); simultaneously, the new process of generating the G[2] value begins.
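The read-modify-write loop between (C) and (G) can be modelled in software as follows. This is a behavioural sketch, not the RTL: the names are illustrative, the Tab. 1 parameter mapping is assumed positionally, and unit linking weights over the 3x3 field are used for brevity in place of the exact M = W constants.

```python
import numpy as np

BETA, ALPHA_F, ALPHA_L, ALPHA_T = 0.5, 1.0, 1.0, 0.02   # assumed Tab. 1 mapping
V_F, V_L, V_T = 0.02, 10.0, 0.2                         # assumed Tab. 1 mapping

def ram_based_step(ram_s, ram_flt, ram_y):
    """One element generation: for every address (i, j), read the step n-1
    record R_ij, pass it through the single neuron, write the step n record
    W_ij back, and let Counter G accumulate the pulses (Eq. 6)."""
    H, V = ram_s.shape
    new_flt = np.empty_like(ram_flt)          # RAM_FLT: F, L, T per cell
    new_y = np.zeros_like(ram_y)              # RAM_Y: pulse per cell
    for i in range(H):
        for j in range(V):
            f, l, t = ram_flt[i, j]
            # Unit-weight 3x3 linking field read from the step n-1 pulses
            link = ram_y[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2].sum()
            f = ram_s[i, j] + f * np.exp(-ALPHA_F) + V_F * link   # Eq. 1
            l = l * np.exp(-ALPHA_L) + V_L * link                 # Eq. 2
            u = f * (1.0 + BETA * l)                              # Eq. 3
            t = t * np.exp(-ALPHA_T) + V_T * ram_y[i, j]          # Eq. 4
            new_flt[i, j] = (f, l, t)
            new_y[i, j] = 1.0 if u > t else 0.0                   # Eq. 5
    return new_flt, new_y, float(new_y.sum())                     # G[n]
```

Calling `ram_based_step` 32 times, feeding each output state back in, mimics the repetition of the (C)-(G) process that produces the full 32-element vector.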
The process between (C) and (G) is repeated 31 more times in order to generate a full-length feature vector with 32 elements. As soon as the last feature vector element G[32] is valid at (H), the oDone signal is asserted for one clock cycle, in the same way as the oG_valid signal. Finally, the operation is finished at (I), and a 32-element feature vector has been generated. The RAM-based PCNN module then returns to the initial state at (A).

4.1.3 Speed Optimization

If the Block Neuron contains only one neuron implementation, there are P_HV + 3 clock cycles between (C) and (G) in Fig. 7 for one element generation; because of this, the total number of clock cycles for a 32-element feature vector generation is large. In order to process in real-time, the Block Neuron contains a group of neurons to speed up the operation. For example, if the Block Neuron implements a block of 2x2 neurons, the LP processes not one neuron per clock cycle but a group of four neurons instead. Then, the number of clock cycles between (C) and (E) is a quarter of P_HV, and the operation time is reduced approximately four times. Because the LP processes a block of 2x2 neurons per clock cycle, each memory cell in the RAMs has to store four neurons' information. This leads to a wider cell width but fewer total memory cells; as a result, the total RAM capacity in Eq. 11 remains constant. For faster implementations, the Block Neuron may contain not only a block of 2x2 neurons but also a block of 4x4 neurons, 8x8 neurons, etc. In short, there are multiple choices for implementing the RAM-based PCNN model, depending on the speed requirement.

4.2 Pipelined PCNN Architecture

4.2.1 Idea

The pipelined model is formed by N stages as shown in Fig. 8, where N is the feature vector's length. Each stage receives data from the previous stage, generates one feature vector element, and transmits its data to the next stage. Only registers are used in the model for buffering the data between stages.

4.2.2 Implementation

The top view of the pipelined PCNN module is shown in Fig. 9. The iData_valid and iData signals carry the 8-bit gray-scale input image to the first stage. The oG_valid and oG_data signals transmit the feature vector. The feature vector generation is done when the oDone signal is asserted. The top view of a stage is shown in Fig. 10.
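The cycle counts above give a quick throughput estimate for the RAM-based model. The sketch below assumes the roughly 3-cycle per-element overhead holds for every block size and ignores the one-off image-loading time, so treat it as an order-of-magnitude estimate rather than the measured figures of Tab. 2.

```python
def ram_based_vectors_per_second(block=1, image_pixels=128 * 128,
                                 elements=32, f_clk=50e6):
    """Rough RAM-based throughput: each of the `elements` feature-vector
    elements takes about image_pixels / block**2 + 3 clock cycles, where
    `block` is the side of the neuron block (1, 2, 4, 8, ...)."""
    cycles_per_element = image_pixels / block ** 2 + 3
    return f_clk / (elements * cycles_per_element)
```

At 50 MHz this yields roughly 95 vectors per second for a single neuron and about 6,000 for an 8x8 block, so the larger block sizes are what bring the model past the 2,000 vectors per second quoted above.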
Whenever the iValid signal is asserted, a stage receives one neuron's information from the previous stage, including the pixel value S_ij (iData_S), feeding input F_ij[n-1] (iData_F), linking input L_ij[n-1] (iData_L), threshold T_ij[n-1] (iData_T), and pulse Y_ij[n-1] (iY_data). Correspondingly, a set of signals, such as the oValid signal and the output data signals, transfers the stage information to the next stage. As soon as a stage has finished its work, it asserts the oDone signal and transmits the feature vector element on the oG signal.

Figure 8 The pipelined PCNN architecture with N stages

Figure 9 Top-view of the pipelined PCNN module

Figure 10 Top-view of a stage

In order to process the neuron (i, j) in step n, a stage requires the corresponding pixel value (i.e. S_ij), the step n-1 internal activities (i.e. F_ij[n-1], L_ij[n-1], T_ij[n-1]), and a linking field of 3x3 neurons (i.e. Y_{i-1,j-1}[n-1], Y_{i-1,j}[n-1], Y_{i-1,j+1}[n-1], Y_{i,j-1}[n-1], Y_{i,j}[n-1], Y_{i,j+1}[n-1], Y_{i+1,j-1}[n-1], Y_{i+1,j}[n-1], Y_{i+1,j+1}[n-1]). A stage's processing starts with the neuron (1,1) and finishes with the neuron (H,V), where H and V are the total column and row numbers, respectively; with the input image size of 128x128 pixels, both H and V equal 128. The neuron (1,1) is at the corner of the network and has a linking field of only 2x2 neurons (i.e. Y_{1,1}[n-1], Y_{1,2}[n-1], Y_{2,1}[n-1], Y_{2,2}[n-1]), with the result that stage n has to wait for the neuron (2,2) information from stage n-1 before it can start. Consequently, stage n's processing can start only after stage n-1 has processed H+2 neurons. The N stages' processing times are therefore staggered, as shown in Fig. 11.

Stage 1 starts first. After stage 1 has processed H+2 neurons, stage 2 has enough information to start. Then, stage 3 starts only when stage 2 has processed H+2 neurons, and so on. Stage 1 finishes and produces the G[1] value when it has processed the last neuron, the (H,V) neuron.
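The staggered start/finish pattern fixes the total latency of the pipeline. The small sketch below restates it, assuming one neuron-processing time per clock cycle.

```python
def pipeline_latency_cycles(H=128, V=128, N=32):
    """Total pipelined latency in neuron-processing times: stage 1 walks
    through all H*V neurons, and each of the remaining N-1 stages finishes
    H+2 neuron times after its predecessor."""
    return H * V + (N - 1) * (H + 2)

def pipelined_vectors_per_second(f_clk, H=128, V=128, N=32):
    """Throughput if one neuron is processed per clock cycle."""
    return f_clk / pipeline_latency_cycles(H, V, N)
```

For the 128x128 image and N = 32 this gives 16384 + 31 x 130 = 20414 cycles; at the 70.61 MHz maximum clock reported in Sec. 4.3 that is about 3,459 vectors per second, which matches the roughly 3,460 vectors per second quoted there, and at 50 MHz it comfortably exceeds 2,000 vectors per second.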
This means that stage 1 has processed P_HV neurons in total, where P_HV is the total number of neurons in the network. When stage 1 is done, stage 2 has H+2 more neurons to process; for this reason, stage 2 finishes H+2 neuron-processing times after stage 1. Then, stage 3 finishes H+2 neuron-processing times after stage 2, and so on. The operation is done when the last stage is done, and one 32-element feature vector has been generated.

It is clear that a stage needs buffers to store H+2 neurons' information. The stage design is shown in Fig. 12. There are five buffers in a stage, storing the pixel value S_ij (S buffer), feeding input F_ij[n-1] (F buffer), linking input L_ij[n-1] (L buffer), threshold T_ij[n-1] (T buffer), and the 3x3 linking field Y_ijkl[n-1] (Y buffer). As mentioned above, the buffers are constructed from registers only. The pixel value buffer and the internal variable buffers each utilize H+2 registers. The linking field is a 3x3 matrix, with the result that three line buffers are required, as can be seen in the figure. The data widths of the registers depend on the type of signal they carry. Hence, the total buffer capacity of one stage is given by Eq. 12.

B_Stage (bit) = 39·H + 72   (12)

Besides the buffers, the stage contains one PCNN neuron, a Counter, and a Controller, which perform the neuron computation, sum the output to produce an element value, and control the stage's activities, respectively.

4.2.3 Area Optimization

The first and the last stage are designed differently from the design in Fig. 10. Since the neuron's initial values are zeros, stage 1 needs only the input pixel values to operate; therefore, there are no buffers in the first stage, and its input signals are the iData_S and iValid signals only. The last stage's output signal is oY_data only, because it does not transmit stage data. Because stage 1 needs only the pixel values, the buffers in that stage are eliminated. The entire buffer capacity of the pipelined PCNN module can therefore be calculated by Eq. 13.

B_Pipelined (bit) = (N − 1) · B_Stage   (13)

Figure 11 The processing times of N stages are alternated

Figure 12 A stage design

Figure 13 The comparison between vectors

Table 2: Hardware usage and processing time results of the two PCNN architectures

4.3 Experimental Results

Both models have approximately the same processing speed but different resource costs. The differences between the two models' resource costs come from Eq. 11 and Eq. 13.
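Eqs. 11 through 13 make the scaling trade-off concrete; note that the reading B_Stage = 39·H + 72 of Eq. 12 is a reconstruction from the stated buffer widths, so the exact constant term is an assumption.

```python
def ram_based_bits(H=128, V=128):
    """Eq. 11: three RAMs of H*V cells at 8 + 28 + 1 = 37 bits per cell."""
    return 37 * H * V

def pipelined_bits(H=128, N=32):
    """Eqs. 12-13: N-1 buffered stages of B_Stage = 39*H + 72 bits each
    (stage 1 carries no buffers)."""
    return (N - 1) * (39 * H + 72)
```

For the 128x128-pixel, 32-element configuration this gives 606,208 RAM bits for the RAM-based model versus 156,984 register bits for the pipelined one: the former grows with the full image area, the latter only with N and the image width H.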
As can beseen in two equations, the resources cost of the RAMbasedmodel depends on the input image size, P HV , andthat of the pipelined model mainly depends on the featurevector length, N. Based on the specific requirement whichrequires the larger input image size or more feature vectorelements, the better model can be selected.Two models are verified by the Modelsim simulator andimplemented on Altera Stratix III board with anEP3SL150F1152C2 FPGA chip. The input image sizeand the feature vector length are chosen as 128x128pixels and 32 elements, respectively. Tab. 2 shows theresources utilization and the performance of twoarchitectures. As mentioned above, the RAM-based modelhas multiple choices for implementation such as oneneuron, block 2x2 neurons, block 4x4 neurons, and block8x8 neurons with the different resources usage. The lastrow in the table shows the required processing **time** whichis needed for one 32-element feature vector generation ineach model. For example, as the pipelined model needswhen using amaximum clock frequency of 70.61 MHz, the modelachieves the speed of approximate 3460 vectors persecond. Tab. 2 shows that the RAM-based model withlarger number of neurons produces vector faster but costsmore ALUTs and registers. As mentioned above, the totalRAM capacity in the RAM-based model depends on theinput image size only, although the model implementsmore neurons in the Block Neuron. As a result, theVolume 1, Issue 3, September – October 2012 Page 182

Web Site: www.ijettcs.org Email: editor@ijettcs.org, editorijettcs@gmail.comVolume 1, Issue 3, September – October 2012 ISSN 2278-6856memory costs of the RAM-based model in various optionsremain constant as can be seen in the table. The pipelinedmodel does not implement RAMs causes a zero value inthe memory costs. However, the register costs arenumerous.Fig. 13 shows the comparison between the three featurevectors: the ideal vector is calculated by Matlab andothers are obtained from the hardware designs. It is clearthat both models generate feature vectors which aresimilar to the ideal one. As mentioned above, the featurevector is anti-noise and invariant against the geometricalchanges. This is proved in Fig. 14. The Lena images arenoise (5%), and 300 angle rotation, but their featurevectors are similar as can be seen in the figure. Thefeature vectors which are used in the example aregenerated by the pipelined PCNN architecture. Inconclusion, the proposed implementations generate thefeature vectors which are similar to the ideal vector andstrongly robust against noise and geometrical changes inthe input image.stored in the DDR2 RAM by the DDR2 Controller. The**Feature** Vector Generation which is formed by one of twoproposed architectures processes on the reduced image inorder to generate a 32-element feature vector. Then, theSearch Engine identifies an object by comparing thereceived feature vector with the samples. Each of thefeature vector samples which are stored in the SamplesRAM represents for a specific object such as a computermouse, a car, a chair, a clock, etc. As soon as the SearchEngine had finished its work, it writes the result and theinput feature vector into the Result RAM. 
Finally, theDVI controller reads the original image from the off-chipDDR2 RAM and the feature vector from the Result RAMthen displays them both on the monitor.Figure 14 The object-recognition system block diagramFigure 13 The variation of the Lena images and theirfeature vectors5. AN OBJECT-RECOGNITION SYSTEM USINGPCNN ALGORITHM5.1. System OverviewIn order to verify the operation of the models, a completedsystem has been built including a camera, an FPGAboard, an off-chip DDR2 RAM, and a monitor. The blockdiagram of the object-recognition system is shown in Fig.15. The Camera transmits the image which has the fullHD resolution (i.e. 1920x1080 pixels) and 24-bit RGBcolor. The Camera Controller receives an image thentransfers it to the **Image** Pre-processing and the off-chipDDR2 RAM. The **Image** Pre-processing reduces the inputmage to 128x128 pixels with 8-bit gray-scale level. Onthe other hand, the original image with full resolution isFigure 15 The MSE values between the input image andthe sample imagesFigure 16 The completed system5.2 Object-recognition System Experimental ResultsVolume 1, Issue 3, September – October 2012 Page 183

As mentioned above, each input image has a unique feature vector that represents the object in that image; however, images containing similar objects generate similar vectors. The comparison between vectors is done through the MSE value presented in Sec. 3.3. Given an input image from the camera, an MSE value can be calculated for each pair of feature vectors. Fig. 16 shows an input image and templates with 7 sample images. The MSE value below each sample image is obtained by comparing the feature vector of that sample image with the feature vector generated from the input image. If the input image contains an object similar to one of the sample objects in the templates, their MSE value will be the smallest. Therefore, the system recognizes an object by finding the smallest MSE value via a winner-take-all circuit. It can be seen in the figure that the MSE value between the input image and the computer mouse sample image is the smallest; the system therefore recognizes the input object as a computer mouse.

The complete system has been built and tested successfully on an Altera Stratix III board with an EP3SL150F1152C2 FPGA chip. As can be seen in Fig. 17, the object-recognition system is processing a computer mouse image recorded by the camera. The DVI port transmits the image and the feature vector from the board to the monitor; the feature vector representing the computer mouse is shown in the top-left corner of the monitor. With a speed of more than 2,000 vectors per second, the system is suitable for real-time applications.

6. CONCLUSION

We have introduced two real-time hardware architectures for feature vector generation based on the PCNN algorithm and applied them to an object-recognition system in order to verify the operation of the models. They are the RAM-based model and the pipelined model.
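The winner-take-all search described in Sec. 5.2 reduces to an argmin over per-template MSE values; in the sketch below, a `templates` mapping from object names to stored 32-element feature vectors is an illustrative data structure, not the actual Samples RAM layout.

```python
import numpy as np

def entropy_vector(G, p_hv):
    """Eqs. 8-10, with a clip guard against log2(0)."""
    p1 = np.clip(np.asarray(G, dtype=float) / p_hv, 1e-12, 1.0 - 1e-12)
    return -p1 * np.log2(p1) - (1.0 - p1) * np.log2(1.0 - p1)

def recognize(G_input, templates, p_hv=128 * 128):
    """Return the template name with the smallest MSE (winner-take-all),
    together with the per-template scores (Eq. 7)."""
    E_in = entropy_vector(G_input, p_hv)
    scores = {name: float(np.mean((E_in - entropy_vector(G, p_hv)) ** 2))
              for name, G in templates.items()}
    best = min(scores, key=scores.get)
    return best, scores
```

An input vector identical to a stored sample scores an MSE of zero against it and therefore always wins, mirroring how the computer mouse sample wins in Fig. 16.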
With an input image size of 128x128 pixels and a feature vector length of 32 elements, both models can generate a feature vector at a speed of more than 2,000 vectors per second when using a clock frequency of 50 MHz. Additionally, the speed of the models is about 106 vectors per second for an image size of VGA resolution (i.e. 640x480 pixels) with the same feature vector length. The two architectures have been implemented and tested successfully on an Altera Stratix III board with an EP3SL150F1152C2 FPGA chip. The resource costs of the RAM-based model and the pipelined model depend primarily on the input image size and the feature vector's length, respectively; depending on whether the specific requirement calls for a larger input image size or for more feature vector elements, the better-suited model can be selected. The experimental results of the two architectures show the significant advantages of the PCNN model, namely its noise tolerance and its invariance against geometrical changes in the input image. Besides, the PCNN algorithm has the strong advantage that it does not require any training in order to operate. The PCNN model has thus been shown to be usable as a feature vector generator in a pattern recognition system.

Acknowledgements

This work was supported by the University of Science, Vietnam National University – Hochiminh City (VNU-HCMUS) under the fund No. B2011-18-04.

References

[1] R. Eckhorn, H.J. Reitboeck, M. Arndt, P.W. Dicke, "Feature linking via synchronization among distributed assemblies: Simulations of results from cat visual cortex," Neural Computation, Vol. 2, No. 3, pp. 293-307, Fall 1990.
[2] Henrik Berg, Roland Olsson, Thomas Lindblad, José Chilo, "Automatic design of pulse coupled neurons for image segmentation," Neurocomputing, Vol. 71, Issues 10-12, pp. 1980-1993, June 2008.
[3] Lifeng Shang, Zhang Yi, "A class of binary images thinning using two PCNNs," Neurocomputing, Vol. 70, Issues 4-6, pp. 1096-1101, Jan. 2007.
[4] Jun Chen, Kosei Ishimura, Mitsuo Wada, "Moving Object Extraction Using Multi-Tiered Pulse-Coupled Neural Network," In Proc. SICE Annual Conf., Vol. 3, pp. 2843-2848, Aug. 2004.
[5] Raul C. Muresan, "Pattern recognition using pulse-coupled neural networks and discrete Fourier transforms," Neurocomputing, Vol. 51, pp. 487-493, April 2003.
[6] Hitoshi Yamada, Yuuki Ogawa, Kosei Ishimura, Mitsuo Wada, "Face Detection using Pulse-Coupled Neural Network," In Proc. SICE Annual Conf., Vol. 3, pp. 2784-2788, Aug. 2003.
[7] John L. Johnson, "Pulse-coupled neural nets: translation, rotation, scale, distortion, and intensity signal invariance for images," Applied Optics, Vol. 33, Issue 26, pp. 6239-6253, 1994.
[8] Seiichi Ozawa, Yoshinori Sakaguchi, Manabu Kotani, "A study of feature extraction using supervised independent component analysis," In Proc. Int. Joint Conf. on Neural Networks, Vol. 4, pp. 2958-2963, 2001.
[9] Nan Liu, Han Wang, "Feature Extraction Using Evolutionary Weighted Principal Component Analysis," In IEEE Int. Conf. on Systems, Man and Cybernetics, Vol. 1, pp. 346-350, Oct. 2005.
[10] Yanwei Pang, Xuelong Li, Yuan Yuan, Dacheng Tao, Jing Pan, "Fast Haar Transform Based Feature Extraction for Face Representation and Recognition," IEEE Trans. on Information Forensics and Security, Vol. 4, Issue 3, pp. 441-450, Sep. 2009.
[11] Jun Chen, Tadashi Shibata, "A Neuron-MOS-Based VLSI Implementation of Pulse-Coupled Neural Networks for Image Feature Generation," IEEE Trans. on Circuits and Systems, Vol. 57, Issue 6, pp. 1143-1153, June 2010.
[12] Javier Vega-Pineda, Mario I. Chacón-Murguía, Roberto Camarillo-Cisneros, "Synthesis of Pulse-Coupled Neural Networks in FPGA for Real-time Image Segmentation," In Proc. Int. Joint Conf. on Neural Networks, pp. 4051-4055, July 2006.
[13] H.S. Ranganath, G. Kuntimad, "Object Detection Using Pulse Coupled Neural Networks," IEEE Trans. Neural Networks, Vol. 10, Issue 3, pp. 615-620, May 1999.
[14] Zhaobin Wang, Yide Ma, Feiyan Cheng, Lizhen Yang, "Review of pulse-coupled neural networks," Image and Vision Computing, Vol. 28, Issue 1, pp. 5-13, Jan. 2010.
[15] Michael A. Soderstrand, "CSD multipliers for FPGA DSP applications," In Proc. Int. Symposium on Circuits and Systems, Vol. 5, pp. 469-472, June 2003.
[16] John L. Johnson, "Time signatures of images," In Proc. IEEE Int. Conf. on Neural Networks, Vol. 2, pp. 1279-1284, June 1994.