Pattern Recognition 35 (2002) 143–154

Invariant character recognition with Zernike and orthogonal Fourier–Mellin moments

Chao Kan*, Mandyam D. Srinath

Department of Electrical Engineering, Southern Methodist University, c/o M.D. Srinath, Dallas, TX 75275, USA

Received and accepted 28 January 2000

* Corresponding author. Tel.: +1-972-996-4266; fax: +1-214-768-3573. E-mail address: kch@seas.smu.edu (C. Kan).

Abstract

In this paper, we consider the use of orthogonal moments for invariant classification of alphanumeric characters of different size. In addition to the Zernike and pseudo-Zernike moments (ZMs and PZMs) which have been previously proposed for invariant character recognition, a new method of combining orthogonal Fourier–Mellin moments (OFMMs) with centroid bounding circle scaling is introduced, which is shown to be useful in characterizing images with large variability. Through extensive experimentation using ZMs and OFMMs as features, different scaling methodologies and classifiers, it is shown that OFMMs give the best overall performance in terms of both image reconstruction and classification accuracy. © 2001 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.

Keywords: Character recognition; Pattern recognition; Moments; Zernike; Fourier–Mellin

1. Introduction

Invariant pattern recognition is a useful tool in machine vision applications such as automatic inspection. The performance of pattern recognition systems depends on the specific feature extraction technique used to represent a pattern by a set of numerical features and to reduce the dimension of the feature vector by removing redundancy from the data. The selected feature sets must possess small intraclass variability and large interclass separation. Classification of a pattern regardless of its orientation, size and location in the field of view requires that the selected features be invariant with respect to rotation, scale and translation. While the type of features depends on the specific patterns to be recognized, in recent years the use of moments has been proposed for invariant recognition of alphanumeric characters [1-5]. The use of geometric moments for the characterization of two-dimensional images was first introduced by Hu [6], who defined a class of moment invariants derived from the geometric moments, which are invariant to translational shifts, changes of scale or rotations of the image. From the uniqueness theorem of moments, an image is uniquely determined by its geometric moments of all orders. Low-order moments contain less information about image detail, while high-order moments are vulnerable to noise. The use of orthogonal moments makes it possible to describe an image with a finite number of moments and to benefit from the inclusion of high-order moments.

Teh and Chin [7] evaluated various types of image moments, including moment invariants, Legendre moments, Zernike moments, pseudo-Zernike moments, Fourier–Mellin moments and complex moments, in terms of noise sensitivity, information redundancy and capability of image description. Bailey and Srinath [8] investigated invariant character recognition using Legendre moments, Zernike moments, and pseudo-Zernike moments with different classifiers. They found that either Zernike moments or pseudo-Zernike moments have the best overall performance. Khotanzad and Hong [9] have shown that a neural network classifier using Zernike moments has very strong class separability power.

Zernike moments (ZMs) are orthogonal and rotation invariant. But when they are used for scale-invariant


pattern recognition, ZMs have difficulty in describing images of small size. In 1994, Sheng and Shen [10] showed that the orthogonal Fourier–Mellin moments (OFMMs), which can be thought of as generalized Zernike moments or orthogonalized complex moments, are better able to describe images of small size in terms of image reconstruction errors and signal-to-noise ratios. In addition, the order of independent OFMMs required to represent an image is much lower than that of ZMs, so that OFMMs can be more robust than ZMs if the characters have large variability.

In this paper, we consider the use of OFMMs with different scaling methodologies for invariant classification of alphanumeric characters from two different image databases. The first database is the same as that used by Khotanzad and Hong in Ref. [9] and the second one was developed by the National Institute of Standards and Technology (NIST) [11]. Simulation results for classification of the images in these two databases using both ZMs and OFMMs are presented and compared. While the scaling method used in Ref. [10] is to represent OFMMs by Fourier–Mellin moments (FMMs) and achieve scale invariancy for OFMMs by normalizing the FMMs, this method can make the scale-normalized OFMMs very sensitive to image variability. The effects of different scaling methods for achieving scale invariancy are studied, and a new methodology of combining OFMMs and centroid bounding circle scaling is introduced, which achieves better classification accuracy when images exhibit large intraclass variability. The paper is organized as follows. In Sections 2 and 3, ZM- and OFMM-based features are defined. Section 4 describes the basic properties and performance comparison of ZMs and OFMMs. In Section 5 we present several methods to achieve shift, rotation, and scale invariancy. The experimental results are shown in Section 6. Finally, some concluding remarks are given in Section 7.

2. Zernike moments

Zernike polynomials, pioneered by Teague [12] in image analysis, form a complete orthogonal set over the interior of the unit circle x^2 + y^2 <= 1. The Zernike function of order (p, q) is defined in the polar coordinate system (r, θ) as

W_pq(r, θ) = R_pq(r) e^{jqθ},  (1)

where

R_pq(r) = \sum_{k=q, p-k even}^{p} B_pqk r^k,  (2)

B_pqk = (-1)^{(p-k)/2} ((p+k)/2)! / [((p-k)/2)! ((q+k)/2)! ((k-q)/2)!].  (3)

ZMs of an image are the projections of the image function onto these orthogonal basis functions. The ZM of order p with repetition q for a digital image is defined as

Z_pq = ((p+1)/π) \sum_x \sum_y f(x, y) W*_pq(r, θ) Δx Δy.  (4)

Here, r is the length of the vector from the origin to pixel (x, y), θ is the angle between vector r and the x-axis in the counter-clockwise direction, and

x^2 + y^2 <= 1, x = r cos θ, y = r sin θ.

If the image is rotated through angle φ, the relationship between Z'_pq and Z_pq is [9]

Z'_pq = Z_pq e^{-jqφ}.  (5)

Then |Z_pq|, the magnitude of the Zernike moment, is a rotation-invariant feature of the underlying image. The computation time of ZMs can be reduced dramatically by using the explicit form of R_pq(r) as shown in Table 1 instead of Eq. (1) for orders up to 12.

3. Orthogonal Fourier–Mellin moments

It has been pointed out that there exist a large number of complete sets of polynomials that are rotation invariant and are orthogonal over the interior of the unit circle [13]. Here, we consider the use of the orthogonal Fourier–Mellin moments (OFMMs) introduced by Sheng and Shen [10] for character recognition, which are based on a set of radial polynomials.

The circular Fourier or radial Mellin moments (FMMs) of an image function f(r, θ) are defined in the polar coordinate system (r, θ) as

F_pq = \int_0^{2π} \int_0^1 r^p f(r, θ) e^{-jqθ} r dr dθ,  (6)

where q = 0, ±1, ±2, … is the circular harmonic order and the order of the Mellin radial transform is an integer p with p >= 0. We now introduce the polynomial Q_p(r) defined in Ref. [10] as

Q_p(r) = \sum_{k=0}^{p} α_pk r^k  (7)

with

α_pk = (-1)^{p+k} (p+k+1)! / [(p-k)! k! (k+1)!].  (8)

It can be verified that the set Q_p(r) is orthogonal over the range 0 <= r <= 1 [10]:

\int_0^1 Q_p(r) Q_m(r) r dr = a_p δ_pm,  (9)


Table 1
Explicit form of R_pq(r) when p is even or odd (p >= q and p - q even)

(0, 0): 1
(1, 1): r
(2, 0): 2r^2 - 1
(2, 2): r^2
(3, 1): 3r^3 - 2r
(3, 3): r^3
(4, 0): 6r^4 - 6r^2 + 1
(4, 2): 4r^4 - 3r^2
(4, 4): r^4
(5, 1): 10r^5 - 12r^3 + 3r
(5, 3): 5r^5 - 4r^3
(5, 5): r^5
(6, 0): 20r^6 - 30r^4 + 12r^2 - 1
(6, 2): 15r^6 - 20r^4 + 6r^2
(6, 4): 6r^6 - 5r^4
(6, 6): r^6
(7, 1): 35r^7 - 60r^5 + 30r^3 - 4r
(7, 3): 21r^7 - 30r^5 + 10r^3
(7, 5): 7r^7 - 6r^5
(7, 7): r^7
(8, 0): 70r^8 - 140r^6 + 90r^4 - 20r^2 + 1
(8, 2): 56r^8 - 105r^6 + 60r^4 - 10r^2
(8, 4): 28r^8 - 42r^6 + 15r^4
(8, 6): 8r^8 - 7r^6
(8, 8): r^8
(9, 1): 126r^9 - 280r^7 + 210r^5 - 60r^3 + 5r
(9, 3): 84r^9 - 168r^7 + 105r^5 - 20r^3
(9, 5): 36r^9 - 56r^7 + 21r^5
(9, 7): 9r^9 - 8r^7
(9, 9): r^9
(10, 0): 252r^10 - 630r^8 + 560r^6 - 210r^4 + 30r^2 - 1
(10, 2): 210r^10 - 504r^8 + 420r^6 - 140r^4 + 15r^2
(10, 4): 120r^10 - 252r^8 + 168r^6 - 35r^4
(10, 6): 45r^10 - 72r^8 + 28r^6
(10, 8): 10r^10 - 9r^8
(10, 10): r^10
(11, 1): 462r^11 - 1260r^9 + 1260r^7 - 560r^5 + 105r^3 - 6r
(11, 3): 330r^11 - 840r^9 + 756r^7 - 280r^5 + 35r^3
(11, 5): 165r^11 - 360r^9 + 252r^7 - 56r^5
(11, 7): 55r^11 - 90r^9 + 36r^7
(11, 9): 11r^11 - 10r^9
(11, 11): r^11
(12, 0): 924r^12 - 2772r^10 + 3150r^8 - 1680r^6 + 420r^4 - 42r^2 + 1
(12, 2): 792r^12 - 2310r^10 + 2520r^8 - 1260r^6 + 280r^4 - 21r^2
(12, 4): 495r^12 - 1320r^10 + 1260r^8 - 504r^6 + 70r^4
(12, 6): 220r^12 - 495r^10 + 360r^8 - 84r^6
(12, 8): 66r^12 - 110r^10 + 45r^8
(12, 10): 12r^12 - 11r^10
(12, 12): r^12

where δ_pm is the Kronecker delta, and a_p = 1/[2(p+1)] is a normalization constant, with r = 1 as the maximum radial size of the underlying character.

Then the (p, q) order OFMM function U_pq(r, θ) and the OFMMs O_pq can be defined in the polar coordinate system (r, θ) as

U_pq(r, θ) = Q_p(r) e^{jqθ},  (10)

O_pq = (1/(2π a_p)) \int_0^{2π} \int_0^1 f(r, θ) Q_p(r) e^{-jqθ} r dr dθ.  (11)

It follows from the above that the basis functions U_pq(r, θ) of the OFMMs are orthogonal over the interior of the unit circle.

The discrete versions of FMMs and OFMMs can be expressed in rectangular coordinates (x, y) as

F_pq = \sum_x \sum_y f(x, y) r^p e^{-jqθ} Δx Δy,  (12)

O_pq = ((p+1)/π) \sum_x \sum_y f(x, y) Q_p(r) e^{-jqθ} Δx Δy,  (13)

x^2 + y^2 <= 1, x = r cos θ, y = r sin θ.

The OFMMs are integrable when the degree p of Q_p(r) is p >= 0. By substituting Eqs. (6) and (7) into Eq. (11), we can express the OFMMs as linear combinations of FMMs:

O_pq = ((p+1)/π) \sum_{k=0}^{p} α_pk F_kq.  (14)

If the image is rotated through angle φ, the relationship between F'_pq and F_pq is

F'_pq = F_pq e^{-jqφ}  (15)

so that

O'_pq = O_pq e^{-jqφ}.  (16)

It therefore follows that |O_pq|, the magnitude of the OFMMs, is a rotation-invariant feature of the underlying image.

By substituting Eq. (12) into (4), we note that ZMs can also be expressed as linear combinations of Fourier–Mellin moments as

Z_pq = ((p+1)/π) \sum_k B_pqk F_kq,  (17)

where B_pqk is defined in Eq. (3).
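The coefficient formulas in Eqs. (3) and (8) and the orthogonality relation (9) can be checked numerically. The sketch below is our own illustration, not code from the paper; it reproduces the explicit forms in Table 1 and verifies Eq. (9) exactly using rational arithmetic.

```python
from fractions import Fraction
from math import factorial

def zernike_radial_coeffs(p, q):
    """{power k: B_pqk} for R_pq(r) from Eqs. (2)-(3); k = q, q+2, ..., p."""
    return {k: (-1) ** ((p - k) // 2) * factorial((p + k) // 2)
               // (factorial((p - k) // 2) * factorial((q + k) // 2)
                   * factorial((k - q) // 2))
            for k in range(q, p + 1, 2)}

def ofmm_radial_coeffs(p):
    """[alpha_p0, ..., alpha_pp] for Q_p(r) from Eqs. (7)-(8)."""
    return [Fraction((-1) ** (p + k) * factorial(p + k + 1),
                     factorial(p - k) * factorial(k) * factorial(k + 1))
            for k in range(p + 1)]

def radial_inner_product(a, b):
    """Exact int_0^1 (sum_i a_i r^i)(sum_j b_j r^j) r dr, term by term."""
    return sum(Fraction(ai) * Fraction(bj) * Fraction(1, i + j + 2)
               for i, ai in enumerate(a) for j, bj in enumerate(b))

# Table 1 entry: R_42(r) = 4r^4 - 3r^2
print(zernike_radial_coeffs(4, 2))                                 # {2: -3, 4: 4}
# Eq. (9): Q_2 and Q_3 are orthogonal; <Q_2, Q_2> = a_2 = 1/[2(2+1)]
q2, q3 = ofmm_radial_coeffs(2), ofmm_radial_coeffs(3)
print(radial_inner_product(q2, q3), radial_inner_product(q2, q2))  # 0 1/6
```

The same check passes for every entry of Table 1 and every pair (p, m) up to order 12.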


4. Image representation using OFMMs and ZMs

OFMMs have a single orthogonal set of radial polynomials Q_p(r) for all circular harmonic orders q, while ZMs have one orthogonal set of radial polynomials R_pq(r) for each different circular order q. Since Q_p(r) contains the natural powers 1, r, r^2, …, r^p, the equation Q_p(r) = 0 has p real and distinct roots in the interior of the unit circle [14], while the polynomials of Zernike moments, R_pq(r) = 0, have (p-q)/2 duplicated roots, apart from R_pq(0) = 0. Thus, for a given degree p and circular harmonic order q, Q_p(r) has p zeros, while R_pq(r) has (p-q)/2 zeros. It is known that the number of zeros of the radial polynomials corresponds to the capability of the polynomials to describe high spatial frequency components of the image. To have the same number p of zeros, the degree of R_pq(r) has to be as high as 2p + q, much higher than the degree p of the OFMMs. Therefore, the degree p of Q_p(r) in the OFMMs required to represent an image can be much lower than for a representation using ZMs. The higher the degree, the more sensitive are the independent moments to variation and noise. This gives an advantage to OFMMs over ZMs in the case that the characters have large variation and noise.

In addition, ZMs focus on the global features and catch less local information than OFMMs. As illustrated in Fig. 1, the zeros of Q_p(r) are nearly uniformly distributed over the unit circle, whereas the zeros of R_pq(r) are located in the region of large radial distance r from the origin. The first zero of Q_p(r), which is closest to the origin, is at r = 0.03, and the first zero of R_pq(r) is at r = 0.46. This difference is important, since in character recognition the character sizes are unknown a priori and the moments of characters of different sizes should be computed with the same basis functions. Hence, Q_p(r)-based OFMMs are more suitable for describing small images and recognizing handwritten characters which have fairly large size variation.

Notice that only approximately half of the OFMMs and ZMs for any order p will be independent. This is because O_{p,-q} = O*_pq and Z_{p,-q} = Z*_pq. In fact, the total number of independent OFMMs and ZMs is given by

N_OFMM = (p+1)^2 and N_ZM = (p+2)^2/4 (p even).

In order to evaluate how well OFMMs represent an image in comparison with ZMs, we reconstruct the original image using truncated expansions of a finite number of both OFMMs and ZMs. The reconstructed image from OFMMs and ZMs can be obtained by

f̂(r, θ) = \sum_{p=0}^{N} \sum_q O_pq Q_p(r) e^{jqθ},  (18)

f̂(r, θ) = \sum_{p=0}^{N} \sum_q Z_pq R_pq(r) e^{jqθ},  (19)

where N is the maximum order of moments used to reconstruct the image.

As an example, consider an image of character "B" with size 64×64. Figs. 2 and 3 show the 10 reconstructed images obtained using different orders of orthogonal Fourier–Mellin and Zernike polynomials corresponding to a maximum of 100 for the number of respective moments. To compare the reconstruction results for small characters, the original image of character "B" is subsampled to the size of 16×16. Figs. 4 and 5 show the 10 reconstructed images of this subsampled character using the same number of independent OFMMs and ZMs as in Figs. 2 and 3. Note that all the reconstructed images have been thresholded.

It is obvious that the OFMMs and ZMs have similar performance for larger-size characters. However, for the same character with size reduced to 16×16, the quality of the reconstructed images using OFMMs is clearly better than that obtained using the same number of ZMs. It can also be shown that OFMMs have better performance than ZMs in terms of noise sensitivity [10].

5. Determination of invariant features

Two primary goals in designing a recognition system are robustness and accuracy. The key to designing a good character recognition system is the choice of the features used to represent the input characters. If a good set of features can be found and extracted, the classification problem can be readily solved. Since in many cases the characters may be of different sizes and may be presented to the classifier at different orientations, it is desirable that these features be translation, rotation and scale invariant.

5.1. Rotation and translation invariance

As shown in previous sections, the magnitudes of the OFMMs and ZMs, |O_pq| and |Z_pq|, are rotation-invariant features of the underlying images. Translation invariancy is achieved by transforming the origin of the image to the geometric center before calculation of the moments. Given a two-dimensional M×M image f(x, y), the image geometric center (x̄, ȳ) can be obtained by

x̄ = m_10/m_00, ȳ = m_01/m_00,

where

m_pq = \sum_x \sum_y x^p y^q f(x, y).
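The totals N_OFMM = (p+1)^2 and N_ZM = (p+2)^2/4 (p even) given above can be checked by enumerating index pairs. This is our own illustration; in particular, we assume that the circular harmonic order of the OFMMs is also capped at the maximum order p, which is what makes the OFMM count come out to (p+1)^2.

```python
def independent_zm_count(p_max):
    """ZM pairs (p, q) with 0 <= q <= p <= p_max and p - q even
    (q < 0 only gives complex conjugates, hence no new features)."""
    return sum(1 for p in range(p_max + 1)
                 for q in range(p + 1) if (p - q) % 2 == 0)

def independent_ofmm_count(p_max):
    """OFMM pairs (p, q) with 0 <= p <= p_max and 0 <= q <= p_max;
    p and q carry no parity constraint, and q < 0 again gives conjugates."""
    return sum(1 for p in range(p_max + 1) for q in range(p_max + 1))

for p in (2, 4, 6, 8):
    assert independent_zm_count(p) == (p + 2) ** 2 // 4
    assert independent_ofmm_count(p) == (p + 1) ** 2
```

For the same maximum order, the OFMM feature set is therefore roughly four times larger, which is why a much lower maximum order suffices for the same number of independent moments.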


Fig. 1. (a) Q_p(r) of OFMMs; (b) R_pq(r) of ZMs.

5.2. Scale invariancy

In contrast with rotation and translation invariance, there are two methods for achieving scale invariance.

(1) The first approach to scale invariance is accomplished by enlarging or reducing each object such that the pixel coordinates of the object are mapped into the range of the unit circle. We take the centroid bounding (CB) circle for the object as this unit circle. Then fitting the object to a circle is a matter of finding the radius of a circle, centered on the centroid, which just includes the object. The centroid bounding circle can be obtained as follows:


(a) Find the bounding box and centroid (x_c, y_c) of the underlying image.
(b) Set r_max = 0.
(c) Inside the bounding box, perform the following steps for each line. For line y, find the left-most and right-most non-zero pixels x_1, x_2. If there are no pixels on the line, skip to the next line. Compute

x_i = x_i - x_c, i = 1, 2,
x = max[x_1, x_2] + 0.5,
y = y - y_c + 0.5,
r = sqrt(x^2 + y^2);

if r > r_max, set r_max = r. Continue with the next line.
(d) The maximum radius is r_max. This is used as the scaling factor.

(2) The second approach is Fourier–Mellin (FM) scaling, which achieves scale invariance for OFMMs and ZMs by normalizing the FMMs first [10]. As shown in Eqs. (14) and (17), both OFMMs and ZMs can be expressed as linear combinations of FMMs. When an image f(r, θ) is scaled by a factor k, its FMMs become

F'_pq = \int_0^{2π} \int_0^1 r^p f(r/k, θ) e^{-jqθ} r dr dθ = k^{p+2} F_pq,  (20)

where the F_pq are the Fourier–Mellin moments of the original image f(r, θ). Normalization by the dominant feature component F_p0 gives the scale-normalized FMMs as

F̄_pq = F_pq / F_p0.  (21)

Thus, after this normalization, both OFMMs and ZMs, computed based on the normalized FMMs, are also scale invariant. In addition, notice that F̄_p0 = 1 for all radial orders p, and this new zeroth circular-harmonic order feature should be omitted from the feature vector.

Fig. 2. Reconstructed images of character "B" with size 64×64 using OFMMs. From top left to bottom right are the original image and reconstructed images with OFMMs of N = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 corresponding to a total of 100 independent OFMMs.

Fig. 3. Reconstructed images of character "B" with size 64×64 using ZMs. From top left to bottom right are the original image and reconstructed images with ZMs of N = 0, 2, 4, 6, 8, 10, 12, 14, 16, 18 corresponding to a total of 100 independent ZMs.

Fig. 4. Reconstructed images of character "B" with size 16×16 using OFMMs.

Fig. 5. Reconstructed images of character "B" with size 16×16 using ZMs.
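Both scaling approaches can be sketched in code. The following is our own illustration (pure Python, binary images as nested 0/1 lists), not the authors' implementation: the first function follows the centroid bounding circle steps (a)-(d), taking the larger absolute horizontal distance on each line, and the other two implement a discrete version of Eq. (6) followed by the normalization of Eq. (21).

```python
import math

def centroid_bounding_radius(img):
    """Steps (a)-(d): radius of the smallest centroid-centered circle that
    encloses every object pixel; the 0.5 offsets allow for pixel extent."""
    pts = [(x, y) for y, row in enumerate(img)
                  for x, v in enumerate(row) if v]
    xc = sum(x for x, _ in pts) / len(pts)   # centroid (m10/m00, m01/m00)
    yc = sum(y for _, y in pts) / len(pts)
    r_max = 0.0
    for y in {yy for _, yy in pts}:          # scan each occupied line
        xs = [x for x, yy in pts if yy == y]
        dx = max(abs(min(xs) - xc), abs(max(xs) - xc)) + 0.5
        dy = abs(y - yc) + 0.5
        r_max = max(r_max, math.hypot(dx, dy))
    return r_max

def fmm(img, p, q):
    """Discrete FMM F_pq in the spirit of Eqs. (6)/(12), with the pixel grid
    mapped into the unit circle (pixels outside it are ignored)."""
    n = len(img)
    c = (n - 1) / 2.0
    total = 0j
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            if v:
                xr, yr = (x - c) / c, (y - c) / c
                r = math.hypot(xr, yr)
                if r <= 1:
                    theta = math.atan2(yr, xr)
                    total += v * r ** p * complex(math.cos(q * theta),
                                                  -math.sin(q * theta))
    return total

def scale_normalized_fmm(img, p, q):
    """Eq. (21): F-bar_pq = F_pq / F_p0, so F-bar_p0 = 1 for every p."""
    return fmm(img, p, q) / fmm(img, p, 0)
```

For a single-pixel object the CB radius is sqrt(0.5^2 + 0.5^2), and by construction scale_normalized_fmm(img, p, 0) is exactly 1, which is why that component is dropped from the feature vector.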


Generally, we have to keep in mind that scale invariancy is approximate, since the image function is digital, and thus a digital image at a larger scale has more sample grid points than a smaller one.

The limitation of the first method is that if the object is very small, the centroid bounding circle process, which normalizes the small image into the unit circle occupied by the large-size images, will cause considerable distortion of the original image and therefore of its extracted features. The accuracy of the second approach, on the other hand, is determined by the denominator F_p0. If the characters have large variability, this scaling method can cause even more distortion than the rescaling used in the first method. This is especially true for handwritten characters. Therefore, each method has a different impact on recognition accuracy and should be used in the appropriate situation. In this paper, the centroid bounding circle scaling and the scaled FMM scaling are studied and compared.

6. Experimental study

6.1. Experimental database

In our experiments, we first select the same database as used in Ref. [9] to compare our results. This database consists of 64×64 binary images of all 26 printed upper-case English characters. The set of 24 images per character consists of six images with arbitrarily varying scales, orientations, and translations from each of the four considered silhouettes per character.
The database of small characters is obtained by decomposing the original images in this database into size 32×32 using wavelet multiresolution analysis.

In order to compare the performance of both OFMMs and ZMs in the situation where the same character has large variation, we also use the popular database of handprinted characters provided by NIST [11]. This database consists of 119,740 images of handprinted digits, 24,205 lower-case letters, and 24,420 upper-case letters. For convenience, only upper-case letters are used in our study.

Generally, the first database of printed characters does not have much writing-style variation, while the NIST database, consisting of handwritten characters collected from 1000 different writers, shows considerable variability. However, unlike the first database, little rotation is involved in the NIST database except for some small slant variation.

6.2. Classifiers

The multilayer perceptron (MLP) classifier [15], in which the number of nodes in the hidden layer was allowed to vary, was used to classify the testing samples in the first database. The classifiers used for the NIST database include the nearest neighbour (NN) [16], the minimum-mean-distance (MMD) [16] and the MLP. Based on the recommendation from NIST, the probabilistic neural network (PNN) classifier [17] was also selected in the experiments for the NIST database.

In order to prevent one or a subgroup of feature vector components from dominating the distance measure, the features were normalized by subtracting the sample mean and dividing by the standard deviation of each feature from the corresponding class.

6.3. Pre-processing

To minimize the negative effect of character strokes on the final classification, a skeletonization or thinning algorithm is used for the first database.
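The per-class feature normalization described at the end of Section 6.2 can be sketched as follows. This is our own illustration; the function names and the population-variance convention are assumptions, since the paper gives no implementation details.

```python
from collections import defaultdict

def per_class_zscore(train_vectors, train_labels):
    """Compute per-class, per-component sample mean and standard deviation,
    and return a normalizer so that no subgroup of feature components
    dominates the distance measure."""
    by_class = defaultdict(list)
    for vec, lab in zip(train_vectors, train_labels):
        by_class[lab].append(vec)
    stats = {}
    for lab, vecs in by_class.items():
        n, d = len(vecs), len(vecs[0])
        mean = [sum(v[i] for v in vecs) / n for i in range(d)]
        std = [(sum((v[i] - mean[i]) ** 2 for v in vecs) / n) ** 0.5 or 1.0
               for i in range(d)]          # guard: constant feature -> 1.0
        stats[lab] = (mean, std)
    def normalize(vec, lab):
        mean, std = stats[lab]
        return [(x - m) / s for x, m, s in zip(vec, mean, std)]
    return normalize

norm = per_class_zscore([[0.0, 10.0], [2.0, 30.0]], ["A", "A"])
print(norm([2.0, 30.0], "A"))   # [1.0, 1.0]
```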
The pre-processing for the characters from the NIST database includes the size, stroke and slant normalization recommended by NIST [11]. Size normalization bounds the character data within a segmented image by a box, and that box is scaled to fit exactly within a 20×32 pixel region. A simple morphological operator is applied to normalize the character stroke. If the pixel content of a character image is significantly high, then the image is eroded and the strokes are thinned. If the pixel content of a character image is significantly low, then the image is dilated and the strokes are widened.

Slant normalization is achieved by shifting the rows in the image either left or right to straighten the character. Given a segmented character image, the top and bottom rows containing character pixels are located. The leftmost character pixel is located in each of the two rows, and a linear shifting function is calculated to shift the rows in the image so that, when finished, the leftmost pixels in the top and bottom rows line up in the same column. The rows between the top and bottom are shifted in lesser amounts based on the linear shifting function [11]. A slope factor f defining the linear shifting function is calculated as

f = (t_1 - b_1)/(b - t),

where t is the vertical position of the top row, b is the vertical position of the bottom row, t_1 is the horizontal position of the leftmost character pixel in the top row, and b_1 is the horizontal position of the leftmost character pixel in the bottom row.
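A sketch of this slant normalization in our own code: each row r is sheared by the shift coefficient s = (r - m)f defined next in the text, rounded to the nearest pixel, which lines up the leftmost pixels of the top and bottom occupied rows.

```python
def slant_normalize(img):
    """Shear rows of a binary image (nested 0/1 lists) so the leftmost object
    pixels of the top and bottom occupied rows end up in the same column.
    Slope factor f = (t1 - b1)/(b - t); row r is shifted by round((r - m) f),
    with m the vertical middle, so the shear is centered on the image."""
    rows = [y for y, row in enumerate(img) if any(row)]
    t, b = rows[0], rows[-1]
    t1, b1 = img[t].index(1), img[b].index(1)
    if b == t:                        # single-row object: nothing to straighten
        return [row[:] for row in img]
    f = (t1 - b1) / (b - t)
    m = len(img) / 2.0
    w = len(img[0])
    out = [[0] * w for _ in img]
    for y, row in enumerate(img):
        s = round((y - m) * f)        # positive: shift right; negative: left
        for x, v in enumerate(row):
            if v and 0 <= x + s < w:
                out[y][x + s] = v
    return out
```

Pixels pushed outside the image are clipped, and the exact rounding convention is an assumption on our part; the code follows the NIST description in spirit rather than reproducing their reference implementation.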
The slope factor is used to compute a shift coefficient as follows:

s = (r - m) f,

with r being a vertical row index in the image and m equal to the vertical middle of the image. This causes the shifting to be centered about the middle of the image. A positive value of the shift coefficient causes a row to be shifted s pixel positions to the right, and a negative value causes a row to be shifted s pixels to the left.

6.4. Analysis of results

6.4.1. Results based on the first database

Two sets of studies are carried out using a training set that contains 16 skeletonized character images of size 64×64 out of the 24 samples of all scales for each class.

The first study uses the remaining 8 skeletonized samples of the same size, 64×64, for testing. Fig. 6 illustrates the classification accuracy as the number of selected features increases from four to the full set of respective moment features when the multilayer perceptron (MLP) classifier is used. It can be seen that the classification performance using OFMMs as features is comparable to ZM features in this case, no matter what kind of scaling technique is used. This is because all trained and tested images here are of large size, 64×64, and the OFMMs and the ZMs have similar performance for describing larger-size characters.

The second study considers the case when small images of size 32×32 are used for testing. Fig. 7 demonstrates the classification accuracy under similar situations corresponding to the first case. It is seen that using OFMMs with FM scaling in this case achieves much better performance than ZMs, in that OFMMs contain some local information and can represent small images better than ZMs, as discussed previously. However, notice that both OFMMs and ZMs with CB scaling give higher misclassification than FM scaling in this case, because the skeletonized small character images have very few pixels, and CB scaling, which normalizes the small image into the unit circle, causes more distortion of the original images, and therefore their extracted features, than FM scaling.

The performance of the MLP classifier under the two studies is illustrated in Figs. 8 and 9 as the number of hidden nodes increases from 8 to 100. As can be seen, the best performance is obtained with the number of hidden nodes at around 20 for both OFMMs and ZMs. Notice that OFMMs with FM scaling perform the best in this case when small-size images are used for testing.

6.4.2. Results based on the NIST database

The characters in the NIST database have very large variation. The same characters printed by the same writer can vary greatly in size, slant and stroke. Therefore, we design two cases to investigate the performance of OFMM and ZM features using different scaling methods. In our experiments, 15,600 upper-case images are used for training and 6500 images are used for testing. In the first case, all training and testing samples are pre-processed for size, slant and stroke normalization, while in the second case, training and testing samples are only pre-processed for stroke normalization. Table 2 gives the results of different combinations of classifiers, moment features and scaling methods under these two situations.

Fig. 6. Classification result comparison between OFMMs and ZMs using both CB and FM scaling to achieve scale invariance. Sixteen samples per character with size 64×64 are used for training, and the remaining eight large character samples are used for testing.


Fig. 7. Classification result comparison between OFMMs and ZMs using both CB and FM scaling to achieve scale invariance. Sixteen samples per character with size 64×64 are used for training, and the smaller images obtained from the remaining eight larger character samples are used for testing.

Fig. 8. Classification result comparison between OFMMs and ZMs using both CB and FM scaling to achieve scale invariance as the number of hidden nodes of the MLP increases. Sixteen samples per character with size 64×64 are used for training, and the remaining eight large character samples are used for testing.


152 C. Kan, M.D. Srinath / Pattern Recognition 35 (2002) 143}154Fig. 9. Classi"cation result comparison between OFMMs <strong>and</strong> ZMs using both CB <strong>and</strong> FM scaling to achieve scale invariance as thenumber of hidden nodes of MLP increases. Sixteen samples per <strong>character</strong> <strong>with</strong> size 6464 are used for training, <strong>and</strong> the smaller imagesobtained from the remaining eight larger <strong>character</strong> samples are used for testing.Table 2Classi"cation accuracy under di!erent classi"ers in the two cases for NIST database CB"centroid bounding circle scaling,FM"scaled <strong>Fourier</strong>}Mellin moment scalingClassi"er/ ZMs OFMMs ZMs OFMMsfeatures CB scaling CB scaling FM scaling FM scaling(%) (%) (%) (%)MMD 64.3 62.9 42.5 48.3Case 1 NN 79.2 77.8 56.7 61.1MLP 73.1 71.7 54.6 59.3PNN 82.1 81.3 51.4 56.5MMD 53.2 58.1 36.4 41.3Case 2 NN 63.2 67.1 46.2 51.5MLP 61.8 65.7 45.1 49.5PNN 67.7 72.9 48.6 53.4It can be seen that when training <strong>and</strong> testing samplesare size, slant <strong>and</strong> stroke normalized, the classi"cationperformance under CB scaling using OFMM features iscomparable to ZM features. However, ZM features,which focus on the global features, achieve better accuracythan OFMMs in this case. This is because theadvantage of OFMMs for small images is that OFMMsbetter describe these small images when they are relativelysmall <strong>with</strong> respect to other images <strong>with</strong>in the database<strong>and</strong> they are normalized using the large image size. Here,all images have the same size <strong>and</strong> OFMMs lose theiradvantage. 
In addition, the large variability of the original samples is greatly reduced after size and slant normalization, so the sensitivity of the high-order ZMs to variation has less impact on the classification accuracy.

However, OFMM features under CB scaling give the best overall results when the training and testing samples are only stroke normalized, because in this situation the size is not normalized and OFMM features can describe


small images more accurately. In addition, the large variation of the training and testing samples in this case makes centroid-bounding circle scaling better than scaled FMM scaling in general. Also notice that the pre-processing for size and slant normalization does improve the overall classification performance for both ZM and OFMM features.

7. Conclusions

We have shown that orthogonal Fourier–Mellin moments (OFMMs) are based on radial polynomials that contain more local information about the underlying character images and can be used to represent small characters in the same way as large characters, while Zernike moments are based on polynomials that capture more global information but describe small characters less accurately. In addition, since the order of independent OFMMs required to represent an image is much lower than that of ZMs, the OFMMs are much less sensitive to character variations and noise than ZMs.

The comparison of their properties and classification results, as well as image reconstruction using a finite number of truncated moments, shows that OFMMs describe small images more accurately and perform better than ZMs in the classification of handwritten characters when large-size character samples are used for training and relatively smaller characters are used for testing. In addition, for a larger character image testing set with less variation, OFMMs achieve performance very comparable to Zernike moments.

Extensive experimentation also shows that scaling methodologies have a large impact on the final
classification. When images show large variability, the new method of combining OFMMs with centroid-bounding circle scaling behaves better than the FM scaling used in Ref. [10], because the new method causes less distortion when the moments are computed and normalized. However, when characters with less variability are skeletonized, especially small-size characters, the FM scaling performs better, since re-scaling the fewer pixels in the skeletonized small images introduces considerable computation noise into the respective moment features.

In terms of overall performance on image reconstruction and classification accuracy, orthogonal Fourier–Mellin moments prove to be better suited for character recognition systems that must recognize both large and small character images.

8. Summary

The performance of pattern recognition systems depends on the specific features used to represent a pattern. The selected feature sets must possess small intraclass variability and large interclass separation.
Classification of a pattern regardless of its orientation, size and location in the field of view requires that the selected features be invariant with respect to rotation, scale and translation. In recent years, the use of geometric moments has been proposed for invariant recognition of alphanumeric characters. The use of geometric moments for characterization of two-dimensional images was first introduced by Hu, who defined a class of moment invariants derived from the geometric moments that are invariant to translational shifts, changes of scale and rotations of the image.

While an image is uniquely determined by its geometric moments of all orders, low-order moments contain less information about image detail, while high-order moments are vulnerable to noise. The use of orthogonal moments makes it possible to describe an image with a finite number of moments and benefit from the inclusion of high-order moments. Teh and Chin evaluated various types of image moments, including moment invariants, Legendre moments, Zernike moments, pseudo-Zernike moments, Fourier–Mellin moments and complex moments, in terms of noise sensitivity, information redundancy and capability of image description. Bailey and Srinath investigated invariant character recognition using Legendre moments, Zernike moments, and pseudo-Zernike moments with different classifiers. They found that either Zernike moments or pseudo-Zernike moments have the best overall performance. Khotanzad and Hong have shown that a neural network classifier using Zernike moments has very strong class separability power.
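To make Hu's construction concrete, the following numpy sketch (our own illustration, not code from any of the cited works) computes the first invariant, phi1 = eta20 + eta02, from scale-normalized central moments; it is exactly invariant to translation and, up to discretization effects, to scale and rotation.

```python
import numpy as np

def hu_phi1(img):
    """First Hu moment invariant phi1 = eta20 + eta02, built from
    central moments normalized for scale."""
    img = img.astype(float)
    ys, xs = np.indices(img.shape)
    m00 = img.sum()                     # zeroth-order geometric moment
    cx = (xs * img).sum() / m00         # centroid x
    cy = (ys * img).sum() / m00         # centroid y
    # central moment mu_pq = sum over (x, y) of (x-cx)^p (y-cy)^q f(x, y)
    mu = lambda p, q: ((xs - cx) ** p * (ys - cy) ** q * img).sum()
    # scale-normalized moment eta_pq = mu_pq / m00^((p+q)/2 + 1)
    eta = lambda p, q: mu(p, q) / m00 ** ((p + q) / 2 + 1)
    return eta(2, 0) + eta(0, 2)
```

Since only central moments enter, translating the character inside the frame leaves phi1 unchanged, which is the translation invariance referred to above.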
Zernike moments are orthogonal and rotation invariant, but when they are used for scale-invariant pattern recognition, ZMs have difficulty describing images of small size. In 1994, Sheng and Shen showed that the orthogonal Fourier–Mellin moments (OFMMs), which can be thought of as generalized Zernike moments or orthogonalized complex moments, are better able to describe images of small size in terms of image reconstruction errors and signal-to-noise ratios. In addition, the order of independent OFMMs required to represent an image is much lower than that of ZMs, so that OFMMs can be more robust than ZMs when the characters have large variability.

In this paper, we consider the use of OFMMs for invariant classification of alphanumeric characters chosen from two different image databases. Simulation results, using two different classifiers, for classification of the images in these two databases using both ZMs and OFMMs are presented and compared. The effects of different scaling methods for achieving scale invariance are studied, and a new methodology combining OFMMs and centroid bounding circle scaling is introduced. The simulation results show that orthogonal Fourier–Mellin moments perform better than ZMs for


images of small size, particularly when images exhibit large intraclass variability.

References

[1] M. Fang, G. Hausler, Class of transforms invariant under shift, rotation, and scaling, Appl. Opt. 29 (1990) 704–708.
[2] T.H. Reiss, The revised fundamental theorem of moment invariants, IEEE Trans. Pattern Anal. Mach. Intell. 13 (1991) 830–833.
[3] L. Shen, Y. Sheng, Noncentral image moments for invariant pattern recognition, Opt. Eng. 34 (1995) 3181–3186.
[4] M. Gruhen, K.Y. Hsu, Moment-based image normalization with high noise tolerance, IEEE Trans. Pattern Anal. Mach. Intell. 19 (1997) 136–139.
[5] D.G. Shen, H.S. Horace, Generalized affine invariant image normalization, IEEE Trans. Pattern Anal. Mach. Intell. 19 (1997) 431–440.
[6] M.K. Hu, Visual pattern recognition by moment invariants, IRE Trans. Inform. Theory (1962) 179–187.
[7] C.H. Teh, R.T. Chin, On image analysis by the methods of moments, IEEE Trans. Pattern Anal. Mach. Intell. 10 (1988) 496–513.
[8] R.R. Bailey, M.D. Srinath, Orthogonal moment features for use with parametric and non-parametric classifiers, IEEE Trans. Pattern Anal. Mach. Intell. 18 (1996).
[9] A. Khotanzad, Y.H. Hong, Classification of invariant image representations using a neural network, IEEE Trans. Acoust. Speech Signal Process. 38 (1990) 1028–1038.
[10] Y. Sheng, L. Shen, Orthogonal Fourier–Mellin moments for invariant pattern recognition, J. Opt. Soc. Am. 11 (1994) 1748–1757.
[11] M.D. Garris, J.L. Blue, NIST Form-Based Handprint Recognition System, National Institute of Standards and Technology, 1995.
[12] M. Teague, Image analysis via the general theory of moments, J. Opt. Soc. Am. 70 (1980) 920–930.
[13] A.B. Bhatia, E. Wolf, On the circular polynomials of Zernike and related orthogonal sets, Proc. Cambridge Philos. Soc. 50 (1954) 40–48.
[14] D. Jackson, Fourier Series and Orthogonal Polynomials, Mathematical Association of America, 1941, Chapters VII and XI.
[15] D.E. Rumelhart, J.L. McClelland, Parallel Distributed Processing, Vol. 1: Foundations, MIT Press, Cambridge, MA, 1986.
[16] K. Fukunaga, Introduction to Statistical Pattern Recognition, 2nd Edition, Academic Press, New York, 1990.
[17] D.F. Specht, Probabilistic neural networks, Neural Networks 3 (1990) 109–118.

About the Author: CHAO KAN received the B.S. and M.S. degrees in Electrical Engineering from Nanjing University of Posts and Telecommunications, Nanjing, China, in 1988 and 1991, respectively. He obtained a Ph.D. in Electrical Engineering from Southern Methodist University, Dallas, TX, in 2000.
He joined the Research and Development group at Raytheon Systems Company in 1995, where he worked as a research engineer. He has been with Alcatel USA since 1999 as a research scientist in the Corporate Research Center. His current research interests include computer vision, image processing, wavelets, multimedia systems and Internet network management.

About the Author: MANDYAM D. SRINATH received the B.Sc. degree from the University of Mysore, India, in 1954, the Diploma in Electrical Technology from the Indian Institute of Science, Bangalore, India, in 1957, and the M.S. and Ph.D. degrees in Electrical Engineering from the University of Illinois, Urbana, in 1959 and 1962, respectively.
He has been on the Electrical Engineering faculties at the University of Kansas, Lawrence, and the Indian Institute of Science, Bangalore, India.
He is currently Professor of Electrical Engineering at Southern Methodist University, Dallas, TX, where he has been since 1967.
He has published numerous papers in signal and image processing, control and estimation theory. He is principal author of the book "Introduction to Statistical Signal Processing with Applications" (with P.K. Rajasekaran and R. Viswanathan), Prentice-Hall, 1996, and co-author of "Continuous and Discrete Signals and Systems" (with S. Soliman), Prentice-Hall, 1990, 1998. His current research interests are in adaptive filters, image processing, and video data compression.
Dr. Srinath is a Senior Member of the Institute of Electrical and Electronics Engineers and a Registered Professional Engineer in Texas. He is an Associate Editor of Pattern Recognition.
