Pattern Recognition 35 (2002) 143–154

Invariant character recognition with Zernike and orthogonal Fourier–Mellin moments

Chao Kan*, Mandyam D. Srinath

Department of Electrical Engineering, Southern Methodist University, c/o M.D. Srinath, Dallas, TX 75275, USA

Received and accepted 28 January 2000

* Corresponding author. Tel.: +1-972-996-4266; fax: +1-214-768-3573. E-mail address: kch@seas.smu.edu (C. Kan).

Abstract

In this paper, we consider the use of orthogonal moments for invariant classification of alphanumeric characters of different size. In addition to the Zernike and pseudo-Zernike moments (ZMs and PZMs) which have been previously proposed for invariant character recognition, a new method of combining orthogonal Fourier–Mellin moments (OFMMs) with centroid bounding circle scaling is introduced, which is shown to be useful in characterizing images with large variability. Through extensive experimentation using ZMs and OFMMs as features, different scaling methodologies and classifiers, it is shown that OFMMs give the best overall performance in terms of both image reconstruction and classification accuracy. © 2001 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.

Keywords: Character recognition; Pattern recognition; Moments; Zernike; Fourier–Mellin

1. Introduction

Invariant pattern recognition is a useful tool in machine vision applications such as automatic inspection. The performance of pattern recognition systems depends on the specific feature extraction technique used to represent a pattern by a set of numerical features and to reduce the dimension of the feature vector by removing redundancy from the data. The selected feature sets must possess small intraclass variability and large interclass separation. Classification of a pattern regardless of its orientation, size and location in the field of view requires that the selected features be invariant with respect to rotation, scale and translation. While the type of features depends on the specific patterns to be recognized, in recent years the use of moments has been proposed for invariant recognition of alphanumeric characters [1-5]. The use of geometric moments for the characterization of two-dimensional images was first introduced by Hu [6], who defined a class of moment invariants derived from the geometric moments, which are invariant to translational shifts, changes of scale or rotations of the image. From the uniqueness theorem of moments, an image is uniquely determined by its geometric moments of all orders. Low-order moments contain less information about image detail, while high-order moments are vulnerable to noise. The use of orthogonal moments makes it possible to describe an image with a finite number of moments and to benefit from the inclusion of high-order moments.

Teh and Chin [7] evaluated various types of image moments, including moment invariants, Legendre moments, Zernike moments, pseudo-Zernike moments, Fourier–Mellin moments and complex moments, in terms of noise sensitivity, information redundancy and capability of image description. Bailey and Srinath [8] investigated invariant character recognition using Legendre moments, Zernike moments, and pseudo-Zernike moments with different classifiers. They found that either Zernike moments or pseudo-Zernike moments have the best overall performance. Khotanzad and Hong [9] have shown that a neural network classifier using Zernike moments has very strong class separability power.

Zernike moments (ZMs) are orthogonal and rotation invariant. But when they are used for scale-invariant


pattern recognition, ZMs have difficulty in describing images of small size. In 1994, Sheng and Shen [10] showed that the orthogonal Fourier–Mellin moments (OFMMs), which can be thought of as generalized Zernike moments or orthogonalized complex moments, are better able to describe images of small size in terms of image reconstruction errors and signal-to-noise ratios. In addition, the order of independent OFMMs required to represent an image is much lower than that of ZMs, so that OFMMs can be more robust than ZMs if the characters have large variability.

In this paper, we consider the use of OFMMs with different scaling methodologies for invariant classification of alphanumeric characters from two different image databases. The first database is the same as that used by Khotanzad and Hong in Ref. [9] and the second one was developed by the National Institute of Standards and Technology (NIST) [11]. Simulation results for classification of the images in these two databases using both ZMs and OFMMs are presented and compared. While the scaling method used in Ref. [10] is to represent OFMMs by Fourier–Mellin moments (FMMs) and achieve scale invariancy for OFMMs by normalizing the FMMs, this method can make the scale-normalized OFMMs very sensitive to image variability. The effects of different scaling methods for achieving scale invariancy are studied, and a new methodology of combining OFMMs and centroid bounding circle scaling is introduced, which achieves better classification accuracy when images exhibit large intraclass variability. The paper is organized as follows. In Sections 2 and 3, ZM- and OFMM-based features are defined. Section 4 describes the basic properties and performance comparison of ZMs and OFMMs. In Section 5 we present several methods to achieve shift, rotation, and scale invariancy. The experimental results are shown in Section 6. Finally, some concluding remarks are given in Section 7.

2. Zernike moments

Zernike polynomials, pioneered by Teague [12] in image analysis, form a complete orthogonal set over the interior of the unit circle x^2 + y^2 <= 1. The Zernike function of order (p, q) is defined in the polar coordinate system (r, θ) as

W_pq(r, θ) = R_pq(r) e^{jqθ},  (1)

where

R_pq(r) = \sum_{k=q, p-k even}^{p} B_pqk r^k,  (2)

B_pqk = (-1)^{(p-k)/2} ((p+k)/2)! / [((p-k)/2)! ((q+k)/2)! ((k-q)/2)!].  (3)

ZMs of an image are the projections of the image function onto these orthogonal basis functions. The ZM of order p with repetition q for a digital image is defined as

Z_pq = ((p+1)/π) \sum_x \sum_y f(x, y) W*_pq(r, θ) Δx Δy.  (4)

Here, r is the length of the vector from the origin to pixel (x, y), θ is the angle between vector r and the x-axis in the counter-clockwise direction, and

x^2 + y^2 <= 1, x = r cos θ, y = r sin θ.

If the image is rotated through angle φ, the relationship between Z'_pq and Z_pq is [9]

Z'_pq = Z_pq e^{-jqφ}.  (5)

Then |Z_pq|, the magnitude of the Zernike moment, is a rotation-invariant feature of the underlying image. The computation time of ZMs can be reduced dramatically by using the explicit form of R_pq(r) as shown in Table 1 instead of Eq. (1) for orders up to 12.

3. Orthogonal Fourier–Mellin moments

It has been pointed out that there exist a large number of complete sets of polynomials that are rotation invariant and are orthogonal over the interior of the unit circle [13]. Here, we consider the use of the orthogonal Fourier–Mellin moments (OFMMs) introduced by Sheng and Shen [10] for character recognition, which are based on a set of radial polynomials.

The circular Fourier or radial Mellin moments (FMMs) of an image function f(r, θ) are defined in the polar coordinate system (r, θ) as

F_pq = \int_0^{2π} \int_0^1 r^p f(r, θ) e^{-jqθ} r dr dθ,  (6)

where q = 0, ±1, ±2, … is the circular harmonic order and the order of the Mellin radial transform is an integer p with p >= 0. We now introduce the polynomial Q_p(r) defined in Ref. [10] as

Q_p(r) = \sum_{k=0}^{p} α_pk r^k  (7)

with

α_pk = (-1)^{p+k} (p+k+1)! / [(p-k)! k! (k+1)!].  (8)

It can be verified that the set Q_p(r) is orthogonal over the range 0 <= r <= 1 [10]:

\int_0^1 Q_p(r) Q_m(r) r dr = a_p δ_pm,  (9)


Table 1
Explicit form of R_pq(r) when p is even or odd (p >= q and p - q even)

(0, 0): 1
(1, 1): r
(2, 0): 2r^2 - 1
(2, 2): r^2
(3, 1): 3r^3 - 2r
(3, 3): r^3
(4, 0): 6r^4 - 6r^2 + 1
(4, 2): 4r^4 - 3r^2
(4, 4): r^4
(5, 1): 10r^5 - 12r^3 + 3r
(5, 3): 5r^5 - 4r^3
(5, 5): r^5
(6, 0): 20r^6 - 30r^4 + 12r^2 - 1
(6, 2): 15r^6 - 20r^4 + 6r^2
(6, 4): 6r^6 - 5r^4
(6, 6): r^6
(7, 1): 35r^7 - 60r^5 + 30r^3 - 4r
(7, 3): 21r^7 - 30r^5 + 10r^3
(7, 5): 7r^7 - 6r^5
(7, 7): r^7
(8, 0): 70r^8 - 140r^6 + 90r^4 - 20r^2 + 1
(8, 2): 56r^8 - 105r^6 + 60r^4 - 10r^2
(8, 4): 28r^8 - 42r^6 + 15r^4
(8, 6): 8r^8 - 7r^6
(8, 8): r^8
(9, 1): 126r^9 - 280r^7 + 210r^5 - 60r^3 + 5r
(9, 3): 84r^9 - 168r^7 + 105r^5 - 20r^3
(9, 5): 36r^9 - 56r^7 + 21r^5
(9, 7): 9r^9 - 8r^7
(9, 9): r^9
(10, 0): 252r^10 - 630r^8 + 560r^6 - 210r^4 + 30r^2 - 1
(10, 2): 210r^10 - 504r^8 + 420r^6 - 140r^4 + 15r^2
(10, 4): 120r^10 - 252r^8 + 168r^6 - 35r^4
(10, 6): 45r^10 - 72r^8 + 28r^6
(10, 8): 10r^10 - 9r^8
(10, 10): r^10
(11, 1): 462r^11 - 1260r^9 + 1260r^7 - 560r^5 + 105r^3 - 6r
(11, 3): 330r^11 - 840r^9 + 756r^7 - 280r^5 + 35r^3
(11, 5): 165r^11 - 360r^9 + 252r^7 - 56r^5
(11, 7): 55r^11 - 90r^9 + 36r^7
(11, 9): 11r^11 - 10r^9
(11, 11): r^11
(12, 0): 924r^12 - 2772r^10 + 3150r^8 - 1680r^6 + 420r^4 - 42r^2 + 1
(12, 2): 792r^12 - 2310r^10 + 2520r^8 - 1260r^6 + 280r^4 - 21r^2
(12, 4): 495r^12 - 1320r^10 + 1260r^8 - 504r^6 + 70r^4
(12, 6): 220r^12 - 495r^10 + 360r^8 - 84r^6
(12, 8): 66r^12 - 110r^10 + 45r^8
(12, 10): 12r^12 - 11r^10
(12, 12): r^12

where δ_pm is the Kronecker delta, and a_p = 1/[2(p+1)] is a normalization constant, with r = 1 as the maximum radial size of the underlying character.

Then the (p, q) order OFMM function U_pq(r, θ) and the OFMMs O_pq can be defined in the polar coordinate system (r, θ) as

U_pq(r, θ) = Q_p(r) e^{jqθ},  (10)

O_pq = (1/(2π a_p)) \int_0^{2π} \int_0^1 f(r, θ) Q_p(r) e^{-jqθ} r dr dθ.  (11)

It follows from the above that the basis functions U_pq(r, θ) of the OFMMs are orthogonal over the interior of the unit circle.

The discrete versions of FMMs and OFMMs can be expressed in rectangular coordinates (x, y) as

F_pq = \sum_x \sum_y f(x, y) r^p e^{-jqθ} Δx Δy,  (12)

O_pq = ((p+1)/π) \sum_x \sum_y f(x, y) Q_p(r) e^{-jqθ} Δx Δy,  (13)

x^2 + y^2 <= 1, x = r cos θ, y = r sin θ.

The OFMMs are integrable when the degree p of Q_p(r) is p >= 0. By substituting Eqs. (6) and (7) into Eq. (11), we can express the OFMMs as linear combinations of FMMs:

O_pq = ((p+1)/π) \sum_{k=0}^{p} α_pk F_kq.  (14)

If the image is rotated through angle φ, the relationship between F'_pq and F_pq is

F'_pq = F_pq e^{-jqφ}  (15)

so that

O'_pq = O_pq e^{-jqφ}.  (16)

It therefore follows that |O_pq|, the magnitude of the OFMMs, is a rotation-invariant feature of the underlying image.

By substituting Eq. (12) into (4), we note that ZMs can also be expressed as linear combinations of Fourier–Mellin moments as

Z_pq = ((p+1)/π) \sum_k B_pqk F_kq,  (17)

where B_pqk is defined in Eq. (3).
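The coefficient formulas in Eqs. (3) and (8) and the orthogonality relation (9) can be checked numerically. The sketch below is our own illustration, not code from the paper; it reproduces the explicit forms in Table 1 and verifies Eq. (9) exactly using rational arithmetic.

```python
from fractions import Fraction
from math import factorial

def zernike_radial_coeffs(p, q):
    """{power k: B_pqk} for R_pq(r) from Eqs. (2)-(3); k = q, q+2, ..., p."""
    return {k: (-1) ** ((p - k) // 2) * factorial((p + k) // 2)
               // (factorial((p - k) // 2) * factorial((q + k) // 2)
                   * factorial((k - q) // 2))
            for k in range(q, p + 1, 2)}

def ofmm_radial_coeffs(p):
    """[alpha_p0, ..., alpha_pp] for Q_p(r) from Eqs. (7)-(8)."""
    return [Fraction((-1) ** (p + k) * factorial(p + k + 1),
                     factorial(p - k) * factorial(k) * factorial(k + 1))
            for k in range(p + 1)]

def radial_inner_product(a, b):
    """Exact int_0^1 (sum_i a_i r^i)(sum_j b_j r^j) r dr, term by term."""
    return sum(Fraction(ai) * Fraction(bj) * Fraction(1, i + j + 2)
               for i, ai in enumerate(a) for j, bj in enumerate(b))

# Table 1 entry: R_42(r) = 4r^4 - 3r^2
print(zernike_radial_coeffs(4, 2))                                 # {2: -3, 4: 4}
# Eq. (9): Q_2 and Q_3 are orthogonal; <Q_2, Q_2> = a_2 = 1/[2(2+1)]
q2, q3 = ofmm_radial_coeffs(2), ofmm_radial_coeffs(3)
print(radial_inner_product(q2, q3), radial_inner_product(q2, q2))  # 0 1/6
```

The same check passes for every entry of Table 1 and every pair (p, m) up to order 12.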


4. Image representation using OFMMs and ZMs

OFMMs have a single orthogonal set of radial polynomials Q_p(r) for all circular harmonic orders q, while ZMs have one orthogonal set of radial polynomials R_pq(r) for each different circular order q. Since Q_p(r) contains the natural powers 1, r, r^2, …, r^p, the equation Q_p(r) = 0 has p real and distinct roots in the interior of the unit circle [14], while the polynomials of Zernike moments, R_pq(r) = 0, have (p-q)/2 duplicated roots, apart from R_pq(0) = 0. Thus, for a given degree p and circular harmonic order q, Q_p(r) has p zeros, while R_pq(r) has (p-q)/2 zeros. It is known that the number of zeros of the radial polynomials corresponds to the capability of the polynomials to describe high spatial frequency components of the image. To have the same number p of zeros, the degree of R_pq(r) has to be as high as 2p + q, much higher than the degree p of the OFMMs. Therefore, the degree p of Q_p(r) in the OFMMs required to represent an image can be much lower than for a representation using ZMs. The higher the degree, the more sensitive are the independent moments to variation and noise. This gives an advantage to OFMMs over ZMs in the case that the characters have large variation and noise.

In addition, ZMs focus on the global features and catch less local information than OFMMs. As illustrated in Fig. 1, the zeros of Q_p(r) are nearly uniformly distributed over the unit circle, whereas the zeros of R_pq(r) are located in the region of large radial distance r from the origin. The first zero of Q_p(r), which is closest to the origin, is at r = 0.03, and the first zero of R_pq(r) is at r = 0.46. This difference is important, since in character recognition the character sizes are unknown a priori and the moments of characters of different sizes should be computed with the same basis functions. Hence, Q_p(r)-based OFMMs are more suitable for describing small images and recognizing handwritten characters which have fairly large size variation.

Notice that only approximately half of the OFMMs and ZMs for any order p will be independent. This is because O_{p,-q} = O*_pq and Z_{p,-q} = Z*_pq. In fact, the total number of independent OFMMs and ZMs is given by

N_OFMM = (p+1)^2 and N_ZM = (p+2)^2/4 (p even).

In order to evaluate how well OFMMs represent an image in comparison with ZMs, we reconstruct the original image using truncated expansions of a finite number of both OFMMs and ZMs. The reconstructed image from OFMMs and ZMs can be obtained by

f̂(r, θ) = \sum_{p=0}^{N} \sum_q O_pq Q_p(r) e^{jqθ},  (18)

f̂(r, θ) = \sum_{p=0}^{N} \sum_q Z_pq R_pq(r) e^{jqθ},  (19)

where N is the maximum order of moments used to reconstruct the image.

As an example, consider an image of character "B" with size 64×64. Figs. 2 and 3 show the 10 reconstructed images obtained using different orders of orthogonal Fourier–Mellin and Zernike polynomials corresponding to a maximum of 100 for the number of respective moments. To compare the reconstruction results for small characters, the original image of character "B" is subsampled to the size of 16×16. Figs. 4 and 5 show the 10 reconstructed images of this subsampled character using the same number of independent OFMMs and ZMs as in Figs. 2 and 3. Note that all the reconstructed images have been thresholded.

It is obvious that the OFMMs and ZMs have similar performance for larger-size characters. However, for the same character with size reduced to 16×16, the quality of the reconstructed images using OFMMs is clearly better than that obtained using the same number of ZMs. It can also be shown that OFMMs have better performance than ZMs in terms of noise sensitivity [10].

5. Determination of invariant features

Two primary goals in designing a recognition system are robustness and accuracy. The key to designing a good character recognition system is the choice of the features used to represent the input characters. If a good set of features can be found and extracted, the classification problem can be readily solved. Since in many cases the characters may be of different sizes and may be presented to the classifier at different orientations, it is desirable that these features be translation, rotation and scale invariant.

5.1. Rotation and translation invariance

As shown in previous sections, the magnitudes of the OFMMs and ZMs, |O_pq| and |Z_pq|, are rotation-invariant features of the underlying images. Translation invariancy is achieved by transforming the origin of the image to the geometric center before calculation of the moments. Given a two-dimensional M×M image f(x, y), the image geometric center (x̄, ȳ) can be obtained by

x̄ = m_10/m_00, ȳ = m_01/m_00,

where

m_pq = \sum_x \sum_y x^p y^q f(x, y).
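The totals N_OFMM = (p+1)^2 and N_ZM = (p+2)^2/4 (p even) given above can be checked by enumerating index pairs. This is our own illustration; in particular, we assume that the circular harmonic order of the OFMMs is also capped at the maximum order p, which is what makes the OFMM count come out to (p+1)^2.

```python
def independent_zm_count(p_max):
    """ZM pairs (p, q) with 0 <= q <= p <= p_max and p - q even
    (q < 0 only gives complex conjugates, hence no new features)."""
    return sum(1 for p in range(p_max + 1)
                 for q in range(p + 1) if (p - q) % 2 == 0)

def independent_ofmm_count(p_max):
    """OFMM pairs (p, q) with 0 <= p <= p_max and 0 <= q <= p_max;
    p and q carry no parity constraint, and q < 0 again gives conjugates."""
    return sum(1 for p in range(p_max + 1) for q in range(p_max + 1))

for p in (2, 4, 6, 8):
    assert independent_zm_count(p) == (p + 2) ** 2 // 4
    assert independent_ofmm_count(p) == (p + 1) ** 2
```

For the same maximum order, the OFMM feature set is therefore roughly four times larger, which is why a much lower maximum order suffices for the same number of independent moments.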


Fig. 1. (a) Q_p(r) of OFMMs; (b) R_pq(r) of ZMs.

5.2. Scale invariancy

In contrast with rotation and translation invariance, there are two methods for achieving scale invariance.

(1) The first approach to scale invariance is accomplished by enlarging or reducing each object such that the pixel coordinates of the object are mapped into the range of the unit circle. We take the centroid bounding (CB) circle for the object as this unit circle. Then fitting the object to a circle is a matter of finding the radius of a circle, centered on the centroid, which just includes the object. The centroid bounding circle can be obtained as follows:


(a) Find the bounding box and centroid (x_c, y_c) of the underlying image.
(b) Set r_max = 0.
(c) Inside the bounding box, perform the following steps for each line. For line y, find the left-most and right-most non-zero pixels x_1, x_2. If there are no pixels on the line, skip to the next line. Compute

x_i = x_i - x_c, i = 1, 2,
x = max[x_1, x_2] + 0.5,
y = y - y_c + 0.5,
r = sqrt(x^2 + y^2);

if r > r_max, set r_max = r. Continue with the next line.
(d) The maximum radius is r_max. This is used as the scaling factor.

(2) The second approach is Fourier–Mellin (FM) scaling, which achieves scale invariance for OFMMs and ZMs by normalizing the FMMs first [10]. As shown in Eqs. (14) and (17), both OFMMs and ZMs can be expressed as linear combinations of FMMs. When an image f(r, θ) is scaled by a factor k, its FMMs become

F'_pq = \int_0^{2π} \int_0^1 r^p f(r/k, θ) e^{-jqθ} r dr dθ = k^{p+2} F_pq,  (20)

where the F_pq are the Fourier–Mellin moments of the original image f(r, θ). Normalization by the dominant feature component F_p0 gives the scale-normalized FMMs as

F̄_pq = F_pq / F_p0.  (21)

Thus, after this normalization, both OFMMs and ZMs, computed based on the normalized FMMs, are also scale invariant. In addition, notice that F̄_p0 = 1 for all radial orders p, and this new zeroth circular-harmonic order feature should be omitted from the feature vector.

Fig. 2. Reconstructed images of character "B" with size 64×64 using OFMMs. From top left to bottom right are the original image and reconstructed images with OFMMs of N = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 corresponding to a total of 100 independent OFMMs.

Fig. 3. Reconstructed images of character "B" with size 64×64 using ZMs. From top left to bottom right are the original image and reconstructed images with ZMs of N = 0, 2, 4, 6, 8, 10, 12, 14, 16, 18 corresponding to a total of 100 independent ZMs.

Fig. 4. Reconstructed images of character "B" with size 16×16 using OFMMs.

Fig. 5. Reconstructed images of character "B" with size 16×16 using ZMs.
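Both scaling approaches can be sketched in code. The following is our own illustration (pure Python, binary images as nested 0/1 lists), not the authors' implementation: the first function follows the centroid bounding circle steps (a)-(d), taking the larger absolute horizontal distance on each line, and the other two implement a discrete version of Eq. (6) followed by the normalization of Eq. (21).

```python
import math

def centroid_bounding_radius(img):
    """Steps (a)-(d): radius of the smallest centroid-centered circle that
    encloses every object pixel; the 0.5 offsets allow for pixel extent."""
    pts = [(x, y) for y, row in enumerate(img)
                  for x, v in enumerate(row) if v]
    xc = sum(x for x, _ in pts) / len(pts)   # centroid (m10/m00, m01/m00)
    yc = sum(y for _, y in pts) / len(pts)
    r_max = 0.0
    for y in {yy for _, yy in pts}:          # scan each occupied line
        xs = [x for x, yy in pts if yy == y]
        dx = max(abs(min(xs) - xc), abs(max(xs) - xc)) + 0.5
        dy = abs(y - yc) + 0.5
        r_max = max(r_max, math.hypot(dx, dy))
    return r_max

def fmm(img, p, q):
    """Discrete FMM F_pq in the spirit of Eqs. (6)/(12), with the pixel grid
    mapped into the unit circle (pixels outside it are ignored)."""
    n = len(img)
    c = (n - 1) / 2.0
    total = 0j
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            if v:
                xr, yr = (x - c) / c, (y - c) / c
                r = math.hypot(xr, yr)
                if r <= 1:
                    theta = math.atan2(yr, xr)
                    total += v * r ** p * complex(math.cos(q * theta),
                                                  -math.sin(q * theta))
    return total

def scale_normalized_fmm(img, p, q):
    """Eq. (21): F-bar_pq = F_pq / F_p0, so F-bar_p0 = 1 for every p."""
    return fmm(img, p, q) / fmm(img, p, 0)
```

For a single-pixel object the CB radius is sqrt(0.5^2 + 0.5^2), and by construction scale_normalized_fmm(img, p, 0) is exactly 1, which is why that component is dropped from the feature vector.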


Generally, we have to keep in mind that scale invariancy is approximate, since the image function is digital, and thus a digital image at a larger scale has more sample grid points than a smaller one.

The limitation of the first method is that if the object is very small, the centroid bounding circle process, which normalizes the small image into the unit circle occupied by the large-size images, will cause considerable distortion of the original image and therefore of its extracted features. The accuracy of the second approach, on the other hand, is determined by the denominator F_p0. If the characters have large variability, this scaling method can cause even more distortion than the rescaling used in the first method. This is especially true for handwritten characters. Therefore, each method has a different impact on recognition accuracy and should be used in the appropriate situation. In this paper, the centroid bounding circle scaling and the scaled FMM scaling are studied and compared.

6. Experimental study

6.1. Experimental database

In our experiments, we first select the same database as used in Ref. [9] to compare our results. This database consists of 64×64 binary images of all 26 printed upper-case English characters. The set of 24 images per character consists of six images with arbitrarily varying scales, orientations, and translations from each of the four considered silhouettes per character.
The database of small characters is obtained by decomposing the original images in this database into size 32×32 using wavelet multiresolution analysis.

In order to compare the performance of both OFMMs and ZMs in the situation where the same character has large variation, we also use the popular database of handprinted characters provided by NIST [11]. This database consists of 119,740 images of handprinted digits, 24,205 lower-case letters, and 24,420 upper-case letters. For convenience, only upper-case letters are used in our study.

Generally, the first database of printed characters does not have much writing-style variation, while the NIST database, consisting of handwritten characters collected from 1000 different writers, shows considerable variability. However, unlike the first database, little rotation is involved in the NIST database except for some small slant variation.

6.2. Classifiers

The multilayer perceptron (MLP) classifier [15], in which the number of nodes in the hidden layer was allowed to vary, was used to classify the testing samples in the first database. The classifiers used for the NIST database include the nearest neighbour (NN) [16], the minimum-mean-distance (MMD) [16] and the MLP. Based on the recommendation from NIST, the probabilistic neural network (PNN) classifier [17] was also selected in the experiments for the NIST database.

In order to prevent one or a subgroup of feature vector components from dominating the distance measure, the features were normalized by subtracting the sample mean and dividing by the standard deviation of each feature from the corresponding class.

6.3. Pre-processing

To minimize the negative effect of character strokes on the final classification, a skeletonization or thinning algorithm is used for the first database.
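The per-class feature normalization described at the end of Section 6.2 can be sketched as follows. This is our own illustration; the function names and the population-variance convention are assumptions, since the paper gives no implementation details.

```python
from collections import defaultdict

def per_class_zscore(train_vectors, train_labels):
    """Compute per-class, per-component sample mean and standard deviation,
    and return a normalizer so that no subgroup of feature components
    dominates the distance measure."""
    by_class = defaultdict(list)
    for vec, lab in zip(train_vectors, train_labels):
        by_class[lab].append(vec)
    stats = {}
    for lab, vecs in by_class.items():
        n, d = len(vecs), len(vecs[0])
        mean = [sum(v[i] for v in vecs) / n for i in range(d)]
        std = [(sum((v[i] - mean[i]) ** 2 for v in vecs) / n) ** 0.5 or 1.0
               for i in range(d)]          # guard: constant feature -> 1.0
        stats[lab] = (mean, std)
    def normalize(vec, lab):
        mean, std = stats[lab]
        return [(x - m) / s for x, m, s in zip(vec, mean, std)]
    return normalize

norm = per_class_zscore([[0.0, 10.0], [2.0, 30.0]], ["A", "A"])
print(norm([2.0, 30.0], "A"))   # [1.0, 1.0]
```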
The pre-processing for the characters from the NIST database includes the size, stroke and slant normalization recommended by NIST [11]. Size normalization bounds the character data within a segmented image by a box, and that box is scaled to fit exactly within a 20×32 pixel region. A simple morphological operator is applied to normalize the character stroke. If the pixel content of a character image is significantly high, then the image is eroded and the strokes are thinned. If the pixel content of a character image is significantly low, then the image is dilated and the strokes are widened.

Slant normalization is achieved by shifting the rows in the image either left or right to straighten the character. Given a segmented character image, the top and bottom rows containing character pixels are located. The leftmost character pixel is located in each of the two rows, and a linear shifting function is calculated to shift the rows in the image so that, when finished, the leftmost pixels in the top and bottom rows line up in the same column. The rows between the top and bottom are shifted in lesser amounts based on the linear shifting function [11]. A slope factor f defining the linear shifting function is calculated as

f = (t_1 - b_1)/(b - t),

where t is the vertical position of the top row, b is the vertical position of the bottom row, t_1 is the horizontal position of the leftmost character pixel in the top row, and b_1 is the horizontal position of the leftmost character pixel in the bottom row.
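A sketch of this slant normalization in our own code: each row r is sheared by the shift coefficient s = (r - m)f defined next in the text, rounded to the nearest pixel, which lines up the leftmost pixels of the top and bottom occupied rows.

```python
def slant_normalize(img):
    """Shear rows of a binary image (nested 0/1 lists) so the leftmost object
    pixels of the top and bottom occupied rows end up in the same column.
    Slope factor f = (t1 - b1)/(b - t); row r is shifted by round((r - m) f),
    with m the vertical middle, so the shear is centered on the image."""
    rows = [y for y, row in enumerate(img) if any(row)]
    t, b = rows[0], rows[-1]
    t1, b1 = img[t].index(1), img[b].index(1)
    if b == t:                        # single-row object: nothing to straighten
        return [row[:] for row in img]
    f = (t1 - b1) / (b - t)
    m = len(img) / 2.0
    w = len(img[0])
    out = [[0] * w for _ in img]
    for y, row in enumerate(img):
        s = round((y - m) * f)        # positive: shift right; negative: left
        for x, v in enumerate(row):
            if v and 0 <= x + s < w:
                out[y][x + s] = v
    return out
```

Pixels pushed outside the image are clipped, and the exact rounding convention is an assumption on our part; the code follows the NIST description in spirit rather than reproducing their reference implementation.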
The slope factor is used to compute a shift coefficient as follows:

s = (r - m) f,

with r being a vertical row index in the image and m equal to the vertical middle of the image. This causes the shifting to be centered about the middle of the image. A positive value of the shift coefficient causes a row to be shifted s pixel positions to the right, and a negative value causes a row to be shifted s pixels to the left.

6.4. Analysis of results

6.4.1. Results based on the first database

Two sets of studies are carried out using a training set that contains 16 skeletonized character images of size 64×64 out of the 24 samples of all scales for each class.

The first study uses the remaining 8 skeletonized samples of the same size, 64×64, for testing. Fig. 6 illustrates the classification accuracy as the number of selected features increases from four to the full set of respective moment features when the multilayer perceptron (MLP) classifier is used. It can be seen that the classification performance using OFMMs as features is comparable to ZM features in this case, no matter what kind of scaling technique is used. This is because all trained and tested images here are of large size, 64×64, and the OFMMs and the ZMs have similar performance for describing larger-size characters.

The second study considers the case when small images of size 32×32 are used for testing. Fig. 7 demonstrates the classification accuracy under similar situations corresponding to the first case. It is seen that using OFMMs with FM scaling in this case achieves much better performance than ZMs, in that OFMMs contain some local information and can represent small images better than ZMs, as discussed previously. However, notice that both OFMMs and ZMs with CB scaling give higher misclassification than FM scaling in this case, because the skeletonized small character images have very few pixels, and CB scaling, which normalizes the small image into the unit circle, causes more distortion of the original images, and therefore their extracted features, than FM scaling.

The performance of the MLP classifier under the two studies is illustrated in Figs. 8 and 9 as the number of hidden nodes increases from 8 to 100. As can be seen, the best performance is obtained with the number of hidden nodes at around 20 for both OFMMs and ZMs. Notice that OFMMs with FM scaling perform the best in this case when small-size images are used for testing.

6.4.2. Results based on the NIST database

The characters in the NIST database have very large variation. The same characters printed by the same writer can vary greatly in size, slant and stroke. Therefore, we design two cases to investigate the performance of OFMM and ZM features using different scaling methods. In our experiments, 15,600 upper-case images are used for training and 6500 images are used for testing. In the first case, all training and testing samples are pre-processed for size, slant and stroke normalization, while in the second case, training and testing samples are only pre-processed for stroke normalization. Table 2 gives the results of different combinations of classifiers, moment features and scaling methods under these two situations.

Fig. 6. Classification result comparison between OFMMs and ZMs using both CB and FM scaling to achieve scale invariance. Sixteen samples per character with size 64×64 are used for training, and the remaining eight large character samples are used for testing.


Fig. 7. Classification result comparison between OFMMs and ZMs using both CB and FM scaling to achieve scale invariance. Sixteen samples per character with size 64×64 are used for training, and the smaller images obtained from the remaining eight larger character samples are used for testing.

Fig. 8. Classification result comparison between OFMMs and ZMs using both CB and FM scaling to achieve scale invariance as the number of hidden nodes of the MLP increases. Sixteen samples per character with size 64×64 are used for training, and the remaining eight large character samples are used for testing.


152 C. Kan, M.D. Srinath / Pattern Recognition 35 (2002) 143}154Fig. 9. Classi"cation result comparison between OFMMs <strong>and</strong> ZMs using both CB <strong>and</strong> FM scaling to achieve scale invariance as thenumber of hidden nodes of MLP increases. Sixteen samples per <strong>character</strong> <strong>with</strong> size 6464 are used for training, <strong>and</strong> the smaller imagesobtained from the remaining eight larger <strong>character</strong> samples are used for testing.Table 2Classi"cation accuracy under di!erent classi"ers in the two cases for NIST database CB"centroid bounding circle scaling,FM"scaled <strong>Fourier</strong>}Mellin moment scalingClassi"er/ ZMs OFMMs ZMs OFMMsfeatures CB scaling CB scaling FM scaling FM scaling(%) (%) (%) (%)MMD 64.3 62.9 42.5 48.3Case 1 NN 79.2 77.8 56.7 61.1MLP 73.1 71.7 54.6 59.3PNN 82.1 81.3 51.4 56.5MMD 53.2 58.1 36.4 41.3Case 2 NN 63.2 67.1 46.2 51.5MLP 61.8 65.7 45.1 49.5PNN 67.7 72.9 48.6 53.4It can be seen that when training <strong>and</strong> testing samplesare size, slant <strong>and</strong> stroke normalized, the classi"cationperformance under CB scaling using OFMM features iscomparable to ZM features. However, ZM features,which focus on the global features, achieve better accuracythan OFMMs in this case. This is because theadvantage of OFMMs for small images is that OFMMsbetter describe these small images when they are relativelysmall <strong>with</strong> respect to other images <strong>with</strong>in the database<strong>and</strong> they are normalized using the large image size. Here,all images have the same size <strong>and</strong> OFMMs lose theiradvantage. 
In addition, the large variability of the original samples is greatly reduced after size and slant normalization, so the sensitivity of the high-order ZMs to variation has less impact on the classification accuracy.

However, OFMM features under CB scaling give the best overall results when the training and testing samples are only stroke normalized, because in this situation the size is not normalized and OFMM features can describe


small images more accurately. In addition, the large variation of the training and testing samples in this case makes centroid-bounding circle scaling better than scaled FMM scaling in general. Also notice that the pre-processing for size and slant normalization does improve the overall classification performance for both ZM and OFMM features.

7. Conclusions

We have shown that orthogonal Fourier–Mellin moments (OFMMs) are based on radial polynomials that contain more local information about the underlying character images and can be used to represent small characters in the same way as large characters, while Zernike moments are based on polynomials that capture more global information but describe small characters less accurately. In addition, since the order of independent OFMMs required to represent an image is much lower than that of ZMs, the OFMMs are much less sensitive to character variations and noise than ZMs.

The comparison of their properties and classification results, as well as image reconstruction using a finite number of truncated moments, shows that OFMMs describe small images more accurately and perform better than ZMs in the classification of handwritten characters when large-size character samples are used for training and relatively smaller characters are used for testing. In addition, for a larger character image testing set with less variation, OFMMs achieve performance very comparable to Zernike moments.

Extensive experimentation also shows that scaling methodologies have a large impact on the final
classification. When images show large variability, the new method of combining OFMMs with centroid-bounding circle scaling behaves better than the FM scaling used in Ref. [10], because the new method causes less distortion when the moments are computed and normalized. However, when characters with less variability are skeletonized, especially small-size characters, the FM scaling performs better, since re-scaling the fewer pixels in the skeletonized small images introduces considerable computation noise into the respective moment features.

In terms of overall performance on image reconstruction and classification accuracy, orthogonal Fourier–Mellin moments prove to be better suited for character recognition systems that must recognize both large and small character images.

8. Summary

The performance of pattern recognition systems depends on the specific features used to represent a pattern. The selected feature sets must possess small intraclass variability and large interclass separation.
Classification of a pattern regardless of its orientation, size and location in the field of view requires that the selected features be invariant with respect to rotation, scale and translation. In recent years, the use of geometric moments has been proposed for invariant recognition of alphanumeric characters. The use of geometric moments for characterization of two-dimensional images was first introduced by Hu, who defined a class of moment invariants derived from the geometric moments that are invariant to translational shifts, changes of scale and rotations of the image.

While an image is uniquely determined by its geometric moments of all orders, low-order moments contain less information about image detail, while high-order moments are vulnerable to noise. The use of orthogonal moments makes it possible to describe an image with a finite number of moments and benefit from the inclusion of high-order moments. Teh and Chin evaluated various types of image moments, including moment invariants, Legendre moments, Zernike moments, pseudo-Zernike moments, Fourier–Mellin moments and complex moments, in terms of noise sensitivity, information redundancy and capability of image description. Bailey and Srinath investigated invariant character recognition using Legendre moments, Zernike moments, and pseudo-Zernike moments with different classifiers. They found that either Zernike moments or pseudo-Zernike moments have the best overall performance. Khotanzad and Hong have shown that a neural network classifier using Zernike moments has very strong class separability power.
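To make Hu's construction concrete, the following numpy sketch (our own illustration, not code from any of the cited works) computes the first invariant, phi1 = eta20 + eta02, from scale-normalized central moments; it is exactly invariant to translation and, up to discretization effects, to scale and rotation.

```python
import numpy as np

def hu_phi1(img):
    """First Hu moment invariant phi1 = eta20 + eta02, built from
    central moments normalized for scale."""
    img = img.astype(float)
    ys, xs = np.indices(img.shape)
    m00 = img.sum()                     # zeroth-order geometric moment
    cx = (xs * img).sum() / m00         # centroid x
    cy = (ys * img).sum() / m00         # centroid y
    # central moment mu_pq = sum over (x, y) of (x-cx)^p (y-cy)^q f(x, y)
    mu = lambda p, q: ((xs - cx) ** p * (ys - cy) ** q * img).sum()
    # scale-normalized moment eta_pq = mu_pq / m00^((p+q)/2 + 1)
    eta = lambda p, q: mu(p, q) / m00 ** ((p + q) / 2 + 1)
    return eta(2, 0) + eta(0, 2)
```

Since only central moments enter, translating the character inside the frame leaves phi1 unchanged, which is the translation invariance referred to above.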
Zernike moments are orthogonal and rotation invariant, but when they are used for scale-invariant pattern recognition, ZMs have difficulty describing images of small size. In 1994, Sheng and Shen showed that the orthogonal Fourier–Mellin moments (OFMMs), which can be thought of as generalized Zernike moments or orthogonalized complex moments, are better able to describe images of small size in terms of image reconstruction errors and signal-to-noise ratios. In addition, the order of independent OFMMs required to represent an image is much lower than that of ZMs, so that OFMMs can be more robust than ZMs when the characters have large variability.

In this paper, we consider the use of OFMMs for invariant classification of alphanumeric characters chosen from two different image databases. Simulation results, using two different classifiers, for classification of the images in these two databases using both ZMs and OFMMs are presented and compared. The effects of different scaling methods for achieving scale invariance are studied, and a new methodology combining OFMMs and centroid bounding circle scaling is introduced. The simulation results show that orthogonal Fourier–Mellin moments perform better than ZMs for


images of small size, particularly when images exhibit large intraclass variability.

References

[1] M. Fang, G. Hausler, Class of transforms invariant under shift, rotation, and scaling, Appl. Opt. 29 (1990) 704–708.
[2] T.H. Reiss, The revised fundamental theorem of moment invariants, IEEE Trans. Pattern Anal. Mach. Intell. 13 (1991) 830–833.
[3] L. Shen, Y. Sheng, Noncentral image moments for invariant pattern recognition, Opt. Eng. 34 (1995) 3181–3186.
[4] M. Gruhen, K.Y. Hsu, Moment-based image normalization with high noise tolerance, IEEE Trans. Pattern Anal. Mach. Intell. 19 (1997) 136–139.
[5] D.G. Shen, H.S. Horace, Generalized affine invariant image normalization, IEEE Trans. Pattern Anal. Mach. Intell. 19 (1997) 431–440.
[6] M.K. Hu, Visual pattern recognition by moment invariants, IRE Trans. Inform. Theory (1962) 179–187.
[7] C.H. Teh, R.T. Chin, On image analysis by the methods of moments, IEEE Trans. Pattern Anal. Mach. Intell. 10 (1988) 496–513.
[8] R.R. Bailey, M.D. Srinath, Orthogonal moment features for use with parametric and non-parametric classifiers, IEEE Trans. Pattern Anal. Mach. Intell. 18 (1996).
[9] A. Khotanzad, Y.H. Hong, Classification of invariant image representations using a neural network, IEEE Trans. Acoust. Speech Signal Process. 38 (1990) 1028–1038.
[10] Y. Sheng, L. Shen, Orthogonal Fourier–Mellin moments for invariant pattern recognition, J. Opt. Soc. Am. 11 (1994) 1748–1757.
[11] M.D. Garris, J.L. Blue, NIST Form-Based Handprint Recognition System, National Institute of Standards and Technology, 1995.
[12] M. Teague, Image analysis via the general theory of moments, J. Opt. Soc. Am. 70 (1980) 920–930.
[13] A.B. Bhatia, E. Wolf, On the circular polynomials of Zernike and related orthogonal sets, Proc. Cambridge Philos. Soc. 50 (1954) 40–48.
[14] D. Jackson, Fourier Series and Orthogonal Polynomials, Mathematical Association of America, 1941, Chapters VII and XI.
[15] D.E. Rumelhart, J.L. McClelland, Parallel Distributed Processing, Vol. 1: Foundations, MIT Press, Cambridge, MA, 1986.
[16] K. Fukunaga, Introduction to Statistical Pattern Recognition, 2nd Edition, Academic Press, New York, 1990.
[17] D.F. Specht, Probabilistic neural networks, Neural Networks 3 (1990) 109–118.

About the Author: CHAO KAN received the B.S. and M.S. degrees in Electrical Engineering from Nanjing University of Posts and Telecommunications, Nanjing, China, in 1988 and 1991, respectively. He obtained a Ph.D. in Electrical Engineering from Southern Methodist University, Dallas, TX, in 2000.
He joined the Research and Development group at Raytheon Systems Company in 1995, where he worked as a research engineer. He has been with Alcatel USA since 1999 as a research scientist in the Corporate Research Center. His current research interests include computer vision, image processing, wavelets, multimedia systems and Internet network management.

About the Author: MANDYAM D. SRINATH received the B.Sc. degree from the University of Mysore, India, in 1954, the Diploma in Electrical Technology from the Indian Institute of Science, Bangalore, India, in 1957, and the M.S. and Ph.D. degrees in Electrical Engineering from the University of Illinois, Urbana, in 1959 and 1962, respectively.
He has been on the Electrical Engineering faculties at the University of Kansas, Lawrence, and the Indian Institute of Science, Bangalore, India.
He is currently Professor of Electrical Engineering at Southern Methodist University, Dallas, TX, where he has been since 1967.
He has published numerous papers in signal and image processing, control and estimation theory. He is principal author of the book "Introduction to Statistical Signal Processing with Applications" (with P.K. Rajasekaran and R. Viswanathan), Prentice-Hall, 1996, and co-author of "Continuous and Discrete Signals and Systems" (with S. Soliman), Prentice-Hall, 1990, 1998. His current research interests are in adaptive filters, image processing, and video data compression.
Dr. Srinath is a Senior Member of the Institute of Electrical and Electronics Engineers and a Registered Professional Engineer in Texas. He is an Associate Editor of Pattern Recognition.
