Optical Character Recognition of Amharic Documents

$y_n$ and $S_n$ are determined in the following manner:

$$S_n = S_{n-1} - y_n^{T} S_{n-1}^{-1} y_n \qquad (8)$$

Since the principal purpose of classification is discrimination, the discriminant vector subspace offers considerable potential as a feature extraction transformation. Hence, the use of linear discriminant features for each pair of classes in each component classification is an attractive way to design better classifiers.

However, the problem with this approach is that classification becomes slower, because LDA extracts features for each pair of classes. This requires storage for $N(N-1)/2$ transformation matrices. For instance, for a dataset with 200 classes, LDA generates around 20,000 transformation matrices (one per pair of classes), which is very expensive for large-class problems. Hence, we propose a two-stage feature extraction scheme, PCA followed by LDA, for optimal discriminant feature extraction. We first reduce the feature dimension using PCA and then run LDA on the reduced lower-dimensional space to extract the most discriminant feature vector (a sketch of this pipeline is given at the end of this section). This greatly reduces the storage and computational complexity, while enhancing the performance of the SVM-based decision directed acyclic graph (DDAG) classifier.

V. CLASSIFICATION

Training and testing are the two basic phases of any pattern classification problem. During the training phase, the classifier learns the association between samples and their labels from labeled samples. The testing phase involves analysis of the errors made in classifying unlabeled samples, in order to evaluate the classifier's performance. In general, it is desirable to have a classifier with minimal test error.

We use the Support Vector Machine (SVM) for the classification task [19]. SVMs are pairwise discriminating classifiers with the ability to identify the decision boundary with maximal margin. A maximal margin results in better generalization [20], which is a highly desirable property for a classifier to perform well on novel data. Support vector machines are less complex and perform better (lower actual error) with limited training data.

Identification of the optimal hyperplane for separation involves maximization of an appropriate objective function. The result of the training phase is the identification of a set of labeled support vectors and a set of coefficients $\alpha_i$. Support vectors are the samples that lie near the decision boundary, and their class labels $y_i$ take the values $\pm 1$. The decision is made from:

$$f(x) = \operatorname{sgn}\!\left(\sum_{i=1}^{l} \alpha_i y_i K(x_i, x)\right) \qquad (9)$$

where $K$ is the kernel function, defined by:

$$K(x, y) = \phi(x) \cdot \phi(y) \qquad (10)$$

where $\phi : \mathbb{R}^d \rightarrow H$ maps the data points from the lower-dimensional input space to a higher-dimensional space $H$.

Binary classifiers like SVM are basically designed for two-class problems. However, because of the large number of characters in any script, the optical character recognition problem is inherently multi-class in nature. The field of binary classification is mature, and multi-class classification is therefore usually carried out by combining many pairwise binary classifiers, as in the DDAG scheme described above (a sketch follows at the end of this section).
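The following is a minimal sketch of the two-stage PCA-followed-by-LDA feature extraction proposed above, written with scikit-learn. The data, image size, class count, and the choice of 50 principal components are illustrative assumptions, not values from the paper.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import Pipeline

# Hypothetical data: 1000 flattened 20x20 character images, 10 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 400))
y = rng.integers(0, 10, size=1000)

# Stage 1 (PCA) reduces the raw dimension; stage 2 (LDA) then extracts
# at most C-1 discriminant features in the reduced space, avoiding the
# N(N-1)/2 pairwise transformation matrices discussed above.
two_stage = Pipeline([
    ("pca", PCA(n_components=50)),
    ("lda", LinearDiscriminantAnalysis(n_components=9)),
])
features = two_stage.fit_transform(X, y)
print(features.shape)  # (1000, 9)
```

Running LDA once on the PCA-reduced space stores a single projection rather than one matrix per class pair, which is the storage saving the text describes.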
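Equation (9) can also be checked numerically against a trained SVM. The sketch below fits scikit-learn's SVC on synthetic two-class data and recomputes the decision by hand; note that SVC stores the products $\alpha_i y_i$ in `dual_coef_` and also adds a bias term $b$, which Eq. (9) omits.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

# Synthetic, linearly separable two-class data with labels in {-1, +1}.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)

clf = SVC(kernel="rbf", gamma=1.0).fit(X, y)

# Eq. (9): f(x) = sgn(sum_i alpha_i * y_i * K(x_i, x)), evaluated over
# the support vectors x_i (plus SVC's bias term b).
x_new = np.array([[0.5, -0.2]])
K = rbf_kernel(clf.support_vectors_, x_new, gamma=1.0)  # K(x_i, x)
f = np.sign(clf.dual_coef_ @ K + clf.intercept_)        # dual_coef_ holds alpha_i * y_i
print(f.ravel(), clf.predict(x_new))  # the two decisions agree
```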
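Finally, a sketch (on assumed toy data, not the paper's implementation) of how a DDAG combines the $N(N-1)/2$ pairwise SVMs into a multi-class decision: each node tests one pair of candidate classes and eliminates the loser, so only $N-1$ classifiers are evaluated per sample.

```python
from itertools import combinations
import numpy as np
from sklearn.svm import SVC

# Toy data: 4 well-separated Gaussian classes in 2-D.
rng = np.random.default_rng(2)
n_classes = 4
means = rng.normal(scale=5.0, size=(n_classes, 2))
y = np.repeat(np.arange(n_classes), 100)
X = means[y] + rng.normal(size=(400, 2))

# Train one binary SVM per pair of classes: N(N-1)/2 = 6 classifiers.
pairwise = {}
for a, b in combinations(range(n_classes), 2):
    mask = (y == a) | (y == b)
    pairwise[(a, b)] = SVC(kernel="linear").fit(X[mask], y[mask])

def ddag_predict(x):
    """Walk the DDAG: each pairwise SVM eliminates one candidate class
    until a single class remains (N-1 evaluations in total)."""
    candidates = list(range(n_classes))
    while len(candidates) > 1:
        a, b = candidates[0], candidates[-1]
        winner = pairwise[(min(a, b), max(a, b))].predict(x.reshape(1, -1))[0]
        candidates.remove(b if winner == a else a)
    return candidates[0]

print(ddag_predict(means[2]))  # expected to recover class 2
```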
