13.05.2014 Views

Urdu OCR - PAN Localization

Urdu OCR - PAN Localization

Urdu OCR - PAN Localization

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

Training Data Population<br />

• Class based incremental training<br />

– Generating combinations of characters<br />

– Adding more classes to cover more ligatures<br />

• Ligatures for training were taken out of 46K<br />

words clean corpus<br />

• 13,628 unique ligatures<br />

• Only 3,184 ligatures are frequent<br />

• Used 30 samples per ligature<br />

14

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!