26.04.2013 Views

Handwritten Word Spotting in Old Manuscript Images using Shape ...


The models can then be used to retrieve unlabelled images of handwritten documents given a text query.

Handwriting recognition of large vocabularies in historical documents is still a very challenging task. Nagy [15] discusses the papers on document analysis published in PAMI during the last 20 years.

A word can be represented with different kinds of features. A feature is a measurement of the object under study; it reduces the characteristics of the image to a few that preserve the main information in a more manageable size. There are three types of features: quantitative (numeric) features, qualitative (symbolic) features and structured features. Quantitative features can be discrete values (e.g. weight, the number of computers) or interval values (e.g. the duration of an event). Qualitative features can be nominal or unordered (e.g. colour) or ordinal (e.g. sound intensity: “quiet” or “loud”). Structured features represent relational and/or hierarchical attributes among a set of primitive patterns (e.g. a parent node can be a generalization of children labelled “cars”, “trucks” and “motorbikes”) [28].

There are different ways to match words, depending on the kind of features used. For example, words can be matched directly by computing a distance such as XOR, Euclidean Distance Mapping (EDM), Sum of Squared Differences (SSD), SLH or the Hausdorff distance. The problem with these methods is that they are very sensitive to spatial variation.
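
The sensitivity to spatial variation can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names `xor_distance`, `ssd` and `shift_right` and the toy 4x6 binary image are assumptions made only for this example. The same two-stroke "word" is compared against a copy shifted by one pixel.

```python
def xor_distance(a, b):
    """Count the pixels that differ between two equally sized binary images."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def ssd(a, b):
    """Sum of squared pixel differences between two equally sized images."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def shift_right(img, k=1):
    """Shift every row k >= 1 pixels to the right, padding with background (0)."""
    return [[0] * k + row[:-k] for row in img]


# Toy binary word image (hypothetical data): two vertical strokes, 8 ink pixels.
word = [
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
]
shifted = shift_right(word, 1)

print(xor_distance(word, word))     # identical images
print(xor_distance(word, shifted))  # same shape, moved one pixel to the right
print(ssd(word, shifted))
```

In this toy case a one-pixel shift makes every ink pixel of both images count as a mismatch, even though the shape itself is unchanged. This is the kind of behaviour that motivates more elastic matching schemes.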

One of the most widely used feature comparison algorithms in handwriting recognition is Dynamic Time Warping (DTW) [19; 9]. DTW is an algorithm for measuring the similarity between two sequences which may vary in time or speed. It has been widely used in the speech processing, bioinformatics and on-line handwriting communities to match 1-D signals. Even though image features are in general 2-dimensional, they can be recast in 1 dimension, at the risk of losing the association between the column features of the images. The DTW algorithm tries to minimize the variation between the feature vectors. In general, it is a method that allows a computer to find an optimal match between two given sequences.
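
A minimal sketch of the idea, using the classic dynamic programming formulation of DTW, is given below. It is an illustration under stated assumptions rather than the method of [19; 9]: the names `column_profile` and `dtw_distance`, the absolute difference as local cost, and the sample sequences are all choices made for this example. The column profile reduces each image column to a single ink count, which is one simple way of recasting a 2-D image as a 1-D signal and also shows where the association between column features is lost.

```python
def column_profile(img):
    """Recast a 2-D binary image as a 1-D sequence: ink pixels per column."""
    return [sum(col) for col in zip(*img)]


def dtw_distance(s, t):
    """Cost of the cheapest monotonic alignment between sequences s and t."""
    n, m = len(s), len(t)
    inf = float("inf")
    # cost[i][j] = best cost of aligning the first i items of s with the first j of t
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(s[i - 1] - t[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # s advances, t[j-1] is reused
                cost[i][j - 1],      # t advances, s[i-1] is reused
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# Hypothetical column profiles: the second is a horizontally stretched first.
narrow = [0, 4, 0, 0, 4, 0]
wide = [0, 0, 4, 4, 0, 0, 0, 4, 0]

print(dtw_distance(narrow, wide))        # stretched copy of the same profile
print(dtw_distance([0, 0], [1, 1]))      # genuinely different sequences
print(column_profile([[0, 1], [1, 1]]))  # ink count of each column
```

Because the warping path may repeat elements of either sequence, the stretched profile aligns with the narrow one at zero cost, whereas a pixel-wise comparison could not even be applied to sequences of different lengths.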

In holistic approaches the word image is not segmented into smaller parts but is considered as a whole shape [3]. Thus, recognition is usually performed by a shape matching algorithm in terms of the features computed at some key points of interest. A comparative study of a number of interest point detectors is presented in [25]. For example, corners can be detected with the Harris detector [23], but a drawback of this detector is its sensitivity to noise.

Cohesive Elastic Matching [12] is based on zoning and can be applied to the whole text image, so it is not necessary to segment the text into words. It is a good method for comparing zones of interest (ZOI), and the algorithm is independent of the ZOI extraction method.

Hidden Markov Models (HMM) are sometimes used in word spotting [1] to match words in documents, but they are usually applied to documents with a reduced vocabulary and need a considerable learning stage.

4. Choosing the number of clusters

There are different methods in the literature to choose the number of clusters. They can be classified into two big groups depending on how the number of clusters is chosen. The first one is a manual
