01.03.2013 Views

word boundary- hypothesisation in hindi speech - Speech and ...


express the performance, (i) the number of alternate word sequences produced for a given input phoneme sequence, and (ii) the time spent for matching. The study was done in two parts. The first part covers the case where the input phoneme sequence contained no errors, so only exact matching was used in the dictionary match. The second part examines the performance of the lexical match when the input was assumed to contain errors of the kind likely in speech signal-to-symbol conversion, so approximate string matching was used in the lexical match.
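The first performance measure, the number of alternate word sequences for an error-free phoneme sequence, can be illustrated with a small counting routine. This is only a sketch under stated assumptions, not the lexical analyser used in the study: the function name `count_word_sequences`, the representation of the lexicon as a set of phoneme tuples, and the toy phoneme symbols are all hypothetical.

```python
def count_word_sequences(phonemes, lexicon):
    """Count the ways `phonemes` can be split into lexicon entries.

    `lexicon` is assumed to be a set of tuples of phoneme symbols.
    Only exact matches against the lexicon are accepted.
    """
    n = len(phonemes)
    max_len = max((len(entry) for entry in lexicon), default=0)
    # ways[i] = number of word sequences that exactly cover phonemes[:i]
    ways = [0] * (n + 1)
    ways[0] = 1  # the empty prefix has one (empty) segmentation
    for end in range(1, n + 1):
        # Only look back as far as the longest lexicon entry.
        for start in range(max(0, end - max_len), end):
            if ways[start] and tuple(phonemes[start:end]) in lexicon:
                ways[end] += ways[start]
    return ways[n]


# Hypothetical toy lexicon; each letter stands for one phoneme symbol.
toy_lexicon = {("a",), ("b",), ("a", "b"), ("b", "a")}

# "a b a" can be read as a|b|a, ab|a, or a|ba.
print(count_word_sequences(list("aba"), toy_lexicon))  # → 3
```

Because every word boundary is a free choice wherever the lexicon allows it, the count can grow quickly with the length of the input, which is the effect the results below quantify.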

3.3.1 Results of lexical analysis with exact matching

The lexical analyser program was run with an input text containing 100 sentences. The sentences were of varying lengths and on average contained 12 to 13 words. The results of the lexical match were ordered by the number of alternative word strings and are shown in Table - 3.1.

From the results it can be observed that a large number of sentences (64 out of 100) had fewer than 10 alternate word sequences. Only a small fraction (3 out of 100) had 1000 or more alternate word strings. The average number of alternate word strings for a sentence was about 120. Three sentences had only a single word string matching them. The highest number of alternate word strings matching any sentence was 2448.

These results were also used to study the effect of the length of the sentence (both in terms of number of words and number of phonemes) on the number of word sequences matching the sentence. Table - 3.2(a) and Table - 3.2(b) show the results of this study. It can be observed that in general longer sentences have more alternate word strings matching them. But length alone does not determine the number of alternatives, since sentences of the same length still had widely different numbers of matching word sequences. For example, in our data three sentences contained 24 words, but the numbers of word sequences for them were 24, 288 and 2488.
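The spread among sentences of equal length can be made concrete with a small grouping step. The sketch below is illustrative only: the function name `spread_by_length` and the `(length, count)` data layout are assumptions, and the only data used are the three 24-word sentences quoted above, reading their reported counts as 24, 288 and 2488.

```python
from collections import defaultdict


def spread_by_length(results):
    """Group (sentence_length, n_word_sequences) pairs by length.

    Returns {length: (smallest_count, largest_count)} so that the
    variation among sentences of the same length is visible.
    """
    groups = defaultdict(list)
    for length, count in results:
        groups[length].append(count)
    return {length: (min(counts), max(counts))
            for length, counts in sorted(groups.items())}


# The three 24-word sentences mentioned in the text.
results = [(24, 24), (24, 288), (24, 2488)]
print(spread_by_length(results))  # → {24: (24, 2488)}
```

For these three sentences the largest count is roughly a hundred times the smallest, even though the sentence length in words is identical.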
