word boundary- hypothesisation in hindi speech - Speech and ...
express the performance: (i) the number of alternate word sequences produced for a
given input phoneme sequence, and (ii) the time spent on matching. The study was
done in two parts. The first part covers the case where the input phoneme sequence
contained no errors, so only exact matching was used in the dictionary match. The
second part examines the performance of the lexical match when the input was
assumed to contain errors likely in speech signal-to-symbol conversion, so
approximate string matching was used in the lexical match.
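The text does not say which approximate string matching technique was used. As one common possibility, the sketch below ranks dictionary entries by Levenshtein (edit) distance to a possibly erroneous input segment. The function names, the `max_errors` threshold, and the toy strings are all illustrative assumptions, with one character standing for one phoneme symbol.

```python
def edit_distance(a, b):
    """Levenshtein distance between two symbol sequences (assumed error model)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # deleting the first i symbols of a
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def approximate_matches(segment, lexicon, max_errors=1):
    """Return lexicon entries within max_errors edits of segment, sorted."""
    return sorted(w for w in lexicon if edit_distance(segment, w) <= max_errors)


# Hypothetical toy lexicon, not taken from the study.
toy_lexicon = {"kitab", "kalam", "kitabe"}
matches = approximate_matches("kitap", toy_lexicon, max_errors=1)
```

With `max_errors=0` the same function reduces to exact matching, which is why allowing errors can only increase the number of candidate words per segment.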
3.3.1 Results of lexical analysis with exact matching
The lexical analyser program was run with an input text containing 100
sentences. The sentences were of varying lengths and on average contained 12 to 13
words. The results of the lexical match were ordered by the number of alternative
word strings and are shown in Table - 3.1.
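To illustrate what "alternate word strings" means under exact matching, here is a minimal sketch that counts and lists every way an unsegmented phoneme sequence can be split into consecutive dictionary words. It is not the lexical analyser from the study: the function names and the toy lexicon are assumptions, and each character stands for one phoneme.

```python
def count_segmentations(phonemes, lexicon):
    """Count the ways phonemes can be split into consecutive lexicon entries."""
    n = len(phonemes)
    max_len = max((len(w) for w in lexicon), default=0)
    ways = [0] * (n + 1)  # ways[i] = number of segmentations of phonemes[i:]
    ways[n] = 1           # the empty suffix has exactly one segmentation
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, min(n, i + max_len) + 1):
            if phonemes[i:j] in lexicon:
                ways[i] += ways[j]
    return ways[0]


def enumerate_segmentations(phonemes, lexicon):
    """List every word sequence whose concatenation equals phonemes."""
    results = []

    def extend(start, words):
        if start == len(phonemes):
            results.append(list(words))
            return
        for end in range(start + 1, len(phonemes) + 1):
            piece = phonemes[start:end]
            if piece in lexicon:
                words.append(piece)
                extend(end, words)
                words.pop()

    extend(0, [])
    return results


# Hypothetical toy lexicon: four "words" over two phoneme symbols.
toy_lexicon = {"a", "b", "ab", "ba"}
total = count_segmentations("aba", toy_lexicon)
alternatives = enumerate_segmentations("aba", toy_lexicon)
```

Counting by dynamic programming avoids listing every alternative, which matters when a sentence has thousands of matching word strings.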
From the results it can be observed that a large number of sentences (64 out of
100) had fewer than 10 alternate word sequences. Only a small fraction (3 out of 100)
had 1000 or more alternate word strings. The average number of alternate word strings
for a sentence was about 120. Three sentences had only a single word string matching
them. The highest number of alternate word strings matching any sentence was 2448.
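The summary figures quoted above (sentences with fewer than 10 alternatives, sentences with 1000 or more, the mean and the maximum) can be derived from a list of per-sentence counts along the following lines. This is a sketch only: the `summarise` helper and the sample counts are hypothetical placeholders, not the data behind Table - 3.1.

```python
def summarise(counts):
    """Summarise per-sentence counts of alternate word strings."""
    n = len(counts)
    return {
        "sentences": n,
        "fewer_than_10": sum(1 for c in counts if c < 10),
        "at_least_1000": sum(1 for c in counts if c >= 1000),
        "unambiguous": sum(1 for c in counts if c == 1),  # single word string
        "mean": sum(counts) / n,
        "max": max(counts),
    }


# Placeholder counts for four imaginary sentences (not from the study).
sample_counts = [1, 3, 12, 1500]
summary = summarise(sample_counts)
```

In the placeholder data a single large count dominates the mean, so the mean alone says little about how a typical sentence behaves.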
These results were also used to study the effect of the length of the sentence
(both in terms of number of words and number of phonemes) on the number of word
sequences matching the sentence. Table - 3.2(a) and Table - 3.2(b) show the results of
this study. It can be observed that in general longer sentences have more alternate
word strings matching them. But length alone does not determine the number of
alternatives, since sentences of the same length still had widely different numbers of
matching word sequences. For example, in our data three sentences contained 24
words, but the numbers of word sequences for them were 24, 288 and 2488.