word boundary- hypothesisation in hindi speech - Speech and ...
purpose of the symbol-to-text conversion system is to make the symbol string readable,
by providing the missing word boundaries. The text can be further corrected, if
necessary, using the higher level knowledge sources such as syntax and semantics to
make it more meaningful.
In summary, word boundary hypothesisation has the following advantages:
1. The complexity of lexical matching involved in large vocabulary speech recognition
can be significantly reduced.
2. Unknown words can be handled.
3. If most of the word boundaries can be hypothesised, a useful speech-to-text
conversion system can be developed with only a speech signal-to-symbol converter and
a word boundary hypothesiser.
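The first advantage can be illustrated with a small sketch. It is not the matcher used in this thesis: it assumes a naive lexical matcher that must test every contiguous span of the symbol string against the lexicon, and it uses a toy English symbol string with hand-picked boundaries. The function names and the example data are hypothetical.

```python
def candidate_spans_without_boundaries(symbols):
    """All contiguous spans a naive matcher would have to test against the lexicon."""
    n = len(symbols)
    return [(i, j) for i in range(n) for j in range(i + 1, n + 1)]


def candidate_spans_with_boundaries(symbols, boundaries):
    """Spans delimited by hypothesised boundaries (indices where a new word starts)."""
    cuts = [0] + sorted(boundaries) + [len(symbols)]
    return [(a, b) for a, b in zip(cuts, cuts[1:]) if a < b]


# Toy data (assumption): nine symbols, boundaries hypothesised as the|cat|sat.
symbols = list("thecatsat")
boundaries = [3, 6]

unsegmented = candidate_spans_without_boundaries(symbols)
segmented = candidate_spans_with_boundaries(symbols, boundaries)

print(len(unsegmented), len(segmented))  # prints "45 3"
```

Under these assumptions the number of spans to look up falls from n(n+1)/2 to one per hypothesised segment. If some boundaries are missed, the affected segments still contain more than one word, so the saving is partial rather than complete.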
A meaningful text with word boundaries can be read easily, even with some errors
in characters and in word boundaries (see Fig. 1.1 for illustration). Thus word
boundary hypothesisation plays a crucial role in producing a readable output from a
speech-to-text conversion system. Continuous speech, however, does not contain any
direct clues, such as pauses, to word boundaries. Even so, several language features
can be exploited for hypothesising word boundaries. Since the original input is a
speech signal, speech-related clues can also be exploited for word boundary
hypothesisation.
The objective of this thesis is to establish the significance of word boundary
hypothesisation in speech recognition and to demonstrate that language and speech
related clues exist and can be used effectively to hypothesise word boundaries.
Even a partial success in word boundary hypothesisation using these clues would
generate a text which is significantly more readable than a text without word
boundaries. Moreover, such a text with a few word