word boundary- hypothesisation in hindi speech - Speech and ...
syllables than at word boundaries preceding strong syllables, indicating that the
speakers are aware of the use of strong syllables to indicate word beginnings. Thus
these findings confirmed the use of strong syllables to detect word boundaries by
humans and validated the use of MSS.

The above studies established that prosodic knowledge can be used to
hypothesise word boundaries in speech. The studies have also identified some prosodic
features, such as pause, duration and pitch (F0), as possible clues to word boundaries.
Our studies for Hindi using these prosodic features for word boundary hypothesisation
are reported in Chapter 6.

2.3.4 Word boundary hypothesisation techniques based on acoustic-phonetic knowledge

Few word boundary detection techniques that make explicit use of acoustic-phonetic
knowledge have been reported in the literature. However, two reported techniques
use spectral information to detect word boundaries. Both
operate directly on the speech signal and hypothesise word boundaries in it.

The first technique was developed for application in a connected word
recognition task [Zelenski and Class 1983]. It used an algorithm based on
estimation principles. The input speech signal was divided into a sequence of
windows, and the signal in each window was represented by a parameter vector x =
{x1, x2, ..., xL}, where each xi represents a speech parameter such as one of the
outputs of a filter bank. The word boundary hypothesisation problem was posed as one
of classifying a given window into one of two classes: (i) a class 1 window, containing
a word boundary, and (ii) a class 2 window, not containing a word boundary. Ideally the
classifier should produce an output z, where z = 1 for a class 1 window and z = 0 for
a class 2 window.

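The window-classification framing described above can be illustrated with a minimal sketch. It is not the algorithm of the cited work: the non-overlapping windows, the DFT-based band energies standing in for filter-bank outputs, and all function names (`split_into_windows`, `band_energies`, `target_labels`) are assumptions made purely for illustration. The sketch only shows how each window could be given a parameter vector x of length L and an ideal target z.

```python
import cmath
import math


def split_into_windows(signal, window_len):
    """Split a sample sequence into consecutive non-overlapping windows (assumed layout)."""
    return [
        signal[start:start + window_len]
        for start in range(0, len(signal) - window_len + 1, window_len)
    ]


def band_energies(window, num_bands):
    """Crude stand-in for filter-bank outputs: DFT power summed over equal-width bands."""
    n = len(window)
    half = n // 2
    power = []
    for k in range(half):
        coeff = sum(
            sample * cmath.exp(-2j * math.pi * k * t / n)
            for t, sample in enumerate(window)
        )
        power.append(abs(coeff) ** 2)
    band_width = half // num_bands
    return [
        sum(power[b * band_width:(b + 1) * band_width])
        for b in range(num_bands)
    ]


def target_labels(num_windows, window_len, boundary_samples):
    """Ideal target z: 1 for a window containing a word boundary (class 1), else 0 (class 2)."""
    labels = [0] * num_windows
    for pos in boundary_samples:
        index = pos // window_len
        if 0 <= index < num_windows:
            labels[index] = 1
    return labels


# Toy data: a synthetic sinusoid and two hypothetical boundary positions (in samples).
signal = [math.sin(2 * math.pi * 4 * t / 16) for t in range(64)]
windows = split_into_windows(signal, window_len=16)
vectors = [band_energies(w, num_bands=4) for w in windows]  # one x = {x1, ..., xL} per window, L = 4
z = target_labels(len(windows), 16, boundary_samples=[20, 50])
```

In this toy setup each of the four windows receives a four-element parameter vector, and the windows covering samples 20 and 50 are the ones marked as class 1.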
The target value z can be approximated by an estimation d which is computed