01.03.2013 Views

word boundary- hypothesisation in hindi speech - Speech and ...


syllables than at word boundaries preceding strong syllables, indicating that the speakers are aware of the use of strong syllables to indicate word beginnings. Thus, these findings confirmed that humans use strong syllables to detect word boundaries and validated the use of MSS.

The above studies established that prosodic knowledge can be used to hypothesise word boundaries in speech. The studies also identified some prosodic features, such as pause, duration and pitch (F0), as possible clues to word boundaries. Our studies for Hindi using these prosodic features for word boundary hypothesisation are reported in chapter 6.

2.3.4 Word boundary hypothesisation techniques based on acoustic-phonetic knowledge

Few word boundary detection techniques that explicitly use acoustic-phonetic knowledge have been reported in the literature. However, two reported techniques used spectral information to detect word boundaries. Both operate directly on the speech signal and hypothesise word boundaries in it.

The first technique was developed for a connected word recognition task [Zelenski and Class 1983]. It used an algorithm based on estimation principles. In this technique, the input speech signal was divided into a sequence of windows. The signal in each window was represented by a parameter vector x = {x1, x2, ..., xL}, where each xi represents a speech parameter, such as one of the outputs of a filter bank. The word boundary hypothesisation problem was posed as one of classifying a given window into one of two classes: (i) a class 1 window, containing a word boundary, and (ii) a class 2 window, not containing a word boundary. Ideally, the classifier should produce an output z, where z = 1 for a class 1 window and z = 0 for a class 2 window.
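The windowing and two-class labelling described above can be illustrated with a minimal sketch. It is not the algorithm of the cited work: the function names, the `hop` parameter (the source does not say whether windows overlap), and the use of sample indices for known word boundaries are all assumptions made for illustration. It shows only how a signal might be cut into windows and how the ideal target z could be assigned to each window when the boundary positions are known.

```python
from typing import List, Sequence


def split_into_windows(signal: Sequence[float], window_len: int, hop: int) -> List[List[float]]:
    """Divide the signal into a sequence of windows (hypothetical helper).

    `hop` is the shift between window starts; hop == window_len gives
    non-overlapping windows. Trailing samples that do not fill a whole
    window are dropped.
    """
    if window_len <= 0 or hop <= 0:
        raise ValueError("window_len and hop must be positive")
    return [
        list(signal[start:start + window_len])
        for start in range(0, len(signal) - window_len + 1, hop)
    ]


def window_targets(num_samples: int, window_len: int, hop: int,
                   boundaries: Sequence[int]) -> List[int]:
    """Assign the ideal target z to each window (hypothetical helper).

    z = 1 if the window contains a word boundary (class 1),
    z = 0 otherwise (class 2). Boundaries are assumed to be given
    as sample indices.
    """
    targets = []
    for start in range(0, num_samples - window_len + 1, hop):
        end = start + window_len
        contains_boundary = any(start <= b < end for b in boundaries)
        targets.append(1 if contains_boundary else 0)
    return targets


# Toy example: 100 samples, non-overlapping windows of 20 samples,
# and assumed word boundaries at samples 45 and 80.
signal = [0.0] * 100
windows = split_into_windows(signal, window_len=20, hop=20)
targets = window_targets(len(signal), window_len=20, hop=20, boundaries=[45, 80])
```

In this toy case the third and fifth windows receive z = 1 and the rest receive z = 0. A real system would additionally compute the parameter vector x for each window, for example from filter bank outputs, which this sketch leaves out.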

The target value z can be approximated by an estimation d which is computed
