Word boundary hypothesisation in Hindi speech - Speech and ...

purpose of the symbol-to-text conversion system is to make the symbol string readable by providing the missing word boundaries. The text can be further corrected, if necessary, using higher-level knowledge sources such as syntax and semantics to make it more meaningful. In summary, the advantages of word boundary hypothesisation are the following:

1. The complexity of lexical matching involved in large vocabulary speech recognition can be significantly reduced.
2. Unknown words can be handled.
3. If most of the word boundaries can be hypothesised, a useful speech-to-text conversion system can be developed with only a speech signal-to-symbol converter and a word boundary hypothesiser.

A meaningful text with word boundaries can be read easily, even with some errors in characters and in word boundaries (see Fig. 1.1 for an illustration). Thus word boundary hypothesisation plays a crucial role in producing a readable output from a speech-to-text conversion system. But continuous speech does not contain any direct clues, such as pauses, to word boundaries. However, there are several language features which can be exploited for hypothesising word boundaries. Since the original input is a speech signal, one can also exploit speech-related clues for word boundary hypothesisation.

The objective of this thesis is to establish the significance of word boundary hypothesisation in speech recognition and to demonstrate that language and speech-related clues do exist which can be effectively used to hypothesise word boundaries. Even a partial success in word boundary hypothesisation using these clues would generate a text which is significantly more readable than a text without word boundaries. Moreover, such a text with a few word
(a)
alounddhelongtableheagsnoddedenunisonyeswereedthuculturemustshangepropessors shoultperewardedeccordin totheerdeachinguffectivenessnotportheirfmeanrezearchin sempatheti 'brationmyow 7 f headnoddeduponddown'ustlikeatherwiseheedsthenodging continoedig incedfultivelyatmynaighbourendcaught b ementheactobmenitoringmaown degreeoffinceritynotsomuchaseripplearashedowanhis~~ressiongetrayedalylackof convictionbhiswasseriousstuff

(b)
alound dhe long table heags nodded en unison yes we abreed thu culture must shange propessors shoult pe rewarded eccording to theer deaching uffectiveness not por their fame an rezearch in sempathetic yibration my owl head nodded up ond down just like ather wise heeds the nodging continoed i glinced fultively at my naighbour end caught hem en the act ob menitoring ma own degree of fincerity not so much as e ripple ar a shedow an his expression getrayed aly lack of conviction bhis was serious stuff

(c)
Around the long table heads nodded in unison. Yes we agreed the culture must change. Professors should be rewarded according to their teaching effectiveness not for their fame in research. In sympathetic vibration my own head bobbed up and down just like other wise heads. The nodding continued. I glanced furtively at my neighbour and caught him in the act of monitoring my own degree of sincerity. Not so much as a ripple or a shadow in his expression betrayed any lack of conviction. This was serious stuff.

Fig. 1.1 An illustration of the improvement in the readability of a text due to word boundaries. In 1.1(a), a text with nearly 50% of the words in error (but only 10% of letters in error) is shown without any word boundaries. The same text is shown in 1.1(b) with word boundaries. It can be seen that the text in (b) is easier to read than the text in (a). The actual text is shown in 1.1(c).
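
The first advantage listed above, the reduction in lexical matching complexity, can be illustrated with a small sketch. It is not the lexical analyser used in the thesis: the toy English lexicon, the symbol string and the function names `count_segmentations` and `count_with_known_boundaries` are all assumptions made for illustration, and the input is taken to be error-free. The sketch counts how many word sequences a lexical matcher would have to consider for a string with no boundaries, and how many remain once some boundaries are known.

```python
from functools import lru_cache

# Hypothetical toy lexicon, chosen only so that one string has several parses.
LEXICON = frozenset({"t", "he", "re", "in", "the", "here", "rein", "there"})


def count_segmentations(symbols, lexicon):
    """Count the ways `symbols` can be split into a sequence of lexicon words."""

    @lru_cache(maxsize=None)
    def ways(start):
        # Reaching the end of the string completes exactly one valid parse.
        if start == len(symbols):
            return 1
        # Try every lexicon word that begins at `start`, then parse the rest.
        return sum(
            ways(end)
            for end in range(start + 1, len(symbols) + 1)
            if symbols[start:end] in lexicon
        )

    return ways(0)


def count_with_known_boundaries(chunks, lexicon):
    """Count parses when boundaries between `chunks` are already known.

    Each chunk may still contain undetected boundaries, so it is parsed
    on its own; the chunks are independent, so the counts multiply.
    """
    total = 1
    for chunk in chunks:
        total *= count_segmentations(chunk, lexicon)
    return total


if __name__ == "__main__":
    print(count_segmentations("therein", LEXICON))
    print(count_with_known_boundaries(["there", "in"], LEXICON))
```

For this toy lexicon the unsegmented string has six possible parses, while fixing a single boundary after "there" leaves four. With a large vocabulary and errors in the input symbols the number of alternatives grows much faster with string length, which is why even a partial set of hypothesised boundaries can reduce the matching effort.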
