
ITB Journal - Institute of Technology Blanchardstown


ITB Journal

Developing a Distributed Java-based Speech Recognition Engine

Mr. Tony Ayres, Dr. Brian Nolan

Institute of Technology Blanchardstown, Dublin, Ireland

tony.ayres@itb.ie brian.nolan@itb.ie

Abstract

The development of speech recognition engines has traditionally been the territory of low-level languages such as C. Until recently, Java might not have been considered a candidate language for the development of such a speech engine, because its security restrictions limited its sound processing features. The release of the Java Sound API as part of the Java Media Framework, and the subsequent integration of the Sound API into the standard Java Development Kit, gives Java the sound processing tools necessary to perform speech recognition.

This paper documents our development of a speech recognition engine using the Java programming language. We discuss the theory of speech recognition engines based on stochastic techniques such as Hidden Markov Models, which we employ alongside our Java implementations of speech signal processing algorithms such as the Fast Fourier Transform and Mel Frequency Cepstral Coefficients. Furthermore, we describe our design goals and the implementation of a distributed speech engine component which provides a client-server approach to speech recognition. The distributed architecture allows us to deliver speech recognition technology and applications to a range of low-powered devices, such as PDAs and mobile phones, which otherwise may not have the requisite onboard computing power to perform speech recognition.

1. Introduction

In the past, speech recognition engines have been developed using low-level programming languages such as C or C++. Early versions of the Java programming language would not have been candidates for the development of such speech systems. The sandbox security model employed by the Java platform prevents rogue Java code from damaging system hardware, but it also limits access to the hardware needed to perform speech recognition, namely the sound card. This problem could be overcome by implementing in native code (i.e. C, C++, etc.) the functionality that Java lacked, and then using the Java Native Interface to link the Java and native code together. With this solution, however, the platform independence that Java brings to software development is lost and complexity is increased. When Sun released the Java Sound API [1] and the Java Media Framework [2] (JMF), these extensions to the Java platform allowed developers to create applications that take advantage of not only sound processing but also advanced video and 3D functionality, and they do so without compromising the Java security model. With these APIs, Java is not only a capable platform for building speech recognition engines; an engine developed in Java also inherits Java's benefits of platform independence and object-oriented development.
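The capture path that the Java Sound API opens up can be illustrated with a short sketch. The class name `CaptureSketch` and the 16 kHz, 16-bit mono PCM format are our illustrative assumptions (a common front-end format for speech recognition), not details taken from the engine described in this paper:

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.TargetDataLine;

public class CaptureSketch {
    // 16 kHz, 16-bit, mono, signed, little-endian PCM: a typical
    // input format for a speech recognition front end (assumption).
    static AudioFormat speechFormat() {
        return new AudioFormat(16000f, 16, 1, true, false);
    }

    public static void main(String[] args) throws Exception {
        AudioFormat fmt = speechFormat();
        DataLine.Info info = new DataLine.Info(TargetDataLine.class, fmt);
        if (!AudioSystem.isLineSupported(info)) {
            System.out.println("No capture line available for this format");
            return;
        }
        // Obtain and open a microphone line via the Sound API; no
        // native code or JNI bridge is required.
        TargetDataLine line = (TargetDataLine) AudioSystem.getLine(info);
        line.open(fmt);
        line.start();
        byte[] buf = new byte[3200]; // 100 ms of audio at 16 kHz, 16-bit mono
        int n = line.read(buf, 0, buf.length); // raw samples for the front end
        line.stop();
        line.close();
        System.out.println("Captured " + n + " bytes");
    }
}
```

The buffer read here would be handed to the signal processing stage (FFT, then MFCC extraction) in an engine of the kind the paper goes on to describe.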

Issue Number 9, May 2004 Page 6
