12.07.2015 Views

The HTK Book Steve Young Gunnar Evermann Dan Kershaw ...

The HTK Book Steve Young Gunnar Evermann Dan Kershaw ...

The HTK Book Steve Young Gunnar Evermann Dan Kershaw ...

SHOW MORE
SHOW LESS
  • No tags were found...

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

Chapter 1<strong>The</strong> Fundamentals of <strong>HTK</strong>Speech DataTranscriptionTraining ToolsRecogniserUnknown SpeechTranscription<strong>HTK</strong> is a toolkit for building Hidden Markov Models (HMMs). HMMs can be used to modelany time series and the core of <strong>HTK</strong> is similarly general-purpose. However, <strong>HTK</strong> is primarilydesigned for building HMM-based speech processing tools, in particular recognisers. Thus, much ofthe infrastructure support in <strong>HTK</strong> is dedicated to this task. As shown in the picture above, thereare two major processing stages involved. Firstly, the <strong>HTK</strong> training tools are used to estimatethe parameters of a set of HMMs using training utterances and their associated transcriptions.Secondly, unknown utterances are transcribed using the <strong>HTK</strong> recognition tools.<strong>The</strong> main body of this book is mostly concerned with the mechanics of these two processes.However, before launching into detail it is necessary to understand some of the basic principles ofHMMs. It is also helpful to have an overview of the toolkit and to have some appreciation of howtraining and recognition in <strong>HTK</strong> is organised.This first part of the book attempts to provide this information. In this chapter, the basic ideasof HMMs and their use in speech recognition are introduced. <strong>The</strong> following chapter then presents abrief overview of <strong>HTK</strong> and, for users of older versions, it highlights the main differences in version2.0 and later. Finally in this tutorial part of the book, chapter 3 describes how a HMM-basedspeech recogniser can be built using <strong>HTK</strong>. It does this by describing the construction of a simplesmall vocabulary continuous speech recogniser.<strong>The</strong> second part of the book then revisits the topics skimmed over here and discusses each indetail. This can be read in conjunction with the third and final part of the book which providesa reference manual for <strong>HTK</strong>. This includes a description of each tool, summaries of the variousparameters used to configure <strong>HTK</strong> and a list of the error messages that it generates when thingsgo wrong.Finally, note that this book is concerned only with <strong>HTK</strong> as a tool-kit. It does not provideinformation for using the <strong>HTK</strong> libraries as a programming environment.2

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!