Task Independent Speech Verification Using SB-MVE Trained ...

[Figures 1 and 2: plots of correct rejection rate (%) versus false rejection rate (%).]

Figure 1: Performance for task independent ML-trained verification models. Single ML (dotted), coh ML (solid), all ML (dashed).

Figure 2: Performance for task independent SB-MVE-trained versus ML-trained models. Single ML (dotted), single SB-MVE (solid), all ML (dashed).

for the SB-MVE models increases from 74.3% to 82.3% as the false rejection rate increases from 0% to 25%. This improvement corresponds to a relative decrease in the utterance error rate of between 8% and 37%.
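The trade-off between false rejection and correct rejection can be illustrated with a minimal sketch of how an operating point might be selected. It assumes each utterance receives a scalar verification score and is rejected when the score falls below a threshold; the scores, function names and decision rule are hypothetical and are not taken from the experiments reported here.

```python
import math


def threshold_for_false_rejection(correct_scores, target_frr):
    """Pick a threshold that rejects at most `target_frr` of the correctly
    recognized utterances (hypothetical rule: reject when score < threshold)."""
    ordered = sorted(correct_scores)
    n_reject = math.floor(target_frr * len(ordered))
    if n_reject >= len(ordered):
        return float("inf")  # every correct utterance may be rejected
    # Scores strictly below ordered[n_reject] are rejected.
    return ordered[n_reject]


def rejection_rates(correct_scores, incorrect_scores, threshold):
    """Return (false rejection rate, correct rejection rate) as fractions."""
    frr = sum(s < threshold for s in correct_scores) / len(correct_scores)
    crr = sum(s < threshold for s in incorrect_scores) / len(incorrect_scores)
    return frr, crr


# Made-up scores, for illustration only.
correct = [0.2, 0.9, 1.1, 1.3, 1.5, 1.8, 2.0, 2.2, 2.5, 3.0]
incorrect = [-1.0, -0.5, 0.1, 0.5, 1.0]

threshold = threshold_for_false_rejection(correct, 0.10)
frr, crr = rejection_rates(correct, incorrect, threshold)
print(f"threshold={threshold}, FRR={frr:.0%}, CRR={crr:.0%}")
```

Sweeping `target_frr` over a range such as 0 to 0.25 would trace out one curve of the kind plotted in the figures, under the stated assumptions.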

Assuming a working point of 10% false rejection rate, Table 2 gives the corresponding utterance accuracy for the different verification models. The corresponding relative decrease in utterance error rate for the SB-MVE trained models is 27%.

No verification   Single ML   All ML   SB-MVE
71.8              75.9        78.2     79.4

Table 2: Utterance accuracy (%) after verification at a 10% false rejection rate.
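The relative decrease in utterance error rate quoted for the 10% working point can be reproduced from the rounded accuracies in Table 2. The short sketch below takes the utterance error rate to be 100% minus the utterance accuracy and the "No verification" column to be the baseline; the function and variable names are illustrative.

```python
def relative_error_reduction(baseline_acc, new_acc):
    """Relative decrease (in %) of the utterance error rate,
    with error rate taken as 100 - accuracy."""
    baseline_err = 100.0 - baseline_acc
    new_err = 100.0 - new_acc
    return 100.0 * (baseline_err - new_err) / baseline_err


# Utterance accuracies (%) from Table 2.
baseline = 71.8  # no verification
accuracy = {"Single ML": 75.9, "All ML": 78.2, "SB-MVE": 79.4}

reductions = {
    name: relative_error_reduction(baseline, acc)
    for name, acc in accuracy.items()
}

for name, value in reductions.items():
    print(f"{name}: {value:.1f}% relative error reduction")
```

For the SB-MVE column this gives (28.2 - 20.6) / 28.2, or about 27%, which matches the figure stated in the text. Because the inputs are rounded to one decimal, the other values are approximate.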

These promising results point to the following topics for current and future work:

• Update all the verification model parameters.

• Try out more flexible structures for the single anti-phone models.

• Compare our method with the subword-based MVE method in [6].

• Investigate methods for determining optimal thresholds.

5. Conclusions 

In this paper we have introduced string-based MVE training of both H0 and H1 monophone models in a task independent utterance verification module. The algorithm has been tested on the “time-of-day” recordings of the Norwegian part of SpeechDat(II). The results were compared with corresponding results for different ML-trained anti-model combinations. We conclude that verification using SB-MVE trained models decreases the utterance error rate significantly for all thresholds. In addition, the performance was consistently better than that of the much more computationally complex method that combines all ML-trained models except the H0 model to form a competing anti-model.

Thus our experiments confirm the results given for the other SB-MVE variants [5], [6].

6. Acknowledgements 

This work was done as part of the BRAGE project, which is organized under the language technology programme KUNSTI and funded by the Norwegian Research Council.

7. References 

[1] T. Schaaf and T. Kemp, “Confidence measure for spontaneous speech recognition”, Proc. ICASSP-1997, pp. 875-878.

[2] R. San-Segundo, B. Pellom, K. Hacioglu and W. Ward, “Confidence measures for spoken dialogue systems”, Proc. ICASSP-2001, pp. 393-396.

[3] T. Kemp and T. Schaaf, “Estimating confidence using word lattices”, Proc. EuroSpeech-1997, pp. 827-830.

[4] F. Wessel, K. Macherey and H. Ney, “A comparison of word graph and N-best list based confidence measures”, Proc. ICASSP-2000, pp. 1587-1590.

[5] M.G. Rahim and C-H. Lee, “String-based minimum verification error (SB-MVE) training for speech recognition”, Computer Speech and Language, Vol. 11, 1997, pp. 147-160.

[6] R.A. Sukkar, “Subword-based minimum verification error (SB-MVE) training for task independent utterance verification”, Proc. ICASSP-1998, pp. 229-232.

[7] B. Lindberg, T.F. Johansen, N.D. Warakagoda, G. Lehtinen, Z. Kacic, A. Zgank, K. Elenius and G. Salvi, “A noise robust multilingual reference recogniser based on SpeechDat(II)”, Proc. ICSLP-2000, pp. 370-373.
