22.01.2014 Views

Task Independent Speech Verification Using SB-MVE Trained ...

Task Independent Speech Verification Using SB-MVE Trained ...

Task Independent Speech Verification Using SB-MVE Trained ...

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

60<br />

60<br />

50<br />

50<br />

Correct rejection rate %<br />

40<br />

30<br />

20<br />

Correct rejection rate %<br />

40<br />

30<br />

20<br />

10<br />

10<br />

0<br />

0 5 10 15 20 25<br />

False rejection rate %<br />

0<br />

0 5 10 15 20 25<br />

False rejection rate %<br />

Figure 1: Performance for task independent ML-trained verification<br />

models. Single ML (dotted), coh ML (solid), all ML<br />

(dashed).<br />

Figure 2: Performance for task independent <strong>SB</strong>-<strong>MVE</strong>-trained<br />

versus ML-trained models. Single ML (dotted), single <strong>SB</strong>-<br />

<strong>MVE</strong> (solid), all ML (dashed).<br />

for the <strong>SB</strong>-<strong>MVE</strong> models increases from 74.3% to 82.3% as a<br />

function of increasing the false rejection rate from 0% to 25%.<br />

This improvement corresponds to a relative decrease in the utterance<br />

error rate from 8% to 37%.<br />

Assuming a working point of 10% false rejection rate, table<br />

2 gives the corresponding utterance accuracy for the different<br />

verification models. The corresponding relative decrease in utterance<br />

error rate for <strong>SB</strong>-<strong>MVE</strong> trained models is 27%.<br />

No verification Single ML All ML <strong>SB</strong>-<strong>MVE</strong><br />

71.8 75.9 78.2 79.4<br />

Table 2 : Utterance accuracy after verification for 10% false<br />

rejection rate.<br />

The promising results direct us towards current/future work<br />

with respect to the following topics<br />

• Update all the verification model parameters.<br />

• Try out more flexible structures for the single anti-phone<br />

models.<br />

• Compare our method with the subword-based <strong>MVE</strong><br />

method in [6].<br />

• Investigate methods for determining optimal tresholds.<br />

5. Conclusions<br />

In this paper we have introduced string based <strong>MVE</strong> training of<br />

both H 0 and H 1 monophone models in a task independent utterance<br />

verification module. The algorithm has been tested on<br />

the ”time-of-day” recordings of the Norwegian part of <strong>Speech</strong>-<br />

Dat (II). The results were compared with corresponding results<br />

for different ML-trained antimodel combinations. We conclude<br />

that verification using <strong>SB</strong>-<strong>MVE</strong> trained models decreases the<br />

utterance error rate significantly for all tresholds. In addition,<br />

the performance was consistently better than when using the<br />

much more computationally complex method based on combining<br />

all (but the H 0) ML-trained models to form a competing<br />

anti-model.<br />

Thus our experiments confirm the results given for the other<br />

<strong>SB</strong>-<strong>MVE</strong> variants [5], [6].<br />

6. Acknowledgements<br />

The work is done as a part of the BRAGE-project, which is<br />

organized under the language technology programme KUNSTI<br />

and funded by the Norwegian Research Council.<br />

7. References<br />

[1] T. Schaaf and T. Kemp, “Confidence measure for spontaneous<br />

speech recognition”, Proc. ICASSP-1997, pp. 875-<br />

878.<br />

[2] R. San-Segundo, B. Pellom, K. Hacioglu, W. Ward, “Confidence<br />

measures for spoken dialogue systems”. Proc.<br />

ICASSP-2001, pp. 393-396<br />

[3] T. Kemp and T. Schaaf, “Estimating confidence using<br />

word lattices.”, Proc. Euro<strong>Speech</strong>-1997, pp. 827-830.<br />

[4] F. Wessel, K. Macherey and H. Ney, “A comparison of<br />

word graph and N-best list based confidence measures.”,<br />

Proc. ICASSP-2000, pp. 1587-1590.<br />

[5] M.G. Rahim and C-H. Lee, “String-based minimum verification<br />

error (<strong>SB</strong>-<strong>MVE</strong>) training for speech recognition”,<br />

Computer <strong>Speech</strong> and Language, Vol. 11, 1997, pp 147-<br />

160.<br />

[6] R. A Sukkar, “Subword-based minimum verification error<br />

(<strong>SB</strong>-<strong>MVE</strong>) training for task independent utterance verification”,<br />

Proc. ICASSP-1998, pp 229-232.<br />

[7] B. Lindberg, T.F. Johansen, N.D. Warakagoda, G. Lehtinen,<br />

Z. Kacic, A. Zgank, K. Elenius, G. Salvi, “A noise<br />

robust multilingual reference recogniser based on <strong>Speech</strong>-<br />

Dat (II)”, Proc. ICSLP-2000, pp. 370-373

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!