
Proceedings Fonetik 2009 - Institutionen för lingvistik



Proceedings, FONETIK 2009, Dept. of Linguistics, Stockholm University

measure of the uncertainty in prediction, and thus forms an intuitive measure for our purpose. It is always between 0 and 1, so comparisons between different cross-modal clusterings are easy. A value of 1 indicates very high uncertainty, while 0 indicates a one-to-one mapping between corresponding clusters in the two modalities.

Experiments and Results

The MOCHA-TIMIT database (Wrench, 2001) was used to perform the experiments. The data consists of simultaneous measurements of acoustic and articulatory data for a female speaker. The articulatory data consisted of 14 channels, which included the X- and Y-axis positions of EMA coils on 7 articulators: the Lower Jaw (LJ), Upper Lip (UL), Lower Lip (LL), Tongue Tip (TT), Tongue Body (TB), Tongue Dorsum (TD) and Velum (V). Only vowels were considered for this study, and the acoustic space was represented by the first 5 formants, obtained from 25 ms acoustic windows shifted by 10 ms. The articulatory data was low-pass filtered and down-sampled in order to correspond with the acoustic data rate. The uncertainty (U) in clustering was estimated using Equation 3 for the British vowels, namely /ʊ, æ, e, ɒ, ɑ:, u:, ɜ:ʳ, ɔ:, ʌ, ɩ:, ə, ɘ/. The articulatory data was first clustered using all the articulatory channels together and was then clustered individually for each of the 7 articulators.

Fig. 3 shows the clusters in both the acoustic and articulatory space for the vowel /e/. We can see that data points corresponding to one cluster in the acoustic space (F1-F2 formant space) correspond to more than one cluster in the articulatory space. The ellipses, which correspond to the initial clusters, are replaced by different clustering labels estimated by the MCMAP algorithm. So although the acoustic features had more than one cluster in the first estimate, after cross-modal clustering all the instances are assigned to a single cluster.

Fig. 4 shows the correspondences between acoustic clusters and the LJ for the vowel /ə/. We can see that the uncertainty is lower for some of the clusters and higher for others. Fig. 5 shows the comparative measures of the overall uncertainty (over all the articulators) of the articulatory clusters corresponding to each one of the acoustic clusters for the different vowels tested. Fig. 6 shows the correspondence uncertainty of individual articulators.

Figure 5. The figure shows the overall uncertainty (for the whole articulatory configuration) for the British vowels. [Plot: "Overall Uncertainty for the Vowels"; x-axis: Vowels (ʊ æ e ɒ ɑ: u: ɜ:ʳ ɔ: ʌ ɩ: ə ɘ); y-axis: Uncertainty.]

Figure 6. The figure shows the uncertainty for individual articulators for the British vowels. [Plot: "Uncertainty for Individual Articulators"; x-axis: Vowels (ʊ æ e ɒ ɑ: u: ɜ:ʳ ɔ: ʌ ɩ: ə ɘ); y-axis: Uncertainty; legend: V, TD, TB, TT, LL, UL, LJ.]

Discussion

Fig. 5 shows that the shorter vowels have more uncertainty than the longer vowels, which is intuitive. Higher uncertainty is seen for the short vowels /e/ and /ə/, while there is almost no uncertainty for the long vowels /ɑ:/ and /ɩ:/. The overall uncertainty for the entire configuration is usually around the lowest uncertainty for a single articulator. This is intuitive, and shows that even though certain articulator correspondences are uncertain, the correspondences are more certain for the overall configuration. When the uncertainty for individual articulators is observed, it is apparent that the velum has a high uncertainty of more than 0.6 for all the vowels. This is because nasalization is not easily observable in the formants. So even though different clusters are formed in the articulatory space, they fall in the same cluster in the acoustic space. The uncertainty is much lower in the lower lip correspondence for the long vowels /ɑ:/, /u:/ and /ɜ:ʳ/, while it is high for /ʊ/ and /e/. The TD shows lower uncertainty for the back vowels /u:/ and /ɔ:/. The uncertainty for TD is higher
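
The uncertainty measure described at the top of this page (bounded between 0 and 1, with 0 for a one-to-one mapping between acoustic and articulatory clusters) can be illustrated with a small sketch. Equation 3 is not reproduced on this page, so the code below does not claim to be the paper's formula: it uses a normalized conditional entropy as a stand-in that has the same stated properties. The function name `cross_modal_uncertainty` and the label lists are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def cross_modal_uncertainty(acoustic_labels, articulatory_labels):
    """Normalized conditional entropy H(articulatory | acoustic), in [0, 1].

    ASSUMPTION: this is a stand-in for the paper's Equation 3, which is not
    shown on this page. It only shares the stated properties: 0 means each
    acoustic cluster maps to exactly one articulatory cluster, 1 means the
    acoustic cluster tells us nothing about the articulatory cluster.
    """
    if len(acoustic_labels) != len(articulatory_labels):
        raise ValueError("label sequences must have the same length")
    n = len(acoustic_labels)
    if n == 0:
        raise ValueError("at least one frame is required")

    n_articulatory = len(set(articulatory_labels))
    if n_articulatory < 2:
        # A single articulatory cluster can always be predicted with certainty.
        return 0.0

    joint = Counter(zip(acoustic_labels, articulatory_labels))
    acoustic = Counter(acoustic_labels)

    # H(Y | X) = -sum_{x, y} p(x, y) * log p(y | x)
    entropy = 0.0
    for (a_label, _), count in joint.items():
        entropy -= (count / n) * math.log(count / acoustic[a_label])

    # Dividing by log(number of articulatory clusters) bounds the value by 1.
    return entropy / math.log(n_articulatory)


# Hypothetical per-frame cluster labels for one vowel.
acoustic = [0, 0, 0, 1, 1, 1]
lower_jaw = ["a", "a", "b", "c", "c", "c"]
u_lj = cross_modal_uncertainty(acoustic, lower_jaw)
```

In this sketch the same function would be called once with labels obtained from all 14 channels and once per articulator, mirroring the two clustering passes described in the experiments. The normalization choice (log of the number of articulatory clusters) is one of several reasonable options and is an assumption here.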
