Proceedings Fonetik 2009 - Institutionen för lingvistik

Proceedings, FONETIK 2009, Dept. of Linguistics, Stockholm University

Cross-modal Clustering in the Acoustic-Articulatory Space

G. Ananthakrishnan and Daniel M. Neiberg
Centre for Speech Technology, CSC, KTH, Stockholm
agopal@kth.se, neiberg@speech.kth.se

Abstract

This paper explores cross-modal clustering in the acoustic-articulatory space. A method to improve clustering using information from more than one modality is presented. Formants and Electromagnetic Articulography measurements are used to study corresponding clusters formed in the two modalities. A measure for estimating the uncertainty in correspondences between one cluster in the acoustic space and several clusters in the articulatory space is suggested.

Introduction

Estimating articulatory measurements from acoustic data has long been of special interest and is known as acoustic-to-articulatory inversion. Though the mapping between the two modalities was expected to be one-to-one, early research presented some interesting evidence of non-uniqueness in this mapping. Bite-block experiments have shown that speakers are capable of producing sounds perceptually close to the intended sounds even though the jaw is fixed in an unnatural position (Gay et al., 1981). Mermelstein (1967) and Schroeder (1967) have shown, through analytical articulatory models, that the inversion is unique to a class of area functions rather than a unique configuration of the vocal tract.

With the advent of measuring techniques like Electromagnetic Articulography (EMA) and X-Ray Microbeam, it became possible to collect simultaneous measurements of acoustics and articulation during continuous speech. Several attempts have been made by researchers to perform acoustic-to-articulatory inversion by applying machine learning techniques to the acoustic-articulatory data (Yehia et al., 1998 and Kjellström and Engwall, 2009).
The statistical methods applied to the problem of mapping brought a new dimension to the concept of non-uniqueness in the mapping. In the deterministic case, one can say that if the same acoustic parameters are produced by more than one articulatory configuration, then the particular mapping is considered to be non-unique. It is almost impossible to show this using real recorded data, unless more than one articulatory configuration produces exactly the same acoustic parameters. However, not finding such instances does not imply that non-uniqueness does not exist.

Qin and Carreira-Perpiñán (2007) proposed that the mapping is non-unique if, for a particular acoustic cluster, the corresponding articulatory mapping may be found in more than one cluster. Evidence of non-uniqueness in certain acoustic clusters for phonemes like /ɹ/, /l/ and /w/ was presented. That study quantized the acoustic space using the perceptual Itakura distance on LPC features. The articulatory space was clustered using a nonparametric Gaussian density kernel with a fixed variance. The problem with such a definition of non-uniqueness is that one does not know the optimal method and level of quantization for clustering the acoustic and articulatory spaces.

A later study by Neiberg et al. (2008) argued that, for a mapping to be called non-unique, the different articulatory clusters should not only map onto a single acoustic cluster but should also map onto acoustic distributions with the same parameters. Using an approach based on finding the Bhattacharyya distance between the distributions of the inverse mapping, they found that phonemes like /p/, /t/, /k/, /s/ and /z/ are highly non-unique.

In this study, we wish to observe how clusters in the acoustic space map onto the articulatory space. For every cluster in the acoustic space, we intend to find the uncertainty in finding a corresponding articulatory cluster. It must be noted that this uncertainty is not necessarily the non-uniqueness in the acoustic-to-articulatory mapping.
However, finding this uncertainty would give an intuitive understanding
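
The idea of an uncertainty in the correspondence between one acoustic cluster and several articulatory clusters can be illustrated with a small sketch. The measure actually proposed in the paper is not defined in this part of the text, so the code below is only an assumption made for illustration: it takes hard cluster labels for simultaneously recorded frames and uses the Shannon entropy of the articulatory labels within each acoustic cluster. The function name `correspondence_uncertainty` and the label encoding are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def correspondence_uncertainty(acoustic_labels, articulatory_labels):
    """Entropy (in bits) of articulatory cluster labels within each acoustic cluster.

    Illustrative assumption only, not the measure defined in the paper.
    Both arguments are equal-length sequences of hard cluster labels,
    one pair per simultaneously recorded frame.
    """
    if len(acoustic_labels) != len(articulatory_labels):
        raise ValueError("label sequences must have the same length")

    # Co-occurrence counts: acoustic cluster -> articulatory cluster -> frames
    counts = defaultdict(Counter)
    for ac, ar in zip(acoustic_labels, articulatory_labels):
        counts[ac][ar] += 1

    entropies = {}
    for ac, row in counts.items():
        total = sum(row.values())
        # Only observed articulatory clusters appear in `row`,
        # so every probability is strictly positive.
        entropies[ac] = sum(-(c / total) * log2(c / total) for c in row.values())
    return entropies


# Toy labels: acoustic cluster 0 always co-occurs with articulatory cluster 0,
# acoustic cluster 1 is split evenly between articulatory clusters 0 and 1.
uncertainty = correspondence_uncertainty(
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 0, 1],
)
```

In this toy case the first acoustic cluster has zero entropy (a single corresponding articulatory cluster), while the second has an entropy of one bit (two equally likely articulatory clusters). As the text notes, a high value of this kind indicates uncertain cluster correspondence and is not by itself evidence of non-uniqueness in the inversion.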
