
Best Practices for Speech Corpora in Linguistic Research Workshop ...



select m.label as phoneme, k.label as canonic, o.label as
  word, count(f.f1), avg(f.f1)::int as f1, avg(f.f2)::int as f2
from session ses
join signalfile sig on ses.session = substring(sig.filename, 3, 9)
join segment m on sig.id = m.signal_id and m.tier = 'MAU:'
join segment o on m.signal_id = o.signal_id and
  m.ref_seg = o.ref_seg and o.tier = 'ORT:'
join segment k on k.signal_id = o.signal_id and
  k.ref_seg = m.ref_seg and k.tier = 'KAN:'
join formant f on m.signal_id = f.signal_id and
  f.time between (m.begin_seg + (m.dur_seg * 0.2)) and
    (m.begin_seg + (m.dur_seg * 0.8))
join speaker spk on ses.session = spk.speaker_code and
  ses.project = spk.project and ses.project = 'VOYS'
where m.label = 'E'
group by m.label, k.label, o.label
order by m.label, k.label, o.label;

Figure 3: Sample SQL query to retrieve phoneme segments, their count, and the average F1 and F2 values, grouped by phoneme segment label, canonic label, and word

select phoneme, avg(f1)::int, avg(f2)::int
from voys_data
where phoneme = 'E'
group by phoneme

Figure 4: Using a view (a predefined virtual table) to express the same query as in Figure 3
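The view pattern of Figure 4 can be illustrated with a minimal, self-contained sketch. This uses an in-memory SQLite database rather than the PostgreSQL installation described here, and the table `voys_raw` with columns `phoneme`, `f1`, `f2` is an illustrative stand-in for the real corpus schema, not the actual one:

```python
import sqlite3

# Minimal sketch of the view pattern from Figures 3 and 4, using an
# in-memory SQLite database. Table and column names are illustrative
# stand-ins for the real corpus schema.
conn = sqlite3.connect(":memory:")
conn.execute("create table voys_raw (phoneme text, f1 real, f2 real)")
conn.executemany(
    "insert into voys_raw values (?, ?, ?)",
    [("E", 610.0, 1900.0), ("E", 590.0, 2100.0), ("i", 310.0, 2300.0)],
)

# The view hides the join and selection logic, so analysis queries
# stay as short as the one in Figure 4.
conn.execute(
    "create view voys_data as select phoneme, f1, f2 from voys_raw"
)

row = conn.execute(
    "select phoneme, cast(avg(f1) as int), cast(avg(f2) as int) "
    "from voys_data where phoneme = 'E' group by phoneme"
).fetchone()
print(row)  # ('E', 600, 2000)
```

In the real database the view body would contain the full multi-way join of Figure 3; the point is that this complexity is defined once and reused by every analysis query.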

administrator). The researcher can now access the database directly from the spreadsheet or statistics software he or she is using, perform the statistical analyses, and display the results in text format or diagrams (Figure 5).
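The fetch-and-analyse step can be sketched as follows. This again uses an in-memory SQLite database as a stand-in; in the setup described above the connection would go to the corpus database (e.g. via ODBC or a PostgreSQL driver), and the `formant` table here is a simplified assumption:

```python
import sqlite3
import statistics

# Sketch of the analysis workflow: pull segment measurements straight
# from the database into an analysis script and summarise them. The
# table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("create table formant (phoneme text, f1 real)")
conn.executemany(
    "insert into formant values (?, ?)",
    [("E", 580.0), ("E", 620.0), ("E", 600.0)],
)

f1_values = [r[0] for r in conn.execute(
    "select f1 from formant where phoneme = 'E'"
)]

# The same list could now be handed to a plotting or statistics package.
print(len(f1_values), statistics.mean(f1_values))  # 3 600.0
```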

[Four formant chart panels: F1 (200–1000 Hz) plotted against F2 (1000–2500 Hz) for the vowels i, I, e, E, }, a, V, o, O; panels labelled "all cities (F)" for female speakers and "all cities (M)" for male speakers]

Figure 5: Formant charts for Scottish English vowels from the VOYS speech database

Using this workflow and the global corpus data model, similar analyses, or analyses of other corpora, may now be performed with little effort, and the approach can thus also be used in education.

4. Case study 2: A perception experiment on regional variants of sounds

Perception experiments are an essential part of phonetic, linguistic and psychological research. Most such experiments are now performed using a computer, and some speech processing tools, e.g. Praat, directly support running perception experiments. However, with standalone software, performing an experiment requires that the software be installed on every computer on which the experiment will run. Web-based online experiments overcome this limitation and provide access to potentially large groups of participants (Reips, 2002).

Currently, only a few tools or services exist that allow online experiments with audio. Examples are WebExp (Keller et al., 2009) and Percy (Draxler, 2011). WebExp is a flexible and powerful online experiment software package that uses Java applets for media display and stores its result data in XML files. Percy is based on HTML5 and stores its data in a relational database on the server.

4.1. Perception experiments in the workflow

In principle, a perception experiment is not very different from an annotation task: speech material is presented and the participant has to enter his or her judgment. Hence, the global corpus data model also covers online perception experiments. Experiment results are simply considered as yet another annotation tier. This allows the same data retrieval mechanism to be used for experiment data, which greatly simplifies further processing.
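The results-as-annotation-tier idea can be sketched in a few lines. The simplified `segment` table and the tier name 'PRC:' for perception results are illustrative assumptions, not the actual corpus schema; the point is only that judgments live in the same table as the other tiers:

```python
import sqlite3

# Sketch: perception judgments stored as just another annotation tier,
# in an in-memory SQLite stand-in for the corpus database. The tier
# name 'PRC:' and the reduced segment table are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("create table segment (signal_id int, tier text, label text)")

# Existing annotation tiers for one signal ...
conn.executemany(
    "insert into segment values (?, ?, ?)",
    [(1, "ORT:", "sieben"), (1, "MAU:", "z")],
)

# ... and a participant's judgment stored the same way:
conn.execute("insert into segment values (1, 'PRC:', 'voiced')")

# The regular tier-based retrieval now also reaches experiment results.
rows = conn.execute(
    "select label from segment where signal_id = 1 and tier = 'PRC:'"
).fetchall()
print(rows)  # [('voiced',)]
```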

4.2. Running the experiment

A recent perception experiment on regional variation uses the single-digit items recorded in the Ph@ttSessionz project. Participants were asked to judge whether a given phoneme in a digit has certain properties, e.g. whether the initial "s" in the word "sieben" was voiced or voiceless. The question was formulated in colloquial terms so that non-experts could participate in the experiment.

The experiment consists of three steps: the participant registers and provides some personal and context information; during the experiment, he or she listens to the recorded audio and enters a judgment; and when all items are done, the experiment displays a map with the geographic locations where the audio files were recorded (Figure 6).

4.3. Statistical analysis

With support from the statistics lab of LMU University, the experiment input data was analysed using mixed models provided by the R software. The results show that a) sounds effectively differ from one region to another, and b) the perception of sound differences depends on the regional background of the listener.

5. Discussion

A global corpus model is a suitable tool to support the workflow in phonetic and linguistic research, development, and education.

However, serious problems remain. The most important are:

1. Missing data in the database may lead to broken workflows or inconsistent or unexpected results.

2. New tools and services may not fit into the global corpus model.
