19.11.2012 Views

Best Practices for Speech Corpora in Linguistic Research Workshop ...

6. Data analysis

Each turn is annotated with one of the following types of tags: agr(eement), expl(anation), sugg(estion), sum(mary), q(uestion), ch(ecking), disagr(eement), recap(itulation), ups(hot), com(ment). This is not an exhaustive list of the tags that we are developing, but here we present our analysis using these tags only.

Figure 2: Speaker interaction

This figure, generated with Cytoscape (Smoot, Ono, Ruscheinski, Wang and Ideker, 2011), presents a graphic summary of the interactions between speakers. The thickness of the edges conveys the number of occurrences of the pair (speaker 1, speaker 2). This representation can be understood as a probabilistic model of the meeting: starting from one speaker, the next speaker is chosen according to the observed frequency of turn-taking from that speaker. The figure reveals immediately that the meeting is dominated by four persons, Anne, Carl, Kate and Mary, but it is also interesting to note that exchanges between all the participants are symmetrical. Had fewer of Kate’s turns, for example, followed those of Mary, the arrowed edge from Mary to Kate would have been thinner than that from Kate to Mary. Representations of this sort allow the analyst to identify features of potential interest in the talk, some clear (e.g. whether participants from a particular discipline are dominating the talk or the meeting shows evidence of cross-disciplinary exchanges), others suggestive (e.g. in the case of unequal edges, asymmetries in the relevant relationship).
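The probabilistic reading of the speaker network can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the sample turn sequence are hypothetical, and the decision to skip consecutive turns by the same speaker is an assumption made here for illustration. Edge weights are counts of (speaker 1, speaker 2) pairs, and each speaker's outgoing counts are normalised into next-speaker probabilities.

```python
from collections import Counter, defaultdict


def transition_counts(speakers):
    """Count how often each speaker is immediately followed by another speaker."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(speakers, speakers[1:]):
        # Assumption: two consecutive turns by the same speaker are not a
        # change of speaker, so they do not contribute an edge.
        if prev != nxt:
            counts[prev][nxt] += 1
    return counts


def transition_probabilities(counts):
    """Normalise each speaker's outgoing counts into observed frequencies."""
    probs = {}
    for prev, followers in counts.items():
        total = sum(followers.values())
        probs[prev] = {nxt: n / total for nxt, n in followers.items()}
    return probs


# Hypothetical turn sequence, for illustration only (not data from the corpus).
turns = ["Anne", "Carl", "Anne", "Kate", "Mary", "Kate", "Anne", "Carl"]
counts = transition_counts(turns)
probs = transition_probabilities(counts)
```

In this sketch, `counts[a][b]` plays the role of the edge thickness from `a` to `b`, and comparing `counts[a][b]` with `counts[b][a]` gives a simple check of the symmetry discussed above.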

It is also possible to complement this speaker network by developing a representation that captures the nature of the responses made to the prior turn. Figure 3 below provides information about the nature of the speaker’s utterance, given by its tag. For each edge ‘speaker 1 to speaker 2’ in the network, a corresponding bar in the figure shows the tags that were used by speaker 2 in that turn-taking occurrence. For example, we know from the network (Figure 2) that Carl’s turns frequently follow those of Anne. The wheel below then allows us to see at a glance that in those cases roughly 10% of the responses were agreements and a further 30% were comments, while disagreement was very rare.
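A per-edge tag distribution of this kind could be computed along the following lines. This is a sketch under stated assumptions, not the procedure used to produce the figure: the input format (an ordered list of `(speaker, tag)` pairs with one tag per turn), the function name `tag_distribution` and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict


def tag_distribution(turns):
    """Return {(speaker 1, speaker 2): {tag: share}} for consecutive turns.

    `turns` is an ordered list of (speaker, tag) pairs. The tag counted for
    an edge is the one attached to the responding turn (speaker 2).
    """
    counts = defaultdict(Counter)
    for (prev_speaker, _), (speaker, tag) in zip(turns, turns[1:]):
        counts[(prev_speaker, speaker)][tag] += 1
    return {
        pair: {tag: n / sum(tags.values()) for tag, n in tags.items()}
        for pair, tags in counts.items()
    }


# Hypothetical annotated turns, for illustration only.
annotated = [
    ("Anne", "q"), ("Carl", "com"), ("Anne", "expl"), ("Carl", "agr"),
    ("Anne", "sugg"), ("Carl", "com"), ("Kate", "q"),
]
dist = tag_distribution(annotated)
```

Here `dist[("Anne", "Carl")]` holds the share of each tag among Carl's turns that follow a turn by Anne, which corresponds to one bar of the figure.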

Figure 3: Speaker-Tag distribution<br />

Having a consistent and instantly accessible way of representing the interactions between participants will enable us to compare different meetings and see, for example, whether any pattern emerges with respect to status, gender or discipline. A probabilistic approach will allow us to systematise and quantify those comparisons.

For a more general picture of relationships between pragmatic functions, it is possible to generate a graph based on counting the annotations for each turn in order to visualise, for example, the frequency with which a comment follows a question. This is shown in matrix form in Figure 4.
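The counting step behind such a matrix might look like the sketch below. The tag abbreviations are those listed at the start of this section; everything else is an assumption made for illustration, including the function name, the sample sequence and the orientation of the matrix (rows for the tag of the prior turn, columns for the tag of the following turn), which may differ from the layout of the figure.

```python
# Tag abbreviations as listed in this section.
TAGS = ["agr", "expl", "sugg", "sum", "q", "ch", "disagr", "recap", "ups", "com"]


def tag_transition_matrix(tags, labels=TAGS):
    """Count how often each tag is immediately followed by each other tag.

    Rows index the tag of the prior turn, columns the tag of the next turn
    (an assumed orientation).
    """
    index = {tag: i for i, tag in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for prev, nxt in zip(tags, tags[1:]):
        matrix[index[prev]][index[nxt]] += 1
    return matrix


# Hypothetical sequence of turn tags, for illustration only.
sequence = ["q", "com", "agr", "q", "com", "expl"]
matrix = tag_transition_matrix(sequence)

# Number of times a comment follows a question in the sample sequence.
q_then_com = matrix[TAGS.index("q")][TAGS.index("com")]
```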

Figure 4: Tag pattern<br />

Here we can see that a comment is most often followed by
