19.11.2012 Views

Best Practices for Speech Corpora in Linguistic Research Workshop ...


Turns in interdisciplinary scientific research meetings:
Using ‘R’ for a simple and flexible tagging system

Seongsook Choi, Keith Richards
Centre for Applied Linguistics
University of Warwick
S.Choi@warwick.ac.uk, K.Richards@warwick.ac.uk

Abstract

This paper presents our initial step towards identifying and mapping functions (of utterances/turns) and actions (a series of connected actions managed over the course of a sequence of turns) inherent in authentic spoken language data using a simple and flexible tagging system in R. Our ultimate goal is to capture the patterns of dynamic practices through which interactants produce and understand talk-in-interaction both qualitatively and quantitatively. The procedure involves annotating the transcripts with tags that blend elements of CA (conversation analysis) and DA (discourse analysis), which we can then analyse quantitatively. The paper addresses the challenge of developing a tagging system that integrates CA and DA and of annotating data with it, and demonstrates a graphical representation of the quantitative analysis that can be derived from it.

Keywords: conversation analysis, discourse analysis, R, turn-taking

1. Introduction

This paper presents our initial attempt at mapping functions (of utterances/turns (Crookes, 1990)) and actions (a series of connected actions managed over the course of a sequence of turns) inherent in authentic spoken language data using a simple and flexible tagging system in R (R Development Core Team, 2011). Our ultimate goal is to capture the patterns of dynamic practices through which interactants produce and understand talk-in-interaction both qualitatively and quantitatively. We carry this out by annotating the transcripts with tags that blend elements of CA (conversation analysis) and DA (discourse analysis), which we can then analyse quantitatively.

While DA allows the use of a priori categories as analytical resources, this is explicitly rejected in CA methodology, which has led to the assumption that the two are irreconcilable, though CA findings have been used as the basis for subsequent quantitative analysis (e.g. Mangione-Smith, Stivers, Elliott, McDonald and Heritage, 2003). We seek a closer integration of the two, using a CA analysis of sequences of talk to reveal aspects of participant design which would remain hidden in a DA approach, then identifying discourse patterns within these which can be mapped, using DA, across large data sets in order to reveal ways in which relevant features of talk play out in different interactional contexts. This paper focuses on issues of representation in the quantitative dimension of our work, specifically on turn-taking.

2. The challenge of CA/DA synthesis

The overarching aim of this project is to combine the analytic penetration of CA with the potential applications of DA to large databases of spoken interaction. In this section we identify the challenges which this presents and the ways in which we intend to address these.

Although DA embraces a much wider analytical spectrum than CA, their very different conceptual foundations make procedural synthesis inherently problematic. While ethnography and CA have been widely accepted as complementary approaches (e.g. Miller and Fox, 2004), as have ethnography and DA (e.g. Sarangi and Roberts, 2005) and, more broadly, ethnography and linguistics (Rampton, Tusting, Maybin, Barwell, Creese and Lytra, 2004), the differences between CA and DA, at least in the form that we draw on in this project, were exposed in debates that began over a quarter of a century ago (e.g. Levinson, 1983; Schegloff, 2005; Van Rees, 1992). Since then each has followed its own lines of development, CA developing in applied areas and connecting with Membership Categorisation Analysis, DA engaging with contextual aspects to the point where researchers would now claim that ‘at its heart DA remains an ethnographically grounded study of language in action’ (Sarangi and Roberts, 2005, p. 639). To our knowledge, no attempt has been made to bring them together as part of the same analytic enterprise, and our attempt to do so draws on forms of DA which are susceptible to coding and quantitative analysis.

The reason for this is clear from the following comment on quantitative studies from a CA perspective (Heritage, 1995, p. 406):

Quantitative studies have not, so far, matched the kinds of compelling evidence for the features and uses of conversation practices that have emerged from ‘case by case’ analysis of singular exhibits of interactional conduct. It does not, at the present time, appear likely that they will do so in the future. For quantitative studies inexorably draw the analyst into an ‘external’ view of the data of interaction, draining away the conduct-evidenced local intelligibility of particular situated actions which is the ultimate source of security that the object under investigation is not a theoretical or statistical artefact.
