Vocabulary use in the FCE Listening test - Cambridge English Exams


4 | CAMBRIDGE ESOL : RESEARCH NOTES : ISSUE 32 / MAY 2008

Figure 1: Extract from Costa Rica

Right now there’s a big problem with deforestation in Costa Rica and one of the things that we need to do is to provide education and we have a great opportunity here. We’ve got an education programme in place where we will bring students in, free of charge and tell them about er the canopy and why it should be saved…

Figure 2: Extract from Cable Car

… need to do to stop that is to provide education. We’ve got a programme in place where we will bring students in from all over the world and tell them about the forest and they can see for themselves why it should be saved.

Creating word frequency lists

Wordlists were made by running the two corpora through WordSmith Tools (Stubbs 1996). The resulting exam texts wordlist and radio texts wordlist were compared to published British National Corpus (BNC) lists for spoken and written language (Leech, Rayson, Wilson 2001).

The wordlists were then used to make key word lists in WordSmith Tools, using the larger composite radio text as a reference text. The KeyWord tool finds words which are significantly more frequent in one text than another. If a word is unusually infrequent in the smaller corpus (the exam texts here), it is said to be a negative key word and will appear at the end of the list.

Concordances were then run on selected words, so that their usage in both sets of texts could be studied in more detail.

Results: lexical density

As can be seen from Figure 3, all texts ranged from 30% to 44% lexical density. This is lower at the bottom and top of the range than Stubbs’ (1996) finding for spoken texts (34% to 58%). At the upper end, this difference can be accounted for, as most of the texts studied here are dialogues and would not be expected to have particularly high lexical densities.
The presence of results which are lower than 34% could, however, suggest that the method for calculating lexical density used in this study created different results from Stubbs’ method.

None of the texts analysed, whether exam or radio texts, have a lexical density greater than 44%, even though some of them are monologues where there is no feedback. This would suggest that all these radio texts are dialogic in some way, with speakers regarding the listeners as involved in the interaction to some extent, even though there is no option for actual feedback.

It is hard to see a particular pattern when comparing the exam texts to the radio texts; some exam texts have higher lexical density than the corresponding radio texts (five texts) and some radio texts have higher lexical density than exam texts (four texts). Overall though, the exam texts have a slightly higher lexical density. The average is 37.5% as opposed to 36.8% for the radio texts.

An independent t-test for significance was carried out using SPSS©. There was no significant difference found between the conditions (t=.443, df=16, p=.663, two-tailed). This shows that the difference between the mean lexical densities of the exam texts and the radio texts is not significant at the 95% confidence level. That is to say, it is reasonable to assume that the differences in mean are attributable to chance.

What is noticeable, however, is the range of densities in the texts. The difference between the highest and lowest densities on the radio texts is 13.5%.
On the exam texts this difference is only 6.5%, so it seems there is a tendency for the radio texts to have more variation in lexical density and the exam texts to conform to an average density.

Results: word frequency

Table 2 shows the top 50 words in radio texts, exam texts and, for comparison, the BNC spoken corpus and BNC written corpus.

[Figure 3: Lexical density of radio and exam texts. Vertical axis: percentage lexical density (20.00 to 45.00). Horizontal axis: texts (Urban Wildlife, Janet Ellis, Costa Rica, James Dyson, Parakeets, Patricia Routledge, Victoria Beckham, Bones, Lara Hart). Series: exam text, radio text.]

©UCLES 2008 – The contents of this publication may not be reproduced without the written permission of the copyright holder.
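The lexical density comparison reported above can be sketched roughly as follows. This is a minimal illustration, not the study's procedure: the article does not say how lexical words were identified, so the `FUNCTION_WORDS` set, the `tokenize` helper and the function names are all hypothetical. The t statistic shown is the standard pooled-variance form of an independent t-test, which gives df = 16 for two groups of nine texts; the p value is not computed here.

```python
import re
from statistics import mean, variance

# Hypothetical, deliberately tiny function-word list. The study's actual
# criteria for separating lexical from function words are not given.
FUNCTION_WORDS = {
    "the", "and", "a", "i", "you", "to", "it", "of", "that", "was",
    "in", "but", "we", "is", "on", "they", "for", "she", "have", "so",
    "at", "with", "what", "as", "do", "this", "there", "all", "he",
    "about", "be", "not", "had", "her", "when", "my", "because", "if",
    "er", "erm", "yes", "well",
}


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def lexical_density(text):
    """Percentage of tokens that are not in FUNCTION_WORDS."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    lexical = [t for t in tokens if t not in FUNCTION_WORDS]
    return 100.0 * len(lexical) / len(tokens)


def independent_t(sample_a, sample_b):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / df
    t = (mean(sample_a) - mean(sample_b)) / (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    return t, df
```

In use, `lexical_density` would be applied to each transcript and the two resulting lists of percentages (exam texts and radio texts) passed to `independent_t`. A different function-word list will shift every density figure, which is one way two studies can report different ranges for similar texts.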
Table 2: Top 50 words in radio texts, exam texts, BNC spoken corpus and BNC written corpus

Rank  Radio texts  Exam texts   BNC spoken   BNC written
1     the          the          the          the
2     and          and          I            of
3     a            a            you          and
4     I            I            and          a
5     you          to           it           in
6     to           of           a            to
7     it           in           ’s           is
8     of           it           to           to
9     that         that         of           was
10    ’s           you          that         it
11    was          was          n’t          for
12    in           ’t           in           that
13    ’t           we           we           with
14    but          but          is           he
15    we           so           do           be
16    is           what         they         on
17    on           is           er           I
18    they         ’s           was          by
19    for          my           yeah         ’s
20    she          for          have         at
21    have         they         what         you
22    so           about        he           are
23    very         on           that         had
24    erm          as           to           his
25    at           have         but          not
26    know         at           for          this
27    well         when         erm          have
28    with         all          be           but
29    what         me           on           from
30    think        do           this         which
31    as           like         know         she
32    do           people       well         they
33    this         be           so           or
34    there        from         oh           an
35    all          really       got          were
36    he           this         ’ve          as
37    about        well         not          we
38    er           know         are          their
39    be           there        if           been
40    not          an           with         has
41    had          one          no           that
42    ’ve          with         ’re          will
43    really       had          she          would
44    her          if           at           her
45    when         don’         there        there
46    my           think        think        n’t
47    because      did          yes          all
48    yes          very         just         can
49    been         are          all          if
50    if           can          can          who

If we take a closer look at the top ten items, we can see that all of the top ten words in all four corpora are function words. The same top ten words appear in the radio texts as in the BNC spoken corpus, although in a different order. These results indicate that the corpus of radio texts may be as representative of spoken English as the BNC, and suggest that although we may consider radio programmes to be a specialised genre, they are not too narrowly defined or restricted in language use.

The top four items in the exam texts list are the same and in the same order as the radio texts list; overall nine of the top ten words are the same in both lists.
The exceptions are in, which is at position 7 in the exam texts list and 12 in the radio list, and ’s, which is at position 10 in the radio list and 18 in the exam texts list.

Lexical items

There are two lexical words in the top fifty of the BNC spoken corpus, know and think, whereas there are no lexical words in the top fifty of the BNC written corpus. There are four lexical words in the top fifty in the radio texts: very, know, think and really. The fact that there are more lexical words here than in the BNC top fifty can be accounted for by the smaller corpus size. Two of these, think and know, are words which are used within discourse markers: I think, you know. They are also used to vocalise mental processes, which was another feature of spoken language that Chafe (1982) noted. Really and very are words which have some overlap in meaning so it is interesting that they both appear high up on the radio texts wordlist.

All four of the lexical words in the top fifty in the radio texts also occur in the top fifty in the exam texts although the order is a little different. There are two other items which occur in the top fifty exam texts but not in the top fifty radio texts: people and like.

Filled pauses, interjections and discourse markers

These do not appear in the BNC written corpus as they are a purely spoken phenomenon. Accordingly, er, yeah, erm, well, so, oh, no and yes appeared in the BNC spoken corpus list. In the radio texts corpus yes, well, really, so, erm and er occurred. It is not possible to say from the list alone whether so, well and really are used as discourse markers or what part of speech they are, which could be investigated with concordances.

It is interesting to note the absence of oh from the radio texts top fifty. Leech et al. (2001) find that

‘Most interjections (e.g. oh, ah, hello) are much more characteristic of everyday conversation than of more formal/public “task-oriented” speech.
However, the voiced hesitation fillers er and erm and the discourse markers mhm and um prove to be more characteristic of formal/public speech. We recognise er, erm and um as common thought pauses in careful public speech. Mhm is likely to be a type of feedback in formal dialogues both indicating understanding and inviting continuation. In conversation, people use yeah and yes much more, and overwhelmingly prefer the informal pronunciation yeah to yes. In formal speech, on the other hand, yes is slightly preferred to yeah.’

The absence of oh and the presence of er and erm in the radio texts suggest that they lie more in the area of formal or public speech than conversation. This is also backed up by the much greater use of yes than yeah in the radio texts corpus. In the exam texts corpus only so and well occurred, both of which also have uses other than as interjections or discourse markers. There are no non-lexical filled pauses in the exam texts top fifty list. This shows that the exam texts are missing this element of natural speech.

These results suggest that the radio texts corpus is to some extent composed of more formal speech than the BNC spoken corpus. There are indications that radio interviews are, as suspected, somewhere towards the literate end of the oral/literate continuum. However, they are still representative of spoken language and do not show similarities with written language. The exam texts seem to mirror the radio texts fairly well, although there is a noticeable absence of non-lexical filled pauses.
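The wordlist and key word steps described earlier can be approximated as in the sketch below. It is an assumption-laden illustration rather than a reproduction of WordSmith Tools: the article does not state which keyness statistic was used, so a signed log-likelihood score is swapped in here as one commonly used option, and the names `word_frequencies`, `log_likelihood` and `key_words` are hypothetical. Both corpora are assumed to be non-empty.

```python
import math
import re
from collections import Counter


def word_frequencies(text):
    """Count lower-cased word tokens in a text."""
    return Counter(re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower()))


def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Signed log-likelihood keyness score for one word.

    Positive: relatively more frequent in the study corpus.
    Negative: relatively less frequent (a negative key word).
    """
    total = size_study + size_ref
    combined = freq_study + freq_ref
    expected_study = size_study * combined / total
    expected_ref = size_ref * combined / total
    score = 0.0
    if freq_study:
        score += freq_study * math.log(freq_study / expected_study)
    if freq_ref:
        score += freq_ref * math.log(freq_ref / expected_ref)
    score *= 2
    # Flip the sign when the word is proportionally rarer in the study corpus.
    if freq_study / size_study < freq_ref / size_ref:
        score = -score
    return score


def key_words(study, reference):
    """Rank every word by keyness; negative key words end up last."""
    size_study = sum(study.values())
    size_ref = sum(reference.values())
    scores = {
        word: log_likelihood(study[word], size_study, reference[word], size_ref)
        for word in set(study) | set(reference)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling `word_frequencies(text).most_common(50)` gives a top 50 list of the kind shown in Table 2, and `key_words(exam_counts, radio_counts)` places words that are unusually infrequent in the exam texts, such as filled pauses, at the end of the ranking. A significance threshold would still need to be chosen before calling any word a key word.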
