Proceedings Fonetik 2009 - Institutionen för lingvistik

More documents

Recommendations

Info

Proceedings, FONETIK 2009, Dept. of Linguistics, Stockholm Universityfinal year students at a high school in a smalltown in Northern Scania. The experiment conductorvisited the classes twice. The first time,the participants listened to the 2 minute dialogue(the original event). The only instructionthey got was that they had to listen to the dialogue.Nothing was said about that they shouldfocus on the voices or the linguistic content.The second time, one week later, they listenedto the six male voices (the foils), in randomizedorder for each listener group. Each voice wasplayed twice. The target voice was absent in thetest. The participants were told it was six malevoices in the lineup. This was also obviouswhen looking at the answer sheets. There weresix different listener groups for the 100 participantspresented in this paper. The number ofparticipants in each group differed betweenseven and 27 people.For each voice, the participants had to makea decision if the voice was the same as the onewho talked mostly in the dialogue last week.There were two choices on the answer sheet foreach voice; ‘I do recognize the voice’ or ‘I donot recognize the voice’. They were not told ifthe target voice was absent or not, nor werethey initially told if a voice would be playedmore than once. There was no training session.The participants were told that they could listento each voice twice before answering, but notrecommended to do that.Directly after their judgment of whether aspecific voice was the target or not, the participantsalso had to estimate the confidence intheir answer. The confidence judgment wasmade on scale ranging from 0% (explained as”absolutely sure this voice sample is not thetarget”) via 50% (“guessing”) to 100% (”absolutelysure this voice sample is the target”).Results and DiscussionSince this is an ongoing project the results presentedhere are only results from the first 100participants. We expect 300 participants at theend of this study.Figure 1 shows the number of the ‘I do recognizethe voice’ answers, or ‘yes’ answers.Since the voice lineups were target-absent lineups,a “yes” answer equals an incorrect answer,that is, an incorrect identification.When looking at the answers focusing onthe presentation order, which, as noted above,was randomized for each group, it is obviousthat there is a tendency not to choose the firstvoice. There were no training sessions and thatmight have had an influence. Each voice wasplayed twice, which means that the participantshad a reasonable amount of time to listen to it.The voices they heard in the middle of the testas well as the last played voice were chosenmost often.Only 10 participants had no ‘yes’ answersat all, which is 10% of all listeners, that is,these participants had all answers correct. Theresult for confidence for these 10 participantsshowed that their average confidence level was76 %, which can be compared with the averageconfidence level for the remaining 90 participants,69 %. No listener had more than 4 ‘yes’answers, which means that no one answered‘yes’ all over without exception.Number50454035302520151050Voices in presentation ordervoice 1 voice 2 voice 3 voice 4 voice 5 voice 6Figure 1. Numbers of ‘yes’ answers (e.g., errors)focusing on the presentation order.In Figure 2 the results are shown focusingon each foil 1-6. Most of the participants selectedfoil 4 as the target speaker. The same resultsare shown in Table 2. Foil 4 is closest tothe target speaker and they share more than onefeature in voice and speech. The speaking stylewith the forced, almost stuttering voice andhesitation sounds is obvious and might remindthe listeners of the target speaker when planningthe burglary. Foil 3 and 6 were chosenleast often, and these foils are different fromthe target speaker both concerning pitch, articulationrate and overall speaking style. It is notsurprising that there were almost no differencein results between foil 1 and 2. These malespeakers have very similar voices and speech.They are also quite close to the target speakerin the auditory analysis (i.e. according to anexpert holistic judgment). Foil 5 received many‘yes’ answers as well. He reminds of the targetspeaker concerning articulation rate, but not asobviously as foil 4. He also has a higher pitchand a slightly different dialect compared to thetarget speaker.182
Proceedings, FONETIK 2009, Dept. of Linguistics, Stockholm UniversityThe results indicate that the participantswere confused about foil 4 and this is an expectedresult. The confusion might be explainedboth in the auditory and the acousticanalyses. The overall speaking style, the articulationrate as well as the pitch, of the targetspeaker and foil 4 are striking.The results show that the mean accuracy ofall the foils was 66.17 with a standard deviation(SD) of 47.35. The results for each of the foilsare shown in Table 2. These results were analyzedwith a one-way ANOVA and the outcomeshows that the difference in how often thesix foils were (correctly) rejected was significant,F (5, 594) = 12.69, p < .000. Further posthoc Tukey test revealedNumber6050403020100Foils 1 - 6foil 1 foil 2 foil 3 foil 4 foil 5 foil 6Figure 2. Numbers of ‘yes’ answers (e.g., errors)focusing on the foils 1-6.that participant rejected the foil 3 (M=85) andfoil 6 (M=87), significantly more often as comparedwith the other foils.We next look at the results for the confidencejudgments and their realism. When analyzingthe confidence values we first reversedthe confidence scale for all participants whogave a “no” answer to the identification question.This means that a participant who answered“no” to the identification question andthen gave “0 %” (“absolutely sure that thisvoice sample is not the target”) received a100% score in confidence when the scale wasreversed. Similarly, a participant who gave 10as a confidence value received 90 and a participantwho gave 70 received a confidence valueof 30 after the transformation, etc. 50 in confidenceremained 50 after the transformation. Inthis way the meaning of the confidence ratingscould be interpreted in the same way for allparticipants, irrespective of their answers to theidentification question.The mean confidence for all the foils was69.93 (SD = 25.00). Table 2 shows that therewas no great difference in the level of the confidencejudgments for the given identificationanswers for the respective foils. A one-wayANOVA showed no significant difference betweenthe foils with respect to their confidence(F = .313).Turning next to the over/underconfidencemeasure (O/U-confidence), the average O/Uconfidencecomputed over all the foils was 3.77(SD = 52.62), that is, a modest level of overconfidence.Table 2 shows the means and SDsfor O/U-confidence for each foil. It can benoted that the participants showed quite goodrealism with respect to their level ofover/underconfidence for item 5 and an especiallyhigh level of overconfidence for item 4.Moreover, the participants’ showed underconfidencefor items 3 and 6, that is, the sameitems showing the highest level of correctness.A one-way ANOVA showed that there wasa significant difference between the foils withrespect to the participants’ O/U-confidence, F(5, 394) = 9.47, p < .000. Further post hocTukey tests revealed that the confidence of theparticipants who rejected foil 3 and foil 6showed significantly lower over-/underconfidence as compared to the confidenceof participants for foil 1, foil 2 and foil 4.Table 2. Means (SDs) for accuracy (correctness),confidence, over-/underconfidence andslope for the six foilsFoil 1 Foil 2 Foil 3 Foil 4 Foil 5 Foil 6Accuracy57.00(49.76)57.00(49.76)85.00(35.89)48.00(50.21)63.00(48.52)87.00(33.79)Confidence69.90(22.18)70.00(24.82)70.20(28.92)70.40(21.36)67.49(23.81)71.70(28.85)Over- 12.9% 13.0% -14.8% 22.4% 4.4% -15.3%/underconfSlope -5.07 -2.04 18.27 0.43 1.88 12.57Table 2 also shows the means for the slopemeasure (ability to separate correct from incorrectanswers to the identification question bymeans of the confidence judgments) for the 6foils. The overall slope for all data was 2.21.That is, the participants on average showed avery poor ability to separate correct from incorrectanswers by means of their confidence judgments.However, it is of great interest to notethat the only items for which the participantsshowed a clear ability to separate correct fromincorrect answers was for the two foils (3 and183
Page 1 and 2:
Department of LinguisticsProceeding
Page 3 and 4:
Proceedings, FONETIK 2009, Dept. of
Page 5 and 6:
Page 7 and 8:
Page 9 and 10:
Page 11 and 12:
Page 13 and 14:
Page 15 and 16:
Page 17 and 18:
Page 19 and 20:
Page 21 and 22:
Page 23 and 24:
Page 25 and 26:
Page 27 and 28:
Page 29 and 30:
Page 31 and 32:
Page 33 and 34:
Page 35 and 36:
Page 37 and 38:
Page 39 and 40:
Page 41 and 42:
Proceedings, FOETIK 2009, Dept. of
Page 43 and 44:
Page 45 and 46:
Page 47 and 48:
Page 49 and 50:
Page 51 and 52:
Page 53 and 54:
Page 55 and 56:
Page 57 and 58:
Page 59 and 60:
Page 61 and 62:
Page 63 and 64:
Page 65 and 66:
Page 67 and 68:
Page 69 and 70:
Page 71 and 72:
Page 73 and 74:
Page 75 and 76:
Page 77 and 78:
Page 79 and 80:
Page 81 and 82:
Page 83 and 84:
Page 85 and 86:
Page 87 and 88:
Page 89 and 90:
Page 91 and 92:
Page 93 and 94:
Page 95 and 96:
Page 97 and 98:
Page 99 and 100:
Page 101 and 102:
Page 103 and 104:
Page 105 and 106:
Page 107 and 108:
Page 109 and 110:
Page 111 and 112:
Page 113 and 114:
Page 115 and 116:
Page 117 and 118:
Page 119 and 120:
Page 121 and 122:
Page 123 and 124:
Page 125 and 126:
Page 127 and 128:
Page 129 and 130:
Page 131 and 132: Proceedings, FOETIK 2009, Dept. of
Page 137 and 138: Proceedings, FONETIK 2009, Dept. of
Page 181: Proceedings, FONETIK 2009, Dept. of
Page 227: Department of LinguisticsPhonetics
show all

Proceedings Fonetik 2009 - Institutionen för lingvistik

Create successful ePaper yourself

Delete template?

Save as template?