Automatic Mapping Clinical Notes to Medical - RMIT University
our results. In classifying only the literal meaning of utterances, we have focused on a simpler task than classifying the in-context meaning of utterances, which some systems attempt.

Our results do, however, compare favourably with previous dialogue act classification work, and clearly validate our hypothesis that VRM annotation can be learned. Previous dialogue act classification results include those of Webb et al. (2005), who reported peak accuracy of around 71% with a variety of n-gram, word position and utterance length information on the SWITCHBOARD corpus using the 42-act DAMSL-SWBD taxonomy. Earlier work by Stolcke et al. (2000) obtained similar results using a more sophisticated combination of hidden Markov models and n-gram language models with the same taxonomy on the same corpus. Reithinger and Klesen (1997) report a tagging accuracy of 74.7% for a set of 18 dialogue acts over the much larger VERBMOBIL corpus (more than 223,000 utterances, compared with only 1,368 utterances in our own corpus).
VRM   Instances   Precision   Recall
D     395         0.905       0.848
E     391         0.808       0.872
A     73          0.701       0.644
C     21          0.533       0.762
Q     218         0.839       0.885
K     97          0.740       0.763
I     64          0.537       0.453
R     109         0.589       0.514

Table 7: Precision and recall for each VRM
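Per-class figures of this kind are computed directly from a confusion matrix. A minimal sketch (the counts below are hypothetical illustrations, not the paper's actual confusion data):

```python
# Per-class precision and recall from a confusion matrix.
# confusion[gold][predicted] = count of utterances; counts are hypothetical.

def precision_recall(confusion, label):
    tp = confusion[label][label]
    # Precision: fraction of predictions of `label` that were correct.
    predicted = sum(row.get(label, 0) for row in confusion.values())
    # Recall: fraction of gold `label` utterances that were found.
    gold = sum(confusion[label].values())
    precision = tp / predicted if predicted else 0.0
    recall = tp / gold if gold else 0.0
    return precision, recall

confusion = {
    "I": {"I": 29, "R": 25, "D": 10},   # gold Interpretation
    "R": {"I": 30, "R": 56, "D": 23},   # gold Reflection
    "D": {"I": 0,  "R": 14, "D": 300},  # gold Disclosure
}
p, r = precision_recall(confusion, "I")
```

With these invented counts, precision for "I" is 29/59 and recall is 29/64, mirroring how a cluster of I/R transpositions depresses both figures for the two confusable classes.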
In performing an error analysis of our results, we see that classification accuracy for Interpretations and Reflections is lower than for other classes, as shown in Table 7. In particular, our confusion matrix shows a substantial number of transposed classifications between these two VRMs. Interestingly, Stiles notes that these two VRMs are very similar, differing only on one principle (frame of reference), and that they are often difficult to distinguish in practice. Additionally, some Reflections repeat all or part of the other speaker's utterance, or finish the other speaker's previous sentence. It is impossible for our current classifier to detect such phenomena, since it looks at utterances in isolation, not in the context of a larger discourse. We plan to address this in future work.
Our results also provide support for using both linguistic and statistical features in classifying VRMs. In the cases where our feature set consists of only the linguistic features identified by Stiles, our results are substantially worse. Similarly, when only n-gram, word position and utterance length features are used, classifier performance also suffers. Table 6 shows that our best results are obtained when both types of features are included.
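The combination of the two feature types can be sketched as a single feature map per utterance. This is an illustrative sketch only: the cue-word lists and feature names below are hypothetical stand-ins, not the paper's actual feature set.

```python
# Sketch: combine statistical features (unigrams, word position, length)
# with hand-crafted linguistic cues of the kind Stiles describes.
# Cue lists and feature names are hypothetical illustrations.

FIRST_PERSON = {"i", "my", "we"}    # e.g. a Disclosure cue
SECOND_PERSON = {"you", "your"}     # e.g. a Reflection/Advisement cue

def features(utterance):
    tokens = utterance.lower().split()
    feats = {}
    # Statistical features: unigrams and the position of the first word.
    for i, tok in enumerate(tokens):
        feats["unigram=" + tok] = 1
        if i == 0:
            feats["first_word=" + tok] = 1
    feats["length"] = len(tokens)
    # Linguistic features: surface cues tied to VRM principles.
    feats["has_first_person"] = int(any(t in FIRST_PERSON for t in tokens))
    feats["has_second_person"] = int(any(t in SECOND_PERSON for t in tokens))
    feats["is_question"] = int(utterance.rstrip().endswith("?"))
    return feats

fv = features("Do you mean the earlier appointment?")
```

Feature dictionaries of this shape can be fed to any standard vectorizer and classifier; the point is only that both feature families coexist in one vector, which is the configuration that performed best in Table 6.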
Finally, another clear trend in the performance of our classifier is that the VRMs for which we have more utterance data are classified substantially more accurately.
8 Conclusion

Supporting the hypothesis posed, our results suggest that classifying utterances using Verbal Response Modes is a plausible approach to computationally identifying literal meaning. This is a promising result that supports our intention to apply VRM classification as part of our longer-term aim to construct an application that exploits speech act connections across email messages.

While difficult to compare directly, our classification accuracy of 79.75% is clearly competitive with previous speech and dialogue act classification work. This is particularly encouraging considering that utterances are currently being classified in isolation, without any regard for the discourse context in which they occur.

In future work we plan to apply our classifier to email, exploiting features of email messages such as header information in the process. We also plan to incorporate discourse context features into our classification and to explore the classification of pragmatic utterance meanings.
References

Jan Alexandersson, Bianka Buschbeck-Wolf, Tsutomu Fujinami, Michael Kipp, Stephan Koch, Elisabeth Maier, Norbert Reithinger, Birte Schmitz, and Melanie Siegel. 1998. Dialogue acts in VERBMOBIL-2. Technical Report Verbmobil-Report 226, DFKI Saarbrücken, Universität Stuttgart, Technische Universität Berlin, Universität des Saarlandes. 2nd edition.

Victoria Bellotti, Nicolas Ducheneaut, Mark Howard, and Ian Smith. 2003. Taking email to task: The design and evaluation of a task management centred email tool. In Computer Human Interaction Conference, CHI, Ft Lauderdale, Florida, USA, April 5-10.