Automatic Mapping Clinical Notes to Medical - RMIT University

More documents

Recommendations

Info

$journal of latex class files, vol. 6, no. 1, january ... - RMIT University$

Towards Cognitive Optimisation of a Search Engine Interface Kenneth Treharne School of Informatics and Engineering Flinders University of South Australia PO Box 2100 Adelaide, 5001 treh0003@flinders.edu.au Abstract Search engine interfaces come in a range of variations from the familiar text-based approach to the more experimental graphical systems. It is rare however that psychological or human factors research is undertaken to properly evaluate or optimize the systems, and to the extent this has been done the results have tended to contradict some of the assumptions that have driven search engine design. Our research is focussed on a model in which at least 100 hits are selected from a corpus of documents based on a set of query words and displayed graphically. Matrix manipulation techniques in the SVD/LSA family are used to identify significant dimensions and display documents according to a subset of these dimensions. The research questions we are investigating in this context relate to the computational methods (how to rescale the data), the linguistic information (how to characterize a document), and the visual attributes (which linguistic dimensions to display using which attributes). 1 Introduction Any search engine must make two kinds of fundamental decisions: how to use keywords/query words and what documents to show and how. Similarly every search engine user must decide what query words to use and then be able to interact with and vet the displayed documents. Again every web page author or designer makes decisions about what keywords, headings and link descriptions to use David M W Powers School of Informatics and Engineering Flinders University of South Australia PO Box 2100 Adelaide, 5001 David.Powers@flinders.edu.au Darius Pfitzner School of Informatics and Engineering Flinders University of South Australia PO Box 2100 Adelaide, 5001 pfit0022@flinders.edu.au to describe documents or sections of documents. This paper presents one set of experiments targeted at the choice of keywords as descriptive query words (nWords 1 ) and a second set of experiments targeted at explaining the effectiveness of the graphical options available to display and interact with the search results. 2 Term relevance and Cognitive Load There is considerable literature on visualisation techniques and a number of experimental and deployed search engines that offer a visualisation interface e.g. kartoo.com and clusty.com. However, there is little research to establish the effectiveness of such techniques or to evaluate or optimise the interface. This is surprising given the many theories and studies that target memory and processing limitations and information channel capacity, including many that build on and extend the empirical results summarized in George Miller’s famous Magical Number Seven paper (1956). It seems likely that for any search task visualization to realize optimal use of channel capacity, it should permit users to draw on their powerful and innate ability of pattern recognition. Another important aspect that has never been properly evaluated relates to the question of “which words do people use to describe a document?” Techniques like TFIDF are used in an attempt to automatically weight words according to how important they are in characterizing a document, but their cognitive relevance remains unexplored. 1 nWords is available at http://dweb.infoeng.flinders.edu.au Proceedings of the 2006 Australasian Language Technology Workshop (ALTW2006), pages 158–159. 158
Our work will enhance the document search process by improving the quality of the data the user filters while reducing machine processing overheads and the time required to complete a search task. It will also provide fundamental insight into the way humans summarise and compress information. The primary objective of this part of our research is to quantify the number of words a broad spectrum of people use to describe different blocks of text and hence the number of dimensions needed later to present this information visually. Our secondary objective is to enhance our understanding of the approaches taking by users during completion of search tasks. A third objective is to understand the choices a user makes in selecting keywords or phrases to describe or search for a document. 2.1 nWords Results Results for the nWords experiments explaining use of terms as descriptors or queries using a web based scenario are presented in Table 1 (with standard deviations). It is not only instructive to compare the results across task conditions (with/without access to the text, with/without restricting terms to occur in text), but the difference between the sub conditions where subjects were asked for keywords that described the document (D) versus search query terms (Q) they would use. Descriptor & Query Term Usage Task1 Task2 Task3 Number of D's Used 3.79 ± 3.34 4.02 ± 2.39 4.98 ± 3.35 Total Distinct Stems Used in Ds 8.22 ± 7.40 7.27 ± 4.71 11.80 ± 10.63 Average Distinct Stems per D 2.40 ± 1.83 2.14 ± 1.64 3.02 ± 3.46 Distinct D Stems in Top 10 TFIDF 1.79 ± 1.62 1.85 ± 1.52 2.62 ± 1.91 Total Distinct Q Stems Used in Qs 3.28 ± 1.78 3.81 ± 1.99 3.85 ± 1.89 Distinct Q Stems in Top 10 TFIDF 0.98 ± 0.95 1.19 ± 1.02 1.38 ± 0.99 Q Stem / D Stem Intersections 1.83 ± 1.58 2.52 ± 1.69 2.70 ± 1.59 Table 1: Statistics for nWords survey tasks (± standard deviation). Descriptors (D), query terms (Q). Task 1 Access to text, Task 2 No access, Task 3 Term must be in text. From the results of Table 1 we can see that TFIDF is poor at ranking keywords and query words. For the full data pool, only 2.11 ± 1.74 of the top ten TFIDF terms are used in description which best describes a text, whilst only 1.19 ± 0.99 are used in the query building task. 3 Graphical Representations Little research on perceptual discrimination of dynamic encodings can be found, but a few empirical studies have investigated the human 159 visual system’s ability to detect movement. Motion is a visual feature that is processed preattentively. Motion cues can be detected easily and quickly. Overall dynamic encodings outperform static encodings for attracting attention. 3.1 Preliminary Results Our Miller inspired experiments using web based applets varied properties of icons either statically or dynamically as show in figure 1. In general our preliminary results indicate that dynamic icons yield slower performances in relation to static vs. dynamic variation of any attribute other than size (see error bars in figure 1). Intuition tells us that some aspect of the performance slow down is due to the fact that a dynamically encoded feature requires longer identification time since the encoding is time based. However, we also report that a high degree of volatility or variance was observed in all dynamic conditions. Significant differences were observed between static encodings and their dynamic equivalents in all cases except feature size (and hue - we do not report a flashing hue condition currently). 120 100 80 60 40 20 0 Static Size Dyn. Grow Static Hue Static Saturat. Dyn. Flash Saturat. Static Bright. Dyn. Flash Bright. Static Angle Dyn. Rotate Figure 1 Aggregated subject response times (seconds) across 9 static and dynamic conditions. An analysis over repeated sessions did not reveal any learning trend. However, we do note that significant outliers were regularly encountered at the beginning of all experiments indicating that first timers took a while to learn how to use the experiment software. Possible outliers in trials midway through the duration of some experiments may indicate that the subject was distracted during the search task. Experiment subjects. This is possible given that subjects administered the test in their own time.
Page 1:
Proceeding of the Australasian Lang
Page 4 and 5:
Program Chairs: Committees Lawrence
Page 6 and 7:
Towards the Evaluation of Referring
Page 8 and 9:
The cat ate the hat that I made NP/
Page 10 and 11:
Figure 2: Cells affected by adding
Page 12 and 13:
were used because the accuracy for
Page 14 and 15:
TIME ACC COVER FAIL RATE % secs % %
Page 16 and 17:
model. We explore different estimat
Page 18 and 19:
S.S. Source Sense Source S.S. Acc.
Page 20 and 21:
acy. The difference in correctly as
Page 22 and 23:
Experiments with Sentence Classific
Page 24 and 25:
datasets with skewed distribution t
Page 26 and 27:
words produced by the feature selec
Page 28 and 29:
Sentence Class NB DT SVM Apology 1.
Page 30 and 31:
Computational Semantics in the Natu
Page 32 and 33:
S[sem = ] -> NP[sem=?subj] VP[sem=?
Page 34 and 35:
Any string which cannot be decompos
Page 36 and 37:
satisfy(Formula,model(D,F),G,pos):n
Page 38 and 39:
Classifying Speech Acts using Verba
Page 40 and 41:
The principles are dichotomous —
Page 42 and 43:
ance. Several additional example ut
Page 44 and 45:
our results. In classifying only th
Page 46 and 47:
Word Relatives in Context for Word
Page 48 and 49:
create a collection of training exa
Page 50 and 51:
Query Sense The nave was rebuilt in
Page 52 and 53:
Algorithm Avg S2LS Avg S3LS Avg S2L
Page 54 and 55:
B. Snyder and M. Palmer. 2004. The
Page 56 and 57:
it is even used as a black box that
Page 58 and 59:
ecognition in QA, so we will not us
Page 60 and 61:
BPER ILOC IPER BLOC BLOC BDATE BLOC
Page 62 and 63:
of noise introduced by the addition
Page 64 and 65:
2 Existing annotated corpora Much o
Page 66 and 67:
Our FUSE|tel spectrum of HD|sta 738
Page 68 and 69:
N Word Correct Tagged 23 OH mol non
Page 70 and 71:
L ATEX typesetting information into
Page 72 and 73:
in English, neither the determiner
Page 74 and 75:
con 4 (Sagot et al., 2006) and Morp
Page 76 and 77:
Feature type Positions/description
Page 78 and 79:
Miriam Butt, Helge Dyvik, Tracy Hol
Page 80 and 81:
ently mostly solved by employing hu
Page 82 and 83:
Figure 1: Augmented SCT Lexicon Tok
Page 84 and 85:
medical terms can be composed by ad
Page 86 and 87:
References Aronson, A. R. (2001). E
Page 88 and 89:
technique to most question types, i
Page 90 and 91:
User Question Analyser QA System Se
Page 92 and 93:
Figure 2: Precision Figure 3: Cover
Page 94 and 95:
Analysis and Review. Springer-Verla
Page 96 and 97:
2 Background 2.1 Pronoun Reference
Page 98 and 99:
ous accounts of the effects of cohe
Page 100 and 101:
Table 1: Overall Results for Object
Page 102 and 103:
References Jennifer Arnold, Janet E
Page 104 and 105:
In order for appropriate documents
Page 106 and 107:
y Michael West. He found that his t
Page 108 and 109:
low-scoring documents are “Click
Page 110 and 111:
a) b) You must complete at least 9
Page 112 and 113:
uct). In this paper, we will only c
Page 114 and 115: indicate the categories of substruc
Page 116 and 117: Argument lowering Dependent geach
Page 118 and 119: who [u : np] WH(np, s, s/?gq) λP.
Page 120 and 121: algorithms, and report the performa
Page 122 and 123: 2.3 Coverage of the Human Data Out
Page 124 and 125: and minimal subsets of the data. Of
Page 126 and 127: output. Finally, and related to the
Page 128 and 129: identifies, to avoid overwhelming t
Page 130 and 131: y the student is first parsed, usin
Page 132 and 133: a given word sequence Sorig as a pe
Page 134 and 135: A problem arises with this scheme w
Page 136 and 137: Example 1: Original Headline: Europ
Page 138 and 139: |relations(sentence1) ∩ relations
Page 140 and 141: 2685 True False 1275 Bleu Dep Bleu
Page 142 and 143: Features Acc. C1-prec. C1-recall C1
Page 144 and 145: Given the lack of research in VSD u
Page 146 and 147: ased features: Type 1 Original sibl
Page 148 and 149: Preposition of the adjunct semantic
Page 150 and 151: Algorithm 1 The algorithm of combin
Page 152 and 153: References Timothy Baldwin and Fran
Page 154 and 155: dependencies in German with respect
Page 156 and 157: wij moeten dit probleem aanpakken w
Page 158 and 159: a brevity penalty of BLEU n = 1 n =
Page 160 and 161: plicitly matching the syntax of the
Page 162 and 163: Probabilities improve stress-predic
Page 166 and 167: Natural Language Processing and XML
Page 168 and 169: Extracting Patient Clinical Profile
show all

Automatic Mapping Clinical Notes to Medical - RMIT University

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?