Connectionist Modeling of Experience-based Effects in Sentence ...
Chapter 3 Connectionist Modelling of Language Comprehension
their experiment in detail. MC02 used a standard simple recurrent network (SRN) with a hidden and a context layer of 60 units each. Input and output layers of 31 units each represented 30 words plus an end-of-sentence (EOS) symbol. The corpora each consisted of 10,000 English sentences constructed randomly from a simple artificial probabilistic context-free grammar (PCFG). Subject- or object-modifying relative clauses occurred in 5% of the sentences; half were subject-extracted and half object-extracted RCs. The rest of each corpus consisted of simple mono-clausal sentences. Verbs differed in transitivity and agreed in number with their subject nouns. Each corpus comprised about 55,000 words; sentence length ranged from 3 to 27 words, with a mean of 4.5. Notably, relative clauses could be embedded recursively in each noun phrase; the RC attachment probability in the PCFG (0.05) limited the embedding depth in practice. MC02 trained 10 networks with randomly distributed initial weights¹, each on a different corpus.
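The corpus construction described above can be sketched roughly as follows. The vocabulary, the relativizer, and the exact grammar rules here are hypothetical stand-ins (MC02's 30-word lexicon is not reproduced in this section); only the per-NP attachment probability of 0.05 and the recursive embedding come from the text, so the resulting proportion of RC-containing sentences is only approximate.

```python
import random

random.seed(1)

# Hypothetical two-form vocabulary standing in for MC02's 30 words;
# index 0 = singular form, index 1 = plural form (number agreement).
NOUNS = [("dog", "dogs")]
VERBS = [("chases", "chase")]
P_RC = 0.05   # RC attachment probability per noun phrase, from the PCFG

def np_(depth=0):
    """Noun phrase; returns (words, number). An RC attaches with
    probability 0.05 at every NP, recursively, so deep embeddings
    are possible but rare."""
    num = random.randrange(2)
    words = [NOUNS[0][num]]
    if random.random() < P_RC:
        if random.random() < 0.5:                 # subject-extracted RC
            obj, _ = np_(depth + 1)
            words += ["who", VERBS[0][num]] + obj
        else:                                     # object-extracted RC
            subj, snum = np_(depth + 1)
            words += ["who"] + subj + [VERBS[0][snum]]
    return words, num

def sentence():
    """Simple transitive mono-clausal sentence, verb agreeing with subject."""
    subj, num = np_()
    obj, _ = np_()
    return subj + [VERBS[0][num]] + obj + ["EOS"]  # end-of-sentence symbol

corpus = [sentence() for _ in range(10000)]
```

With two NPs per clause and a 0.05 attachment probability each, RCs appear in roughly 5–10% of sentences, in the same range as the 5% reported for MC02's corpora.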
The learning rate was set to 0.1. The training phase covered only three epochs, each consisting of one pass through the corpus. The networks learned to predict the next word in a sentence without being provided with any explicit probabilistic information. The output unit activations were computed by a cross-entropy algorithm that ensured all activation values summed to one. In that way the networks' output was comparable to continuation likelihoods assigned to each possible word. After training, the networks were assessed on 10 sentences of each of the three types (SRC, ORC, and simple clause).
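The architecture and training regime can be sketched in minimal form as follows. The initial weight range and learning rate are from the text; the toy word indices are arbitrary, and for brevity only the output weights are updated here, whereas MC02 of course trained the full network by backpropagation. The normalized output makes each prediction a distribution over possible continuations.

```python
import numpy as np

rng = np.random.default_rng(0)

V, H = 31, 60                           # 30 words + EOS; hidden/context size
# Initial weights drawn uniformly from [-0.15, 0.15], as in MC02
Wxh = rng.uniform(-0.15, 0.15, (H, V))  # input -> hidden
Whh = rng.uniform(-0.15, 0.15, (H, H))  # context (previous hidden) -> hidden
Why = rng.uniform(-0.15, 0.15, (V, H))  # hidden -> output
LR = 0.1

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()                  # output activations sum to one

def step(word_idx, context):
    """One SRN step: predict a distribution over the next word."""
    x = np.zeros(V); x[word_idx] = 1.0          # localist input coding
    h = sigmoid(Wxh @ x + Whh @ context)        # hidden layer
    y = softmax(Why @ h)                        # continuation likelihoods
    return y, h

def train_sentence(words):
    """Output-layer-only SGD sketch under cross-entropy error on each
    next-word prediction (the full model updates all weight matrices)."""
    global Why
    context = np.zeros(H)
    for cur, nxt in zip(words, words[1:]):
        y, h = step(cur, context)
        grad = np.outer(y - np.eye(V)[nxt], h)  # dCE/dWhy for softmax output
        Why -= LR * grad
        context = h                              # copy hidden to context layer

train_sentence([3, 7, 12, 30])          # toy sentence ending in EOS (index 30)
probs, _ = step(3, np.zeros(H))
print(round(probs.sum(), 6))            # → 1.0
```

The copy of the hidden layer into the context layer at each step is what gives the SRN its limited memory for the preceding words of the sentence.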
To interpret the network output in terms of processing difficulty, MC02 calculated the so-called grammatical prediction error² (GPE). The GPE value measures the network's difficulty in making the correct prediction at each word. The measure was then used to map the relative word-by-word differences between the conditions onto reading times from the study by King and Just (1991). Besides RC type, MC02 used training epochs as a second factor: the network performances after one, two, and three epochs of training were compared to low-, mid-, and high-span readers' reading speed.
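The full GPE measure is described in chapter 4; as a rough, simplified sketch (assuming only that the set of grammatical continuations is known at each position), GPE can be approximated as the share of output activation that falls on ungrammatical continuations, so that low values indicate easy, accurate prediction.

```python
import numpy as np

def simplified_gpe(output_activations, grammatical_idx):
    """Simplified grammatical prediction error: the fraction of the
    (normalized) output activation assigned to words that are NOT
    grammatical continuations at this point in the sentence. The full
    GPE additionally weighs mispredictions among the grammatical
    items themselves; see chapter 4."""
    a = np.asarray(output_activations, dtype=float)
    a = a / a.sum()                        # ensure activations sum to one
    hits = a[list(grammatical_idx)].sum()  # activation on legal next words
    return 1.0 - hits                      # activation on illegal next words

# 31-unit output vector; suppose words 3 and 7 are grammatical here
act = np.full(31, 0.01)
act[3], act[7] = 0.5, 0.2
print(round(simplified_gpe(act, [3, 7]), 3))  # → 0.293
```

Averaging such per-word values over a region yields the word-by-word difficulty profiles that MC02 compare to the reading-time data.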
The results of MC02's network simulation are shown in figure 3.3. Pooled over all three epochs, the results show a clear subject preference on the main verb (praised) and the preceding region (the embedded object in the SRC and the embedded verb in the ORC). Furthermore, the ORC performance improves significantly on the embedded and main verb across the three epochs of training. Notably, the SRC data do not show such an improvement; rather, performance was relatively good from the start, with no change during training. This indicates a clause type × exposure interaction. The same interaction (in this case clause type × reading span) is seen in King and Just's empirical data (figure 2.1). Notably, the simple SRN model seems to make better predictions than the CC-READER model by Just and Carpenter (1992), since CC-READER captures the span effect but not the interaction with clause type (see figure 2.4). Importantly, MC02 call the mentioned interaction a Frequency × Regularity interaction. Specifically, the regular nature of English SRCs with respect to word order (SVO) serves
¹ Between −0.15 and 0.15.
² See chapter 4 for a detailed description.