Connectionist Modeling of Experience-based Effects in Sentence ...
the other hand, it is possible that regularity does not have a relevant impact on empirical studies of Mandarin extraction preferences, and that the explanation lies with other factors.
4.4 Forgetting Effects
4.4.1 The model
As presented in chapter 3, the forgetting effect in center-embedded structures was addressed in a connectionist study by Christiansen and Chater (1999). They trained an SRN on right-branching and center-embedding structures and then assessed the output node activations after the network had seen the sequence NNNVV. The activations showed a clear 2VP preference, consistent with empirical data from English speakers. The artificial language, which covered the center-embedding abba and right-branching aabb dependency patterns, is directly comparable to the simple English grammar of object and subject relative clauses used by MacDonald and Christiansen (2002). Thus, it should be possible to replicate the effect with the SRNs trained on the English grammar for the replication in section 4.2. In German RCs, however, no real right-branching occurs, since the embedded RC is always attached to its head noun. Hence, in the German grammar used in section 4.2, both ORC and SRC exhibit a center-embedding abba pattern. As a result, an SRN exposed to the German grammar receives more training on verb-final center-embedding structures than its English counterpart, which may lead to different predictions for an NNNVV sequence. Supposing that the difference in SRC realization in the corpora approximately reflects an essential word-order difference between German and English, the SRN predictions will shed light on the part that experience plays in the explanation of the forgetting effect.
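The two dependency patterns and the NNNVV probe can be sketched as follows. The vocabulary and function names below are my own illustrative assumptions, not the lexicon of the original study: center-embedded strings realize their verbs in the reverse order of their nouns (abba), right-branching strings pair each noun directly with its verb (aabb), and dropping V2 from a depth-3 center-embedding yields the NNNVV sequence.

```python
import random

# Hypothetical singular/plural lexicon (illustrative, not from the study).
NOUNS = {"sg": ["boy", "girl"], "pl": ["boys", "girls"]}
VERBS = {"sg": ["runs", "sees"], "pl": ["run", "see"]}

def center_embedded(depth):
    """N1 .. Nk Vk .. V1: verbs appear in reverse order of their nouns (abba)."""
    numbers = [random.choice(["sg", "pl"]) for _ in range(depth)]
    nouns = [random.choice(NOUNS[n]) for n in numbers]
    verbs = [random.choice(VERBS[n]) for n in reversed(numbers)]
    return nouns + verbs

def right_branching(depth):
    """N1 V1 .. Nk Vk: each verb directly follows its own noun (aabb)."""
    out = []
    for _ in range(depth):
        n = random.choice(["sg", "pl"])
        out += [random.choice(NOUNS[n]), random.choice(VERBS[n])]
    return out

# The forgetting-effect probe: a depth-3 center-embedding is
# N1 N2 N3 V3 V2 V1; dropping the second verb (index 4) leaves
# the ungrammatical-but-preferred sequence N1 N2 N3 V3 V1.
full = center_embedded(3)
nnnvv = full[:4] + full[5:]
```

A network trained on strings from both generators can then be probed with `nnnvv` to see whether it treats the sequence as complete.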
I extended the study by Christiansen and Chater (1999) to obtain GPE values for both conditions on all regions after the missing verb. This required a test grammar that simulates the forgetting effect, i.e., one that treats NNNVV sequences as complete. Thus, in the probability table for the drop-V2 testing corpus, the column referring to the position of V2 was deleted. Consequently, the testing probabilities corresponded to an 'N1 N2 N3 V3 V1' grammar with the first verb (V3) bound to N1 by number agreement and the second verb (V1) to N3. This is equivalent to forgetting the prediction induced by N2. The GPE for the ungrammatical conditions was calculated against these drop-V2 probabilities. Hence, if the network makes grammatical predictions, the error values for V1 and subsequent regions should be higher in the drop-V2 condition. On N1 the SRN would predict a verb in number agreement with N2. At the next position the network would predict another verb, whereas the test grammar predicts a determiner. After this point the network's predictions should be completely confused, because the sequence just observed is inconsistent with any structural generalization developed during training. If the network's predictions are not too locally dependent, they should also be wrong for the last word (the direct object of the main clause).
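As a rough sketch of how such error values can be computed, the following follows the general structure of the GPE measure (hits, false alarms, and misses over the output units), with the network's output activations normalized to sum to one. The word labels and this exact formulation are illustrative assumptions, not the implementation used in this thesis.

```python
def gpe(activations, legal_probs):
    """GPE-style score: 1 - hits / (hits + false_alarms + misses).

    activations: dict mapping each output word to its network activation.
    legal_probs: dict mapping each grammatical continuation (under the
                 test grammar, e.g. the drop-V2 probabilities) to its
                 probability; words absent from it are ungrammatical.
    """
    total = sum(activations.values()) or 1.0
    norm = {w: a / total for w, a in activations.items()}
    # hits: activation placed on grammatical continuations
    hits = sum(a for w, a in norm.items() if w in legal_probs)
    # false alarms: activation placed on ungrammatical continuations
    false_alarms = 1.0 - hits
    # misses: grammatical probability mass the network under-activates
    misses = sum(max(0.0, p - norm.get(w, 0.0))
                 for w, p in legal_probs.items())
    denom = hits + false_alarms + misses
    return 1.0 - hits / denom if denom else 1.0
```

Under this sketch, an output distribution identical to the legal probabilities yields a GPE of 0, while activation placed entirely on illegal continuations (e.g. a verb where the test grammar expects a determiner) yields 1.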