Vector Space Semantic Parsing: A Framework for Compositional ...

More documents

Recommendations

Info

terminer) wheel” and “reinvent wheel” in ukWaC being 623 and 27, respectively. We experimented with lemmas (noT) or with lemmas concatenated with their part of speech (POS) tags (yesT). We labeled the following strings in ukWaC as stopwords: low-frequency words (lemmas with frequency < 50), strings containing two adjacent non-letter characters (thus omitting sequences of various symbols), and closed-class words. For our experiments, we built WSMs using various parameters examined in previous works (see Section 2) and parameters which are implied from our own experience with WSMs. Figure 1 summarizes all the parameters we used for building WSMs. Measure settings We examined various Measure settings (see Section 3), summarized in Table 2. For all the vector comparisons, we used the cos similarity. Only for HAL we also examined euc and for COALS cor, since these are the recommended similarity functions for these particular WSMs (Lund and Burgess, 1996; Rohde et al., 2005). Met. par. possible values all sim. cos, euc if HAL, cor if COALS SU H 0,1,...,20,30,...,100 SU M 0,1,...,20,30,...,100 SU W no, log SU ∗ plus, mult EN x sim, –dist EN f avg, mOnly, hOnly, min, max CO ∗ ⊕, ⊗ NE N 10,20,...,50,100,200,...,500,1000 Table 2: All the parameters of Measures for determining semantic compositionality described in Section 3 used in our experiments. Experimental setup Following Biemann and Giesbrecht (2011), Reddy et al. (2011a), Krčmář et al. (2012), and Krčmář et al. (2013), we use the Spearman correlation (ρ) for the evaluation of all the combinations of WSMs and Measures (Setups). Since the distribution of scores assigned to Reddy’s NN dataset might not have corresponded to the distribution of DISCO scores, we decided not to map them to the same scale. Thus, we do not create a single list consisting of all the examined expressions. Instead, we order our Setups according to the weighted average of Spearman correlations calculated across all the expression types. The weights are directly proportional to the frequencies of the particular expression types. Thus, the Setup score (wAvg) is calculated as follows: wAvg = |AN|ρAN + |V O|ρV O + |SV |ρSV + |NN|ρNN . |AN| + |V O| + |SV | + |NN| Having the evaluation testbed, we tried to find the optimal parameter settings for all WSMs combined with all Measures with the help of TrValD. Then, we applied the found Setups to TestD. Notes Because several expressions or their constituents concatenated with their POS tags did not occur sufficiently often (for expressions: ≥ 0, for constituents: ≥ 50) in ukWaC, we removed them from the experiments; we removed “number crunching”, “pecking order”, and “sacred cow” from TrValD and “leading edge”, “broken link”, “spinning jenny”, and “sitting duck” from TestD. 5 Results The Setups achieving the highest wAvg when applied to TrValD are depicted in Table 3. The same Setups and their results when applied to TestD are depicted in Table 4. The values of Spearman correlations in TestD confirm many of the observations from TrValD 6 : Almost all the combinations of WSMs and Measures achieve correlation values which are statistically significant. This is best illustrated by the ρ(AN − V O − SV ) column in Table 4, where a lot of correlation values are statistically (p
Figure 1: All the parameters of WSMs described in Section 2 used in all our experiments. Semicolon denotes OR. All the examined combinations of parameters are implied from reading the diagram from left to right. certain Measures, although achieving high correlations upon certain expression types, fail to correlate with the rest of the expression types. Compare e.g. the correlation values of VSM and LSA combined with the SU Measure upon the AN and SV types with the correlation values upon the VO and NN types. The results, as expected, illustrate that employing more advanced alternatives of basic WSMs is more appropriate. Specifically, LSA outperforms VSM and COALS outperforms HAL in 21 and 23 correlation values out of 24, respectively. Concerning RI, the values of correlations seem to be close to the values of VSM and HAL. An interesting observation showing the appropriateness of using wAvg(ofρ) as a good evaluation score is supported by a comparison of the wAvg(ofρ) and ρ(AN−V O−SV ) columns. The columns suggest that some Setups might only be able to order the expressions of the same type and might not be able to order the expressions of different types among each other. As an example, compare the value of ρ = 0.42 in wAvg(ofρ) with ρ = 0.28 in ρ(AN−V O−SV ) in the row corresponding to COALS combined with SU. Consider also that all the values of correlations are higher or equal to the value in ρ(AN−V O−SV ). As for the parameters learned from applying all the combinations of differently set WSM algorithms and Measures to TrValD, their diversity is well illustrated in Tables 5 and 6. Due to this diversity, we cannot recommend any particular settings except for one. All our SU Measures benefit from weighting numbers of expression occurrences by logarithm. The correlation values in TestD are slightly lower – probably due to overfitting – than the ones observed in TrValD. HAL combined with the Measures using euc similarity was not as successful as when combined with cos. 7 For comparison, the results of Reddy et al. (2011b) and Chakraborty et al. (2011) as the results of the best performing Setups based on WSMs and association measures, respectively, applied to the DISCO data, are presented (Biemann and Giesbrecht, 2011). The correlation values of our Setups based on LSA and COALS, respectively, are mostly higher. However, the improvements are not statistically significant. Also, the recent results achieved by Krčmář et al. (2012) employing COALS and Krčmář et al. (2013) employ- 7 However, using HAL combined with euc, we observed significant negative correlations which deserve further exploration. 69
Page 1 and 2:
ACL 2013 51st Annual Meeting of the
Page 3:
Introduction In recent years, there
Page 7:
Table of Contents Vector Space Sema
Page 10 and 11:
Poster session (continued) 12:30 Lu
Page 12 and 13:
Input: Log. Form: “red ball”
Page 14 and 15:
the NP/N λx.x red N/N λx.A red x
Page 16 and 17:
dicted and target distributions, an
Page 18 and 19:
typed objects lie in the same vecto
Page 20 and 21:
Peter D. Turney. 2006. Similarity o
Page 22 and 23:
(1999), who suggested that the real
Page 24 and 25:
mation. In addition, using a new se
Page 26 and 27:
Figure 6: Positive rate, negative r
Page 28 and 29: References Rens Bod, Remko Scha, an
Page 30 and 31: A Structured Distributional Semanti
Page 32 and 33: Figure 1: Sample sentences & triple
Page 34 and 35: el as its three axes. We explore th
Page 36 and 37: as well as our framing of word comp
Page 38 and 39: References Marco Baroni and Roberto
Page 40 and 41: Letter N-Gram-based Input Encoding
Page 42 and 43: Hidden my house my house Visibl
Page 44 and 45: … m my y h se us Hidden Visible m
Page 46 and 47: line of the systems presented in Ta
Page 48 and 49: References Yoshua Bengio, Réjean D
Page 50 and 51: Transducing Sentences to Syntactic
Page 52 and 53: t s “We booked the flight” →
Page 54 and 55: D 3(s) = ❀ PRP + ❀ V + DT ❀ +
Page 56 and 57: Output Model λ = 0 λ = 0.2 λ = 0
Page 58 and 59: References James A. Anderson. 1973.
Page 60 and 61: General estimation and evaluation o
Page 62 and 63: Guevara (2010) and Zanzotto et al.
Page 64 and 65: ⃗p = A u ⃗v where A u is a matr
Page 66 and 67: stituents. For this reason, we must
Page 68 and 69: References Hervé Abdi and Lynne Wi
Page 70 and 71: GOOD VERTICAL safe out raise level
Page 72 and 73: HLBL HLBL SENNA 4lang original scal
Page 74 and 75: Determining Compositionality of Wor
Page 76 and 77: ability of a word in the investigat
Page 80 and 81: ing LSA are depicted. Discussion As
Page 82 and 83: WSM Measure wAvg(of ρ) ρAN-VO-SV
Page 84 and 85: “Not not bad” is not “bad”:
Page 86 and 87: activation functions in recursive n
Page 88 and 89: logic experiment in Socher et al. (
Page 90 and 91: tinction without the need to move t
Page 92 and 93: T. K. Landauer and S. T. Dumais. 19
Page 94 and 95: This differs from the classical Ran
Page 96 and 97: Hungarian Algorithm First cosine si
Page 98 and 99: Features: MSRpar MSRvid SMTeuroparl
Page 100 and 101: Joseph Reisinger and Raymond J. Moo
Page 102 and 103: phrase meaning in vector space. •
Page 104 and 105: By Bayes’ rule, p(x|a, n; Θ) ∝
Page 106 and 107: 4.2 Learning from distributional re
Page 108 and 109: W = (w 1 , w 2 , · · · , w n ) a
Page 110 and 111: Aggregating Continuous Word Embeddi
Page 112 and 113: trieval, Sivic and Zisserman propos
Page 114 and 115: Another advantage is that the propo
Page 116 and 117: e 50 100 200 300 400 500 CLEF 4.0 6
Page 118 and 119: Y. Bengio, J. Louradour, R. Collobe
Page 120 and 121: Answer Extraction by Recursive Pars
Page 122 and 123: Algorithm 1: Auto-encoders co-train
Page 124 and 125: NP NP ) , ADJP , NP ( NP - NP Engli
Page 126 and 127: System Short Answer Short Answer Sh
Page 128 and 129:
References Y. Bengio and R. Ducharm
Page 130 and 131:
problem of paragraphs, of tence, mo
Page 132 and 133:
m1 k 4 only on the length of s and
Page 134 and 135:
Dialogue Act Label Example Train (%
Page 136 and 137:
[Calhoun et al.2010] Sasha Calhoun,
show all

Vector Space Semantic Parsing: A Framework for Compositional ...

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?