
                HLBL original  HLBL scaled  SENNA  4lang
HLBL original        1            0.925     0.422  0.856
HLBL scaled          0.925        1         0.390  0.772
SENNA                0.422        0.390     1      0.361
4lang                0.856        0.772     0.361  1

Table 5: Correlations between judgments based on different embeddings
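How such a table can be produced is sketched below: given one judgment score per test item for each embedding, the pairwise correlations fall out of a single correlation-matrix call. The score arrays are random placeholders rather than the paper's data, and the choice of Pearson correlation is an assumption, not something Table 5 specifies.

```python
# Sketch only: pairwise correlations between judgment scores obtained from
# different embeddings. The score arrays are random placeholders, not the
# paper's data, and Pearson correlation is an assumption.
import numpy as np

rng = np.random.default_rng(0)
n_items = 200   # number of test items (illustrative)

judgments = {
    "HLBL original": rng.standard_normal(n_items),
    "HLBL scaled":   rng.standard_normal(n_items),
    "SENNA":         rng.standard_normal(n_items),
    "4lang":         rng.standard_normal(n_items),
}

names = list(judgments)
corr = np.corrcoef([judgments[n] for n in names])   # 4x4 matrix, rows = embeddings
for i, name in enumerate(names):
    print(f"{name:14s}", np.round(corr[i], 3))
```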

…tionary. Using LDOCE is computationally advantageous in that the core vocabulary is guaranteed to be very small, but finding the eigenvectors for an 80k by 80k sparse matrix would also be within CPU reach. The main advantage of starting with a conceptual graph lies elsewhere, in the possibility of investigating the function application issue we started out with.
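To illustrate the scale claim, here is a minimal sketch (not from the paper) of extracting the leading eigenvectors of a sparse symmetric 80k-by-80k matrix on a single CPU; the random matrix, its density, and the number of eigenpairs are placeholder assumptions standing in for the actual dictionary graph.

```python
# Minimal sketch (not from the paper): top eigenvectors of a large sparse
# symmetric matrix, of the size an 80k-word dictionary graph would yield.
# The random matrix, density, and k are illustrative placeholders.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

n_words = 80_000              # vocabulary size (illustrative)
rng = np.random.default_rng(0)

# Sparse random stand-in for the real word-by-word definition graph.
A = sp.random(n_words, n_words, density=1e-4, format="csr", random_state=rng)
A = (A + A.T) * 0.5           # symmetrize so a symmetric eigensolver applies

k = 100                       # number of leading eigenpairs to keep
eigvals, eigvecs = eigsh(A, k=k, which="LA")   # Lanczos iteration, CPU-feasible
print(eigvecs.shape)          # (80000, 100): a k-dimensional vector per word
```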

The 4lang conceptual representation relies on a small number of basic elements, most of which correspond to what are called unary predicates in logic. We have argued elsewhere (Kornai, 2012) that the meaning of linguistic expressions can be formalized using predicates with at most two arguments (there are no ditransitive or higher-arity predicates on the semantic side). The x and y slots of binary elements such as x has y or x kill y (Kornai and Makrai, 2013) receive distinct labels, called NOM and ACC in case grammar (Fillmore, 1977); 1 and 2 in relational grammar (Perlmutter, 1983); or AGENT and PATIENT in linking theory (Ostler, 1979). The label names themselves are irrelevant; what matters is that these elements are not part of the lexicon the same way as the words are, but rather constitute transformations of the vector space.
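As a hedged illustration of the two labeled slots, such a binary element could be recorded as a small structure; the class and field names below are mine, not the 4lang dictionary's actual format.

```python
# Sketch only: one way to record a 4lang-style binary element together with
# its two labeled argument slots. Class and field names are illustrative,
# not the 4lang dictionary's actual format.
from dataclasses import dataclass

@dataclass
class Binary:
    pred: str   # the binary predicate, e.g. "has" or "kill"
    nom: str    # the x slot: NOM / 1 / AGENT
    acc: str    # the y slot: ACC / 2 / PATIENT

print(Binary("has", nom="Albert", acc="England"))
print(Binary("kill", nom="man", acc="James"))
```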

Here we will use the binary predicate x has y to reformulate the puzzle we started out with, analyzing queen of England, king of Italy, etc. in a compositional (additive) manner, but escaping the commutativity problem. For the sake of concreteness we use the traditional assumption that it is the king who possesses the realm and not the other way around, but what follows would apply just as well if the roles were reversed. What we are interested in is the asymmetry of expressions like Albert has England or Elena has Italy, in contrast to largely symmetric predicates. Albert marries Victoria will be true if and only if Victoria marries Albert is true, but from James has a martini it does not follow that ?A martini has James.

While the fundamental approach of CVSM is quite correct in assuming that nouns (unaries) and verbs (binaries) can be mapped on the same space, we need two transformations $T_1$ and $T_2$ to regulate the linking of arguments. A form like James kills has James as agent, so we compute $V(\mathrm{James})'T_1V(\mathrm{kill})$, while kills James is obtained as $V(\mathrm{James})'T_2V(\mathrm{kill})$. The same two transforms can distinguish agent and patient relatives, as in the man that killed James versus the man that James killed.

Such forms are compositional, and in languages that have overt case markers, even ‘surface compositional’ (Hausser, 1984). All inputs and outputs are treated as vectors in the same space where the atomic lexical entries get mapped, but the applicative paradox we started out with goes away. As long as the transforms $T_1$ and $T_2$ take different values on kill, has, or any other binary, the meanings are kept separate.
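The following is a minimal numerical sketch of the two linking transforms, under the assumption that the prime in $V(\mathrm{James})'T_1V(\mathrm{kill})$ denotes transposition and that $T_1$ and $T_2$ are arbitrary matrices; nothing here reproduces the authors' actual embeddings or transforms, it only shows that distinct transforms keep the agent and patient readings apart.

```python
# Minimal sketch (not the authors' implementation): two linking transforms,
# T1 for the agent slot and T2 for the patient slot, applied to the vector
# of a binary predicate. The dimension, the random matrices, and reading the
# prime as transposition are assumptions made for illustration only.
import numpy as np

d = 50                                     # embedding dimension (illustrative)
rng = np.random.default_rng(0)

V = {w: rng.standard_normal(d) for w in ("James", "kill", "has")}
T1 = rng.standard_normal((d, d))           # agent transform: NOM / 1 / AGENT
T2 = rng.standard_normal((d, d))           # patient transform: ACC / 2 / PATIENT

def compose(noun, T, verb):
    """V(noun)' T V(verb): the noun fills the slot selected by T."""
    return V[noun] @ T @ V[verb]

james_kills = compose("James", T1, "kill")   # James as agent ('James kills')
kills_james = compose("James", T2, "kill")   # James as patient ('kills James')

# Because T1 and T2 take different values on the binary kill, the two
# readings ('the man that killed James' vs. 'the man that James killed')
# remain distinct.
print(james_kills != kills_james)            # True for generic random T1, T2
```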

Acknowledgments

Makrai did the work on antonym set testing, Nemeskey built the embedding, Kornai advised. We would like to thank Zsófia Tardos (BUTE) and the anonymous reviewers for useful comments. Work supported by OTKA grant #82333.

References

R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu, and P. Kuksa. 2011. Natural language processing (almost) from scratch. Journal of Machine Learning Research (JMLR).

Charles Fillmore. 1977. The case for case reopened. In P. Cole and J.M. Sadock, editors, Grammatical Relations, pages 59–82. Academic Press.

Roland Hausser. 1984. Surface compositional grammar. Wilhelm Fink Verlag, München.

J. Katz and Jerry A. Fodor. 1963. The structure of a semantic theory. Language, 39:170–210.

András Kornai and Márton Makrai. 2013. A 4lang fogalmi szótár [The 4lang concept dictionary]. In A. Tanács and V. Vincze, editors, IX. Magyar Számítógépes Nyelvészeti Konferencia [Ninth Conference on Hungarian Computational Linguistics], pages 62–70.

András Kornai. 2012. Eliminating ditransitives. In Ph. de Groote and M-J Nederhof, editors, Revised and Selected Papers from the 15th and 16th Formal Grammar Conferences, LNCS 7395, pages 243–261. Springer.
