Proceedings of the LFG 02 Conference National Technical - CSLI ...

More documents

Recommendations

Info

Parsing with PCFGs and Automatic F-Structure Annotation Aoife Cahill Mairéad McCarthy Josef van Genabith Andy Way Computer Applications, Dublin City University, Ireland acahill, mccarthy, josef, away @computing.dcu.ie Proceedings of the LFG02 Conference National Technical University of Athens, Athens Miriam Butt and Tracy Holloway King (Editors) 2002 CSLI Publications http://csli-publications.stanford.edu/ 76
Abstract The development of large coverage, rich unification- (constraint-) based grammar resources is very time consuming, expensive and requires lots of linguistic expertise. In this paper we report initial results on a new methodology that attempts to partially automate the development of substantial parts of large coverage, rich unification-(constraint-) based grammar resources. The method is based on a treebank resource (in our case Penn-II) and an automatic f-structure annotation algorithm that annotates treebank trees with proto-fstructure information. Based on these, we present two parsing architectures: in our pipeline architecture we firstextract a PCFG from the treebank following the method of [Charniak, 1993; Charniak, 1996], use the PCFG to parse new text, automatically annotate the resulting trees with our f-structure annotation algorithm and generate proto-f-structures. By contrast, in the integrated architecture we firstautomatically annotate the treebank trees with fstructure information and then extract an annotated PCFG (A-PCFG) from the treebank. We then use the A-PCFG to parse new text to generate proto-fstructures. Currently our best parsers achieve more than 81% f-score on the 2400 trees in section 23 of the Penn-II treebank and more than 60% f-score on gold-standard proto-f-structures for 105 randomly selected trees from section 23. 1 Introduction The development of large coverage, rich unification-(constraint-) based grammar resources is very time consuming, expensive and requires considerable linguistic expertise [Butt et al, 1999; Riezler et al, 2002]. In this paper we report initial results on a new methodology that attempts to partially automate the development of substantial parts of large coverage, rich unification-(constraint-) based grammar resources. A large number of researchers (cf. [Charniak, 1996; Collins, 1999; Hockenmaier and Steedman, 2002]) have developed parsing systems based on the Penn-II [Marcus et al, 1994] treebank resource but to the best of our knowledge, to date, none of them have attempted to semi-automatically derive large coverage, rich unification(constraint) grammar resources. Our method is based on the Penn-II treebank resource and an automatic Lexical- Functional Grammar [Kaplan and Bresnan, 1982; Bresnan, 2001; Dalrymple, 2001] f-structure annotation algorithm that annotates treebank trees with proto-f-structure information. Based on these, we present two parsing architectures: in our pipeline architecture we firstextract a PCFG from the treebank following the method of [Charniak, 1993; Charniak, 1996], use the PCFG to parse new text, automatically annotate 77
Page 1 and 2:
Proceedings of the LFG 02 Conferenc
Page 3 and 4:
Table of Contents Asudeh, Ash 1-18
Page 5 and 6:
The Syntax of Preverbal Particles a
Page 7 and 8:
2 The Problem: Five Conflicting Cla
Page 9 and 10:
(10) CP C ′ XP CP XP The outermos
Page 11 and 12:
(18) CP C ′ C ∅ IP CP IP I I
Page 13 and 14:
1-3, unless problematic assumptions
Page 15 and 16:
(4c) *After [IP last year [IP she r
Page 17 and 18:
The adjunction rule (36) does not b
Page 19 and 20:
(8c) chun isteacht duit [CP[CP nuai
Page 21 and 22:
work on non-projecting categories a
Page 23 and 24:
Coordination and Parallelism in Glu
Page 25 and 26:
Traditional Linear Duplication a, a
Page 27 and 28:
(8) fred : fe see : fe ⊸ je ⊸ s
Page 29 and 30: us add the following additional con
Page 31 and 32: 3.2 A General Semantic Schema for C
Page 33 and 34: approach also achieves correct resu
Page 35 and 36: In summary, to show that two deriva
Page 37 and 38: Extraction possible from initial (p
Page 39 and 40: 5.3.2 Across-the-Board Extraction T
Page 41 and 42: Asudeh, Ash (in progress). Resumpti
Page 43 and 44: Montague, Richard (1974). Formal ph
Page 45 and 46: Abstract VP-chaining in Oriya (an I
Page 47 and 48: (3) haata dho-i bhaata khaa-i mun s
Page 49 and 50: Figure 4. Schematic representation
Page 51 and 52: the medicine is applied. If overtly
Page 53 and 54: The control equation in (13) establ
Page 55 and 56: intervening object and thus ‘kept
Page 57 and 58: Lexicon L(1): khi-aa-ga-laa: V L(2)
Page 59 and 60: (20) VP -> VP V' α � *α SIT-IND
Page 61 and 62: CONSTRAINT SYMMETRY IN OPTIMALITY T
Page 63 and 64: ‘Juan hit that dog with a stick
Page 65 and 66: ‘With what did Juan hit the dog?
Page 67 and 68: The unexpectedly bad candidate is (
Page 69 and 70: Amharic (Semitic, Ethiopia) is a pr
Page 71 and 72: 31) Wïšša-w bä=Daniel bet atäg
Page 73 and 74: A functional PP must contain a head
Page 75 and 76: appear): A less well known example
Page 77 and 78: appear on the non-recursive side of
Page 79: Publications. Smith Stark, Thomas.
Page 83 and 84: subj : spec : det : pred : the head
Page 85 and 86: could annotate the CFG rules extrac
Page 87 and 88: strings in the WSJ section of the P
Page 89 and 90: Annotation coverage is measured in
Page 91 and 92: Our grammars are extracted from sec
Page 93 and 94: Penn-II training set is discarded f
Page 95 and 96: grammatical functions are more akin
Page 97 and 98: forms to semi-automatically develop
Page 99 and 100: Predicate Argument Structure. In: P
Page 101 and 102: Abstract PARTITIVE NOUN PHRASES IN
Page 103 and 104: possessive partitives in (7) and (8
Page 105 and 106: Extraction of constituents of dativ
Page 107 and 108: iológiát [[az egyik-en]NP __ ]NP
Page 109 and 110: Unlike post-verbal constituent orde
Page 111 and 112: uncertainty. Adopting this idea, I
Page 113 and 114: accounts for genitive possessor nou
Page 115 and 116: (38) Hungarian TOPIC/SCOPE/FOCUS pa
Page 117 and 118: The rule in (41) shows that the top
Page 119 and 120: Payne, John and Erika Chisarik. (20
Page 121 and 122: ��
Page 123 and 124: ��
Page 125 and 126: ��
Page 127 and 128: ��
Page 129 and 130: ��
Page 131 and 132:
��
Page 133 and 134:
��
Page 135 and 136:
GETARUN PARSER Rodolfo Delmonte Abs
Page 137 and 138:
vital aid in the actual development
Page 139 and 140:
levels of representations required
Page 141 and 142:
sentence starts with a verb we let
Page 143 and 144:
adjunct_cp negat pro=CLI, {Case=acc
Page 145 and 146:
4. Parsing Strategies and P
Page 147 and 148:
qualified. The following list of pr
Page 149 and 150:
structure rules, after the head has
Page 151 and 152:
subordinators, and to arguments of
Page 153 and 154:
Hobbs J.R., D.E.Appelt, J.Bear, M.T
Page 155 and 156:
SECTION II: ITALIAN TEXTS o La stor
Page 157 and 158:
Mario criticava luigi perché ha ro
Page 159 and 160:
1. Introduction Since the beginning
Page 161 and 162:
. ⎡FOCUS ⎤ ⎢PRED ‘wonder (
Page 163 and 164:
(12) *ng- oingerang a mlarngii a be
Page 165 and 166:
(17) a. Hine ha- simla še kaniti.
Page 167 and 168:
have resumptive pronouns. Under the
Page 169 and 170:
way to license the same f-structure
Page 171 and 172:
(33) a. Robin met the man that/who
Page 173 and 174:
licensed in the normal way by funct
Page 175 and 176:
description of the situation. Borer
Page 177 and 178:
http://cslipublications.stanford.ed
Page 179 and 180:
A (Discourse) Functional Analysis o
Page 181 and 182:
A multi-factorial LFG analysis of a
Page 183 and 184:
(9) CP (↑ SUBJ)= ↓ ↑=↓ NP C
Page 185 and 186:
Word Order Properties SGF coordinat
Page 187 and 188:
4.1 Asymmetric Analyses Heycock and
Page 189 and 190:
(25) (a) S 〈 der Jäger ging in d
Page 191 and 192:
In sum, the Subject Functor Lineari
Page 193 and 194:
first and second conjunct are disti
Page 195 and 196:
In the Glue Semantics approach (see
Page 197 and 198:
In the optimality-based model, mism
Page 199 and 200:
complementiser (COMPL) may be exten
Page 201 and 202:
BULGARIAN WORD ORDER AND THE ROLE O
Page 203 and 204:
Bulgarian word order and the role o
Page 205 and 206:
Page 207 and 208:
Page 209 and 210:
Page 211 and 212:
Page 213 and 214:
Page 215 and 216:
Page 217 and 218:
Page 219 and 220:
Page 221 and 222:
Page 223 and 224:
Page 225 and 226:
1 Introduction This paper deals wit
Page 227 and 228:
(4) To nifiko itan raf-to (*apo ton
Page 229 and 230:
That is, in example (6) the partici
Page 231 and 232:
(22) To spiti fenotan egataleli-men
Page 233 and 234:
state denoted by the adjective appe
Page 235 and 236:
in Modern Greek bears the intrinsic
Page 237 and 238:
In addition, the lexical representa
Page 239 and 240:
In this way, we capture the intuiti
Page 241 and 242:
(57) To fagito ine magire-meno/*mag
Page 243 and 244:
Corpus-based Learning in Stochastic
Page 245 and 246:
LFG02 - Kuhn: Corpus-based Learning
Page 247 and 248:
Page 249 and 250:
Page 251 and 252:
Page 253 and 254:
Page 255 and 256:
Page 257 and 258:
Page 259 and 260:
Page 261 and 262:
Page 263 and 264:
1. Introduction In this paper I wil
Page 265 and 266:
This strategy has been outlined by
Page 267 and 268:
The essence of the new approach is
Page 269 and 270:
3.2. Control relations in Hungarian
Page 271 and 272:
complements of the postposition ut
Page 273 and 274:
offered a more detailed discussion,
Page 275 and 276:
lexical form which argument of this
Page 277 and 278:
Berkeley. pp. 18. (On-line publicat
Page 279 and 280:
0. Abstract This paper attempts to
Page 281 and 282:
'understand', oppgi 'state', opplev
Page 283 and 284:
This property of raising to object
Page 285 and 286:
(32) ved/fra å svømme by/from to
Page 287 and 288:
(45) *Til å sove ser han ut to (PR
Page 289 and 290:
(57) Å gjøre det er blitt tenkt p
Page 291 and 292:
Second argument: The preposition do
Page 293 and 294:
In Lødrup (2002), I show that most
Page 295 and 296:
Teleman, Ulf, Staffan Hellberg and
Page 297 and 298:
Abstract Majority of Bantu language
Page 299 and 300:
No Object Agreement Bresnan and Mch
Page 301 and 302:
(7) a. N-ä-ï- `m-lyì-á k-èlyá
Page 303 and 304:
epresent 72% of the total of 312 se
Page 305 and 306:
We can characterize more concretely
Page 307 and 308:
(19) Markedness Subhierarchies a. *
Page 309 and 310:
4.1 Hierarchy of Morphological Feat
Page 311 and 312:
(30) Harmonic Alignment of (28) wit
Page 313 and 314:
topic-anaphoricity (as a bound pron
Page 315 and 316:
the present analysis highlights the
Page 317 and 318:
Grimshaw, Jane. 2001. Economy of st
Page 319 and 320:
CLITICS AND PHRASAL AFFIXATION IN C
Page 321 and 322:
(2) (a) ...wedi eu gweld nhw. ASP C
Page 323 and 324:
(7) IP (↑GF) = ↓ ↑ = ↓ NP I
Page 325 and 326:
(14) (a) Jovan je čitao knjigu. Jo
Page 327 and 328:
The Principle of Morphological Comp
Page 329 and 330:
3.2 Serbian Pronominal Clitics A se
Page 331 and 332:
Example (35) has the c-structure in
Page 333 and 334:
(43) (a) čovek-: N (↑PRED) = ‘
Page 335 and 336:
(52) (a) jučer: N (ADJ↑) (↑PRE
Page 337 and 338:
Case Marking and Subject Extraction
Page 339 and 340:
objects occur in VP or in I (“obj
Page 341 and 342:
Coordinated structures (14) Peter o
Page 343 and 344:
(20) Han hadde de trodd e ville kom
Page 345 and 346:
(28) a. * Peter forventer [I’ [I
Page 347 and 348:
One disadvantage of this approach i
Page 349 and 350:
6.2 Case Assignment and Empty Categ
Page 351 and 352:
Lets us consider the binding condit
Page 353 and 354:
claims that subject extraction is n
Page 355 and 356:
ham: N ( (SUBJ £ ) ¡ ((COMP SUBJ
Page 357 and 358:
[Vikner and Schwartz, 1996] Vikner,
Page 359 and 360:
. Mary-ka ttal-lul sakwa-lul mek-I-
Page 361 and 362:
In (4a), the object of the simple d
Page 363 and 364:
3. Morphological Causatives in Kore
Page 365 and 366:
. Mary-ka ttal-pakkey sakwa-ul an m
Page 367 and 368:
Firstly, objects in MC are asymmetr
Page 369 and 370:
The asymmetries of double accusativ
Page 371 and 372:
Abstract The paper describes a pres
Page 373 and 374:
construction - in good style. • I
Page 375 and 376:
Figure 4: Attributed adjectives (
Page 377 and 378:
Aspects of the syntax of psychologi
Page 379 and 380:
Despite the syntactic differences e
Page 381 and 382:
e. (exists independently of the eve
Page 383 and 384:
(10) Poco a poco, me enfadó que pe
Page 385 and 386:
to (3), we can thematically charact
Page 387 and 388:
we can derive the grammatical funct
Page 389 and 390:
‘This book interests him (John).
Page 391 and 392:
unaccusative verb, while interesar
Page 393 and 394:
Roegiest, E. (1999). Objet direct p
Page 395 and 396:
Abstract The aim of this paper is t
Page 397 and 398:
(6) Pierre lui a fait écrire une l
Page 399 and 400:
dative NPs that the examples (4) to
Page 401 and 402:
verb. In some languages, like Engli
Page 403 and 404:
(15) *J’ai convaincu de lui tél
Page 405 and 406:
VP � V + V + (NP) + (PP) (��
Page 407 and 408:
a number of signs (words or phrases
Page 409 and 410:
��
Page 411 and 412:
DAVIS, A. & P.-J. KOENIG (2000) Lin
Page 413 and 414:
1. Introduction Subsumption and Equ
Page 415 and 416:
an XCOMP attribute whose value is t
Page 417 and 418:
(8) a. S|VP Æ NP* (V’) ( S|VP )
Page 419 and 420:
inconsistency with erzählen. But i
Page 421 and 422:
We revise the fronted S|VP expansio
Page 423 and 424:
5. The interaction between PVPF and
Page 425 and 426:
(28) Ein Aussenseiter versuchte hie
Page 427 and 428:
. * Ein Aussenseiter gewonnen haben
Page 429 and 430:
The use of subsumption addresses fu
Page 431 and 432:
TIGER TRANSFER Utilizing LFG Parses
Page 433 and 434:
is an additional way of ensuring co
Page 435 and 436:
"tiger_4548: Hier herrscht Demokrat
Page 437 and 438:
y a multi-set of OT marks, thereby
Page 439 and 440:
sumably they cannot be fixed in pri
Page 441 and 442:
¢¡¤£¦¥¦§©¨�£¦§�¥
Page 443 and 444:
Macros and Templates The system all
Page 445 and 446:
Ambiguous Predicates A simple examp
Page 447 and 448:
HD−OUT TI−FORM herrscht, TI−M
Page 449 and 450:
(ignoring PP-attachment for the eva
Page 451:
Kameyama, Megumi, Ryo Ochitani, and
show all

Proceedings of the LFG 02 Conference National Technical - CSLI ...

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?