
         TEXTRUNNER                SRL-IE
         P      R      F1         P      R      F1
Binary   51.9   27.2   35.7       64.4   85.9   73.7
N-ary    39.3   28.2   32.9       54.4   62.7   58.3
All      47.9   27.5   34.9       62.1   79.9   69.9
Time     6.3 minutes              52.1 hours

Table 1: SRL-IE outperforms TEXTRUNNER in both recall and precision, but has a run time over 2.5 orders of magnitude longer.

SRL-IE took 52.1 hours – roughly 2.5 orders of magnitude longer. We ran our experiments on quad-core 2.8GHz processors with 4GB of memory.
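For reference, the F1 scores in Table 1 follow from the standard harmonic mean of precision and recall; for instance, TEXTRUNNER's binary row:

```latex
F_1 = \frac{2PR}{P + R} = \frac{2 \times 51.9 \times 27.2}{51.9 + 27.2} \approx 35.7
```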

It is important to note that our results for TEXTRUNNER are different from prior results (Banko, 2009). This is primarily due to a few operational criteria (such as focusing on proper nouns and filtering relatively infrequent extractions) identified in prior work, which resulted in much higher precision, probably at a significant cost in recall.

5.3 Comparison under Different Conditions

Although SRL-IE has higher overall precision, there are some conditions under which TEXTRUNNER has superior precision. We analyze the performance of these two systems along three key dimensions: system confidence, redundancy, and locality.

System Confidence: TEXTRUNNER’s CRF-based extractor outputs a confidence score which can be varied to explore different points in the precision-recall space. Figure 1(a) and Figure 2(a) report the results from ranking extractions by this confidence value. For both binary and n-ary extractions the confidence value improves TEXTRUNNER’s precision, and for binary extractions the high-precision end has approximately the same precision as SRL-IE. Because of its use of an integer linear program, SRL-IE does not associate confidence values with extractions and is shown as a point in these figures.
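To make the ranking concrete, here is a minimal sketch (not TEXTRUNNER’s actual code) of sweeping a confidence threshold over scored extractions to trace such a precision-recall curve; the (confidence, is_correct) pairs and the gold count are assumed inputs:

```python
# Sketch: rank extractions by confidence and take each prefix as one
# threshold setting, yielding one (precision, recall) point per prefix.
from typing import List, Tuple

def precision_recall_curve(scored: List[Tuple[float, bool]],
                           total_correct: int) -> List[Tuple[float, float]]:
    """scored: (confidence, is_correct) pairs from a labeled sample;
    total_correct: number of correct extractions attainable (recall denominator)."""
    points = []
    true_positives = 0
    for rank, (confidence, is_correct) in enumerate(sorted(scored, reverse=True), 1):
        if is_correct:
            true_positives += 1
        points.append((true_positives / rank,            # precision at this cutoff
                       true_positives / total_correct))  # recall at this cutoff
    return points
```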

Redundancy: In this experiment we use the redundancy of extractions as a measure of confidence. Here redundancy is the number of times a relation has been extracted from unique sentences. We compute redundancy over normalized extractions, ignoring noun modifiers, adverbs, and verb inflection. Figure 1(b) and Figure 2(b) display the results for binary and n-ary extractions, ranked by redundancy.


We use a log scale on the x-axis since high-redundancy extractions account for less than 1% of the recall. For binary extractions, redundancy improved TEXTRUNNER’s precision significantly, but at a dramatic loss in recall: TEXTRUNNER achieved 0.8 precision with 0.001 recall at a redundancy of 10 or higher. For highly redundant information (common concepts, etc.) TEXTRUNNER has higher precision than SRL-IE and would be the algorithm of choice. In n-ary relations for TEXTRUNNER and in binary relations for SRL-IE, redundancy actually hurts precision. These extractions tend to be so specific that genuine redundancy is rare, and the highest-frequency extractions are often systematic errors. For example, the most frequent SRL-IE extraction was .
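A minimal sketch of this redundancy count, assuming each extraction records its source sentence; the normalize() helper is a hypothetical stand-in for lemmatizing the relation phrase and dropping noun modifiers and adverbs:

```python
# Sketch: redundancy = number of unique sentences yielding the same
# normalized extraction.
from collections import defaultdict

def normalize(text: str) -> str:
    # Placeholder for the real normalization (lemmatize the verb, strip
    # noun modifiers and adverbs); here we only lowercase and trim.
    return text.lower().strip()

def redundancy_counts(extractions):
    """extractions: iterable of (sentence_id, arg1, relation, arg2) tuples."""
    sentences_for = defaultdict(set)
    for sentence_id, arg1, relation, arg2 in extractions:
        key = (normalize(arg1), normalize(relation), normalize(arg2))
        sentences_for[key].add(sentence_id)   # a set keeps sentences unique
    return {key: len(sents) for key, sents in sentences_for.items()}
```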

Locality: Our experiments with TEXTRUNNER led us to discover a new validation scheme for the extractions – locality. We observed that TEXTRUNNER’s shallow features can identify relations more reliably when the arguments are closer to each other in the sentence. Figure 1(c) and Figure 2(c) report the results from ranking extractions by the number of tokens that separate the first and last arguments. We find a clear correlation between locality and TEXTRUNNER’s precision: for binary extractions where the arguments are separated by 4 tokens or less, TEXTRUNNER reaches precision 0.77 at recall 0.18. For n-ary relations, TEXTRUNNER can match SRL-IE’s precision of 0.54 at recall 0.13. SRL-IE remains largely unaffected by locality, probably due to the parsing used in SRL.
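A minimal sketch of the locality measure, assuming each extraction exposes token offsets for its arguments (the span format and field name are assumptions):

```python
# Sketch: locality = number of tokens separating the first argument from
# the last one in the sentence; smaller values mean the arguments are closer.
def locality(argument_spans):
    """argument_spans: list of (start_token, end_token) pairs, end exclusive."""
    spans = sorted(argument_spans)
    first_end = spans[0][1]      # where the earliest argument stops
    last_start = spans[-1][0]    # where the latest argument begins
    return max(0, last_start - first_end)

# Ranking extractions for Figure 1(c)/2(c)-style curves (hypothetical field):
# extractions.sort(key=lambda e: locality(e.argument_spans))
```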

6 A TEXTRUNNER SRL-IE Hybrid

We now present two hybrid systems that combine the strengths of TEXTRUNNER (fast processing time and high precision on a subset of sentences) with the strengths of SRL-IE (higher recall and better handling of long-range dependencies). This is set in a scenario where we have a limited budget of computational time and need a high-performance extractor that utilizes the available time efficiently.

Our approach is to run TEXTRUNNER on all sentences, and then determine the order in which to process sentences with SRL-IE. We can increase precision by filtering out TEXTRUNNER extractions that are expected to have low precision.
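As a rough illustration of this budget-constrained setup (the function names, ordering heuristic, and filter below are hypothetical placeholders, not the exact policies evaluated in this section):

```python
import time

def hybrid_extract(sentences, budget_seconds,
                   run_textrunner, run_srlie, order_key, keep_extraction):
    """Run TEXTRUNNER over everything, then spend the remaining time budget
    running SRL-IE on sentences in a chosen priority order."""
    start = time.time()
    extractions = []

    # Fast pass: TEXTRUNNER on all sentences, dropping extractions the
    # filter expects to have low precision.
    for sentence in sentences:
        extractions.extend(e for e in run_textrunner(sentence)
                           if keep_extraction(e))

    # Slow pass: SRL-IE on sentences in priority order until time runs out.
    for sentence in sorted(sentences, key=order_key):
        if time.time() - start > budget_seconds:
            break
        extractions.extend(run_srlie(sentence))
    return extractions
```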
