manually annotated examples, which are expensive and time-consuming to create. Co-training circumvents this weakness by playing off two sufficiently different views of a data set to leverage large quantities of unlabeled data (along with a few labeled examples) in order to improve the performance of a learning algorithm (Blum and Mitchell, 1998). Co-training allows our approach to simultaneously learn the patterns that express a relation and its arguments.
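For readers unfamiliar with the procedure, the generic co-training loop can be sketched in a few lines of Python. The sketch below is only a minimal illustration of Blum and Mitchell's idea under assumed dense feature views and off-the-shelf scikit-learn classifiers; it is not a description of the system in this paper, and all names and thresholds are ours.

```python
# Minimal co-training sketch (Blum and Mitchell, 1998). Illustrative only:
# the two "views" X1/X2 and U1/U2, the classifiers, and the per-round
# growth rate are assumptions, not this paper's system.
import numpy as np
from sklearn.naive_bayes import GaussianNB

def co_train(X1, X2, y, U1, U2, rounds=10, per_round=5):
    """Grow the labeled set by letting each view's classifier label the
    unlabeled examples it is most confident about."""
    clf1, clf2 = GaussianNB(), GaussianNB()
    pool = np.arange(len(U1))          # indices of still-unlabeled examples
    for _ in range(rounds):
        if len(pool) == 0:
            break
        clf1.fit(X1, y)
        clf2.fit(X2, y)
        chosen, labels = [], []
        for clf, U in ((clf1, U1), (clf2, U2)):
            proba = clf.predict_proba(U[pool])
            top = np.argsort(-proba.max(axis=1))[:per_round]
            chosen.extend(pool[top])
            labels.extend(clf.classes_[proba[top].argmax(axis=1)])
        chosen = np.asarray(chosen)
        # Move the confidently labeled examples into the labeled set
        # (under both views), then shrink the unlabeled pool.
        X1 = np.vstack([X1, U1[chosen]])
        X2 = np.vstack([X2, U2[chosen]])
        y = np.concatenate([y, np.asarray(labels)])
        pool = np.setdiff1d(pool, chosen)
    return clf1, clf2
```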
Other researchers have also previously explored automatic pattern generation from unsupervised text, classically in (Riloff and Jones, 1999). Ravichandran and Hovy (2002) reported experimental results for automatically generating surface patterns for relation identification; others have explored similar approaches (e.g., Agichtein and Gravano, 2000; Pantel and Pennacchiotti, 2006). More recently, Mitchell et al. (2009) have shown that for macro-reading, precision and recall can be improved by learning a large set of interconnected relations and concepts simultaneously.
We depart from this work in learning patterns that exploit the structural features of the text graph, and in our particular approach to pattern and pair scoring and selection.
Most approaches to automatic pattern generation have focused on precision; for example, Ravichandran and Hovy report results on the Text Retrieval Conference (TREC) Question Answering track, where extracting one instance of a relation can be sufficient, rather than detecting all instances. Our study, in contrast, also emphasizes recall: information about an entity may be mentioned only once, especially for rarely mentioned entities. A primary focus on precision allows one to ignore the many instances that require co-reference or long-distance dependencies; one primary goal of our work is to measure system performance in exactly those areas.
9 Conclusion
We have shown that bootstrapping approaches can be successfully applied to micro-reading tasks. Most prior work with this approach has focused on macro-reading, and thus emphasized precision. Clearly, the task becomes much more challenging when the system must detect every instance. Despite the challenge, with very limited human intervention, we achieved F-scores of >.65 on 6 of the 11 relations (the average F-score on the relation set was .58).
We have also replicated an earlier preliminary result (Boschee, 2008) showing that, for a micro-reading task, patterns that utilize semantic/syntactic information outperform patterns that make use of only surface strings. Our result covers a larger inventory of relation types and attempts to provide a more precise measure of recall than the earlier preliminary study.
Analysis of our system's output provides insights into the challenges that such a system may face.
One challenge for bootstrapping systems is that it is easy for the system to learn just a subset of a relation. We observed this both for sibling, where we learned only the relation brother, and for employed, where we learned patterns only for leaders of an organization. For sibling, human intervention allowed us to correct this mistake; however, for employed, even with human intervention our recall remains low. The difference between these two relations may be that for sibling there are only two sub-relations to learn, while there is a rich hierarchy of potential sub-relations under the general relation employed. The challenge is quite possibly exacerbated by the fact that the distribution of employment relations in the news is heavily biased towards top officials, but our recall test set intentionally does not reflect this skew.
Another challenge for this approach is continuing to learn in successive iterations. As we saw in the figures in Section 7, for many relations performance at iteration 20 is not significantly greater than performance at iteration 5. Note that seeing improvements on the long tail of ways to express a relation may require a larger recall set than the test set used here. This is exemplified by the existence of the highly precise 2-predicate patterns, which in some cases never fired in our recall test set.
In future work, we wish to address the subset problem and the problem of stalled improvements. Both could potentially be addressed by improved internal rescoring. For example, the system's scoring could try to guarantee coverage of the whole seed set, thus promoting patterns that have low recall but high value because they reflect information different from that of already-selected patterns. A complementary set of improvements could explore better uses of human intervention.
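As a purely illustrative sketch of such coverage-aware rescoring, patterns could be re-ranked greedily by how many previously uncovered seed pairs they extract. The data structures and the greedy criterion below are our assumptions, not part of the implemented system.

```python
# Hypothetical greedy, coverage-aware pattern selection: a pattern's gain
# is the number of seed pairs it extracts that are not yet covered, so
# low-recall but complementary patterns are still promoted.
def select_patterns(pattern_matches, seed_pairs, k=20):
    """pattern_matches: dict mapping a pattern id to the set of argument
    pairs it extracts; seed_pairs: the trusted seed set."""
    selected, covered = [], set()
    while len(selected) < k:
        best, best_gain = None, 0
        for pattern, matches in pattern_matches.items():
            if pattern in selected:
                continue
            gain = len((matches & seed_pairs) - covered)
            if gain > best_gain:
                best, best_gain = pattern, gain
        if best is None:          # no remaining pattern adds coverage
            break
        selected.append(best)
        covered |= pattern_matches[best] & seed_pairs
    return selected
```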
Acknowledgments
This work was supported, in part, by BBN under AFRL Contract FA8750-09-C-179.