13.07.2015 Views

gene pathway text mining and visualization - Artificial Intelligence ...

gene pathway text mining and visualization - Artificial Intelligence ...

gene pathway text mining and visualization - Artificial Intelligence ...

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

Gene Pathway Text Mining <strong>and</strong> Visualization 531Figure 18-3 below shows an abbreviated notation for grammar rules,showing nine in total. The rule core is in bold <strong>and</strong> the rule con<strong>text</strong> isitalicized. Some of the rules listed have empty rule con<strong>text</strong> slots.INP VDP NP . transforms to>>INPIT VP NP CC NP : transforms to>>NPAFT NP transforms to>>WHENPAGNST NP transforms to>>AGPPAMG NP CC NP II transforms to>>AMGPARD NP , transforms to>>ARDPBE NP VDP NP transforms to>>NPBE RB JJ transforms to>>JJWI NP CC NP , transforms to>>MODFigure 18-3. Abbreviated notation for grammar rules3.1.4 Relation IdentificationThe top two levels of the parse chart are passed to the relationidentification step. Relations can be loosely compared to subject, verb,object constructs <strong>and</strong> are extracted using knowledge patterns. Knowledgepatterns refer to the different syntactic/semantic patterns used by authors toconvey knowledge. Like the parsing rules, knowledge patterns consist ofrule patterns with rule con<strong>text</strong> <strong>and</strong> rule core <strong>and</strong> transformations that areapplied only on the rule core. Different from the parsing rules, however, arethe actions that take place on the rule core. First, the rule core does not gettransformed into a single new tag, but rather each tag in the rule core isassigned a role, from a finite set of roles R. Currently, there are 10 differentroles defined in the set R. Roles 0 - 3 account for the more directly expressedknowledge patterns. Roles 4 - 9 identify nominalizations <strong>and</strong> relations inagentive form. Sentences may contain multiple overlapping knowledgepatterns, with tags playing multiple roles. The relation parsing output isshown in Figure 18-1 next to the boxed number 4. Conjunctions in therelations are split after the relation extraction step.Once potential relations are identified, each relation has to meet anumber of semantic constraints in order to be extracted. In the currentsystem, at least one word from the first argument <strong>and</strong> at least one word fromthe second argument had to exist in a <strong>gene</strong>/<strong>gene</strong> products lexicon, such asthose from the Gene Ontology <strong>and</strong> HUGO. In addition, relations werelimited by filtering predicates with 147 verb stems. At least one of the wordsin the predicate had to contain a verb stem. Examples of verb stems from thelexicon include activat, inhibit, increas, suppress, bind, catalyz, block,

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!