12.07.2015 Views

An efficient mechanism for Matching multiple patterns on XML Streams


[Figure 2. Automaton for template fooMusicList]

4.2 Optimisation

With the interpreting approach in place, several new optimisations have become possible. One of the most important ones was the combination of several automata into one larger EFSM.

4.2.1 Parallel Execution of Templates

The basic idea is to combine several templates into one automaton. This is a standard procedure in graph theory: a combined automaton can be calculated by using the cross product (see [8]). Conditions still have to be checked for all transitions; the performance gain is solely the difference between the time needed to execute a single state transition and the time needed to execute several.

The downside of this approach is the exponential increase in states and transitions. For example, take two EFSMs $S_1$ and $S_2$ with $p_i$ states and $q_i$ transitions: a resulting EFSM for $S_1 \times S_2$ would have $p_1 \cdot p_2$ states and at least $p_1 \cdot q_2 + q_1 \cdot p_2$ transitions. A combination of ten EFSMs with ten states each (e.g. templates having three elements and two content predicates) would end up with 10 billion ($10^{10}$) states and far more transitions. As the authors of [9] point out, a possible solution to this dilemma is to use lazy construction principles for the automata.

4.2.2 Lazy Automata Construction

Even if the hypothetical number of states of a combined EFSM grows to exorbitant numbers, only a very small portion of these states would ever be used. If an automaton is built in a lazy fashion, constructing new states at runtime only when they are needed, this number can be decreased dramatically. For example, we combined twelve templates with altogether 350 states using a lazy construction principle and ended up with an EFSM of only 169 states for a specific data stream, after further optimisations that removed redundant or empty states and transitions.

4.2.3 State and Transition Reduction

There are several ways to further reduce the number of states and transitions when combining automata:

Merging of ε transitions: Any number of states that are connected solely by ε transitions are merged to two states with a single connecting ε transition.

Removal of initialisation states: Initialisation functionality does not need to be considered by the combination algorithm but may be executed separately.

Removal of default self-transitions: When no input event selects a transition, an EFSM needs to stay in its current state. This can be made default behaviour and then does not need to be made explicit by a self-transition.

Grouping transitions: By grouping transitions with a common condition (e.g. same nesting level or
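
The contrast between the full cross product of Section 4.2.1 and the lazy construction of Section 4.2.2 can be illustrated with a minimal sketch. It rests on simplifying assumptions that are not from the paper: the automata are plain finite state machines without the variables and conditions of an EFSM, and the names `make_counter`, `full_product_size` and `lazy_product_run` are hypothetical. Staying in the current state when no transition matches is implemented as default behaviour rather than as explicit self-transitions.

```python
def make_counter(k, n_states=10):
    """Hypothetical toy automaton: advances one state on event 'e<k>'."""
    delta = {(s, f"e{k}"): (s + 1) % n_states for s in range(n_states)}
    return {"states": range(n_states), "start": 0, "delta": delta}


def full_product_size(automata):
    """State count of the eager cross product: p_1 * p_2 * ... * p_n."""
    size = 1
    for automaton in automata:
        size *= len(automaton["states"])
    return size


def step(automaton, state, event):
    """Follow a matching transition; by default stay in the current state."""
    return automaton["delta"].get((state, event), state)


def lazy_product_run(automata, events):
    """Run the combined automaton, creating combined states only on demand."""
    current = tuple(a["start"] for a in automata)
    seen = {current}   # combined states constructed so far
    cache = {}         # (combined state, event) -> combined state
    for event in events:
        key = (current, event)
        if key not in cache:
            # Construct the target state the first time it is needed.
            cache[key] = tuple(
                step(a, s, event) for a, s in zip(automata, current)
            )
        current = cache[key]
        seen.add(current)
    return current, seen


automata = [make_counter(k) for k in range(10)]
final, seen = lazy_product_run(automata, ["e0", "e1", "e0"])
print(full_product_size(automata))  # 10000000000
print(len(seen))                    # 4
```

In this toy run the eager product would have $10^{10}$ states, while the lazy run only ever constructs the four combined states that the event stream actually reaches. How many states are constructed depends entirely on the input, which matches the remark above that the reported figure holds for a specific data stream.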
