12.01.2015 Views

Download - Academy Publisher


Theorem 4: Case (10), taken on its own, forms one situation:

"" element contents.

Theorem 5: Cases (11), (12), and (13) can be classified as one situation: .

Theorem 6: Cases (15) and (10) are the same and form one judgement event:

element content.

Theorem 7: Cases (16), (17), and (18) form one judgement event class: .

Theorem 8: Case (19) forms a judgement event on its own:

Based on the 7 judgement events summarized in this article, we can extend the SAX API and then create the CSTP index.

From the analysis above, the algorithm for building the CSTP index is as follows:

Input: XML document and encoding table

Output: CSTP compressed index

Begin:
    Start with SAX            // parse sequentially
    new (Node3);              // build the first node
    s = new (Node1);
    push (S3, s);
    if (xml root tag has property value)
    {
        p = new (Node3);
        s->attribute = p;
        p->left = property value;
    }
    while (*PP)               // there are more properties
    {
        q = new (Node3);
        p->right = q;
        p = q;
        q->left = property value;
    }
    p->right = NULL;
    while (xml document is not finished)
    {
        Case 1:               // read a string of the form: <
            ... ...
        Case 2:               // read a string of the form: <
            ... ...
        Case 3:               // the third case: contents
            ... ...
        Case 4:               // similar to Case 3
            ... ...
        Case 5:               //
            ... ...
        Case 6:               // read a string of the form:
            ... ...
        Case 7:               // read a string of the form: (must be children)
            ... ...
    }
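To make the event-driven construction above more concrete, the following is a minimal sketch in Python using the standard `xml.sax` module. It is not the CSTP algorithm itself: it does not merge same-name items or apply the encoding table, and it handles only the start-tag, text, and end-tag events rather than the seven judgement events. The class names `AttrNode`, `ElemNode`, and `TreeBuilder` and the function `build_tree` are hypothetical; `AttrNode` and `ElemNode` are assumed to play roles similar to Node3 and Node1, and the Python list `stack` stands in for the stack S3.

```python
import xml.sax


class AttrNode:
    """One attribute in a singly linked chain (assumed analogue of Node3)."""

    def __init__(self, name, value):
        self.left = (name, value)   # payload, as in `p->left = property value`
        self.right = None           # next attribute in the chain, or None


class ElemNode:
    """One element node (assumed analogue of Node1)."""

    def __init__(self, tag):
        self.tag = tag
        self.attribute = None       # head of the attribute chain
        self.children = []          # child elements in document order
        self.text = ""              # accumulated character data


class TreeBuilder(xml.sax.ContentHandler):
    """Builds an uncompressed tree from SAX events using a stack."""

    def __init__(self):
        super().__init__()
        self.stack = []             # currently open elements
        self.root = None

    def startElement(self, name, attrs):
        node = ElemNode(name)
        tail = None
        # Chain the attributes through `right`, terminating with None.
        for key in attrs.getNames():
            attr = AttrNode(key, attrs.getValue(key))
            if tail is None:
                node.attribute = attr
            else:
                tail.right = attr
            tail = attr
        if self.stack:
            self.stack[-1].children.append(node)
        else:
            self.root = node
        self.stack.append(node)

    def characters(self, content):
        if self.stack:
            self.stack[-1].text += content

    def endElement(self, name):
        self.stack.pop()


def build_tree(xml_text):
    """Parse an XML string and return the root ElemNode."""
    handler = TreeBuilder()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.root
```

Calling `build_tree('<a id="1"><b>hi</b><c/></a>')` returns the root node; its attribute chain and children can then be walked in the same way the pseudocode walks `p->right`.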

There is another way to build the index: build the nodes using the contents of the judgement event, rather than as in the previous method. The 25 original events are then categorized differently. The specific implementation of that algorithm also differs from the process above, and its details are not given here.

Ⅲ. THE QUERY ALGORITHM BASED ON CSTP

A. Analysis of the Query Process and Its Characteristics

The XPath query process is similar to that of XQuery, but the latter is more complicated to implement, so only XPath queries are discussed here. Before discussing the XPath query itself, we analyse the query process on the CSTP index and its characteristics, as follows.

Consider a simple path query $I: t_1/t_2/\cdots/t_m$. The node $t_1$ may or may not be the root element node of the CSTP tree-graph. Suppose it is not the root, for example a third-layer node of the CSTP tree-graph. When the user submits the query expression, the computer begins by visiting the different chain on the second layer of the CSTP tree-graph. If $t_1$ is not there, it visits the different chain formed by the children of the second node on the second layer (the second node may be in the same-name item), and continues in this way until $t_1$ is found. The computer therefore performs many comparisons.

Suppose there are $N$ element nodes on the second layer of the CSTP, $N_1$ element nodes in the different chain, and each node on the second layer has on average $N_2$ child nodes in its different chain. The time complexity of such a query is then $O(N_1 + N_2 \cdot N) \approx O(N_2 \cdot N)$. For massive data, $N$ is a rather large number. Moreover, if the $t_1$ supplied by the user does not exist, the computer traverses the entire CSTP tree-graph, and the time cost is obviously very high.

Of course, if additional information such as the start layer could be specified, and a head node were created for each layer of the CSTP tree-graph, the query time could be reduced significantly, but this would obviously lead to unnecessary additional work. Therefore, we suppose that $t_1$ is on the first layer of , and such a path expression is called a full path expression. If the path expression submitted by the user is not full, we can predefine certain rules and generate the full path expression automatically when the user queries.
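The cost difference between a full path expression and a path whose first step sits at an unknown layer can be illustrated with a small sketch. It works on a plain, uncompressed tree rather than on the CSTP structure, and the names `Node`, `query_full_path`, and `find_anywhere` are hypothetical. Both functions count tag comparisons so the two strategies can be compared on the same tree.

```python
class Node:
    """A plain tree node; stands in for an element node of the index."""

    def __init__(self, tag, children=None):
        self.tag = tag
        self.children = children or []


def query_full_path(root, path):
    """Evaluate a full path such as 'a/b/c' whose first step is the root.

    Returns (matching_nodes, number_of_tag_comparisons).
    """
    steps = path.strip("/").split("/")
    comparisons = 1                      # compare the first step with the root
    if root.tag != steps[0]:
        return [], comparisons
    frontier = [root]
    for step in steps[1:]:
        next_frontier = []
        for node in frontier:
            for child in node.children:  # only children of matched nodes
                comparisons += 1
                if child.tag == step:
                    next_frontier.append(child)
        frontier = next_frontier
    return frontier, comparisons


def find_anywhere(root, tag):
    """Locate `tag` breadth-first when its layer is unknown.

    Returns (first_matching_node_or_None, number_of_tag_comparisons).
    If the tag does not exist, every node is compared once.
    """
    comparisons = 0
    queue = [root]
    while queue:
        node = queue.pop(0)
        comparisons += 1
        if node.tag == tag:
            return node, comparisons
        queue.extend(node.children)
    return None, comparisons
```

With a full path, only the children of already matched nodes are compared at each step, whereas `find_anywhere` has to touch every node of the tree when the requested tag is absent, which mirrors the worst case described above.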

B. Analysis of Algorithm and Efficiency Based on XPath Query

According to the analysis above, we can query efficiently in the compressed structure. At the same time, using the coding table generated from the DTD, we can design decompression algorithms that run during the query process.
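As a rough illustration of how a coding table could support decompression at query time, the sketch below maps element names to short integer codes and back. The format of the coding table is not specified in this section, so the dictionary layout and the function names `build_coding_table`, `compress_path`, and `decompress_path` are assumptions made only for this example; the list of element names is assumed to come from the element declarations of the DTD.

```python
def build_coding_table(element_names):
    """Assign a small integer code to each distinct element name.

    `element_names` is assumed to be the list of names declared in the DTD.
    """
    table = {}
    for name in element_names:
        if name not in table:
            table[name] = len(table)
    return table


def compress_path(path, table):
    """Translate a path such as 'a/b/c' into a list of codes."""
    return [table[step] for step in path.strip("/").split("/")]


def decompress_path(codes, table):
    """Translate a list of codes back into the original path string."""
    reverse = {code: name for name, code in table.items()}
    return "/".join(reverse[code] for code in codes)
```

Under these assumptions, a query path can be encoded once with `compress_path`, compared against coded tags in the index, and the matching results turned back into readable names with `decompress_path`.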

Because of the complexity of XML itself and of the CSTP index, only a qualitative analysis and comparison are given here.

Based on the high repeatability of XML tag data, this paper introduces the TP relation to convert the original tree structure of XML into the CSTP structure. And the introduction of the same-name items and different
